Abstract
In his introduction, Weir states that the core of this book is concerned with exploring a framework for establishing the validity of the interpretation of scores on tests produced by Exam Boards or by teachers for use in their classrooms. The "evidence-based approach" of the subtitle alludes to the view that the process of validating the use of scores on any given test is much like a courtroom trial, requiring evidence to support arguments for or against a favorable verdict that the test was fair. In the first of the book's four parts, Weir introduces five types of evidence needed to make a convincing case. The first two are collected a priori, in the design phase of a test, and consist of defining what abilities the test is supposed to measure and showing how well the sample of tasks in the test represents the abilities in 'the real world' (outside the test itself) that test users are looking for. The remaining three types of evidence are empirical and collected a posteriori: statistical procedures to estimate or enhance the reliability of test scores; studies of how well test scores correlate with external criteria, such as other tests purported to measure the same abilities or actual performance in the real world; and lastly, the study of backwash, or the social consequences of test use for stakeholders: teachers, students, parents, administrators, and the marketplace. Part two begins a detailed survey of frameworks for tests of reading, listening, speaking and writing. Each framework is presented as a flow-chart of boxes that run from a priori considerations (test taker characteristics, test characteristics, theories of internal processes and resources) to a posteriori considerations for investigating scoring characteristics, criterion-related evidence of score value, and the impact of score interpretation. Six chapters present examples from actual research to illustrate how evidence of validity was obtained in action. Part three could serve as a syllabus for a practicum on test validation methodology.
It contains pointers for sound research procedures, checklists and questionnaires to help researchers collect each of the five types of evidence for valid test score use. This part alone would justify buying the book, as it can certainly encourage teachers to have more confidence in their own assessment practices, and would probably also result in some pretty interesting presentations or publications. Part four of the book is entitled Further resources in language testing, and is actually an up-to-date and comprehensive list of textbooks, journals, professional organizations, professional conferences, e-mail lists, bulletin boards and websites, databases and statistical packages, a list probably equal to a few years of word-of-mouth searches for answers to the questions that plague anyone who gets involved with testing and assessment. In conclusion, this book is highly recommended. It is the most readable and comprehensive treatment of validation that I have ever come across. It is as free of technical jargon as can be; important points are presented clearly through Weir's choice to enclose poignant quotes and concepts in boxes within the text, which saves on magic markers or underlining, in addition to the other features of this book that have already been reported. Its 284 pages contain not a single mathematical equation or algorithm, so it's not a cookbook for number crunching, but it'll tell you where to go for that if you need it, plus much, much more.