Fact Sheet: Multiple Measures: A Definition and Examples from the U.S. and Other Nations


Summary

Definition. Multiple measures: the use of multiple indicators and sources of evidence of student learning, of varying kinds, gathered at multiple points in time, within and across subject areas. In response to concerns about No Child Left Behind’s narrowing of curricula, some states have begun to use techniques they falsely label “multiple measures.” Unfortunately, these are usually just multiple uses of the same statewide, standardized test results, not authentic multiple measures.

Examples of real multiple measures abound, including science labs or field work, from short tasks to extended projects; oral presentations in any subject; extended math problems that require application to real world uses; reading aloud and conversing with the teacher about a book; in-depth history reports, presented orally, in an essay, a PowerPoint, etc.; writing a paper in a second language; art or music projects; and answering questions from an expert panel about a project the student has done, much as doctoral candidates defend their theses. Documentation of teacher observations or interactions with the teacher can be useful, particularly with young children, if well structured. Many of these can be done individually or in groups (so long as the purpose is clear). This material can be organized so that other, independent educators can re-score it to check the accuracy of the classroom teacher’s scoring, a process known as “moderation.”

Examples of multiple measures systems used successfully in the U.S. – Learning Record: Developed for use with multi-lingual, multi-cultural populations, to assess progress in reading, writing, speaking and listening. Using a structured format, the teacher regularly observes and describes the student and her work, and attaches samples, to provide multiple sources of evidence. Student progress is summarized in writing and placed numerically on a developmental scale. Learning Records (LRs) have been re-scored with high inter-rater agreement, and studies have supported the Learning Record’s validity.

Examples from other nations. Most other nations, including many with better outcomes on various indicators, test less than the U.S. They use a mix of state/national and local assessments, including performance tasks, primarily for public information and improvement efforts, not accountability.

Conclusion: Multiple measures, extensive use of performance assessments and the inclusion of local evidence are feasible in large-scale assessment systems. Both reliability and comparability can be established by reviewing such systems through auditing (independent reviews of the assessment system) and moderation.
