Education Experts Caution Against Reliance On Test Scores in Teacher Evaluations

(Reposted by permission. For the complete version, see the Economic Policy Institute.)
Contact: Phoebe Silag or Karen Conner, (202) 775-8810

Student test scores are not reliable indicators of teacher effectiveness, even with the addition of value-added modeling (VAM), according to a new Economic Policy Institute report. Though VAM methods have allowed for more sophisticated comparisons of teachers than were possible in the past, they are still inaccurate, so test scores should not dominate the information used by school officials in making high-stakes decisions about the evaluation, discipline and compensation of teachers.

Among the 10 co-authors of the report, Problems with the Use of Student Test Scores to Evaluate Teachers, is Helen F. Ladd, Edgar T. Thompson Distinguished Professor at the Sanford School of Public Policy and president-elect of the Association for Public Policy Analysis and Management.

The Obama administration has encouraged states to adopt laws that use student test scores as a significant component in evaluating teachers, and a number of states have done so. The Los Angeles Times recently used value-added methods to evaluate teachers in the Los Angeles Unified School District based on the test scores of their students, and Secretary of Education Arne Duncan supported the paper’s decision to publicly release this information, asserting that parents have a right to know how effective their children’s teachers are.

The conclusions of the EPI report suggest that the Times’ analysis of teacher effectiveness is unreliable and inaccurate. The co-authors make clear that the accuracy and reliability of analyses based on student test scores, even in their most sophisticated form, are highly problematic.

Analyses of VAM results show that they are often unstable across time, classes and tests. Thus, test scores, even with the addition of VAM, are not accurate indicators of teacher effectiveness. Student test scores cannot fully account for the wide range of factors that influence student learning, particularly the backgrounds of students, school supports and the effects of summer learning loss. As a result, teachers who teach students with the greatest educational needs appear to be less effective than they are. Furthermore, VAM does not take into account nonrandom sorting of teachers to students across schools and students to teachers within schools.

The authors point to other negative consequences of using test scores to evaluate teacher performance: teachers have an incentive to “teach to the test”; incentives to collaborate within schools are reduced; and teacher morale can suffer.

The authors conclude, “Although standardized test scores of students are one piece of information that school leaders may use to make judgments about teacher effectiveness, test scores should be only a small part of an overall comprehensive evaluation.”

The report's co-authors are: Eva L. Baker, Paul E. Barton, Linda Darling-Hammond, Edward Haertel, Helen F. Ladd, Robert L. Linn, Diane Ravitch, Richard Rothstein, Richard J. Shavelson and Lorrie A. Shepard.