How Value-added Teacher Data Is Like a Baseball Batting Average
Neither is consistent from year to year, but they're among the best measures we've got for evaluating talent.
In early September, amid the hubbub spurred by the Los Angeles Times' release of value-added teacher assessment data, a report from the Economic Policy Institute warned that it would be "unwise" to use data on students' standardized-test performance in making personnel decisions at a school. A new report out of the Brookings Institution says it would be unwise to ignore the data altogether.
The researchers behind the Brookings paper make an interesting case by drawing parallels to other fields. Selective colleges rely on SAT scores in admissions even though the scores correlate only modestly with freshman-year GPAs (a correlation of about 0.35). In medicine, patient mortality rates for various surgeries are published annually for hospitals and their surgeons, yet the rates are consistent from year to year no more than 50 percent of the time. And in Major League Baseball, a hitter's batting average in one year is only about 36 percent predictive of what he'll hit the following year.
As Jay Mathews, The Washington Post's venerable education reporter, says, it's equivalent to asking the question: "Should the San Francisco Giants keep rookie of the year Buster Posey on their team next year?" (Answer: Of course they should; he hit .305 last year!)
Interestingly, the year-to-year correlation of teachers' value-added scores falls in this same range, around 0.35. What that means is that if in one year a certain group of teachers falls in the top 25 percent in terms of improving their students' test scores, there's only about a 35 percent chance they'll remain in that cohort the following year. This is a major criticism raised by detractors of value-added data, and one the Brookings paper deftly deflects.
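For readers who want to check how a correlation of about 0.35 translates into year-to-year persistence, here is a minimal simulation sketch. It assumes teacher scores in the two years follow a bivariate normal distribution, which neither report states; the function name `top_quartile_persistence` and its parameters are hypothetical, chosen only for illustration. The exact figure depends on the distribution assumed, so the simulated share need not equal the correlation itself.

```python
import math
import random


def top_quartile_persistence(rho, n_teachers=100_000, seed=0):
    """Estimate the share of year-1 top-quartile teachers who are
    also in the top quartile in year 2.

    Assumption (not from the reports): scores in the two years are
    standard normal with correlation `rho`.
    """
    rng = random.Random(seed)
    year1, year2 = [], []
    for _ in range(n_teachers):
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        year1.append(a)
        # Mixing a and b this way gives year-2 scores whose
        # correlation with year-1 scores is rho.
        year2.append(rho * a + math.sqrt(1.0 - rho * rho) * b)

    # Empirical 75th-percentile cutoffs for each year.
    cut1 = sorted(year1)[int(0.75 * n_teachers)]
    cut2 = sorted(year2)[int(0.75 * n_teachers)]

    top_year1 = [i for i in range(n_teachers) if year1[i] >= cut1]
    stayed = sum(1 for i in top_year1 if year2[i] >= cut2)
    return stayed / len(top_year1)


# Share of top-quartile teachers who stay there when rho = 0.35.
print(f"{top_quartile_persistence(0.35):.2f}")
```

With a correlation of zero the share would sit near 25 percent, which is pure chance, and with a correlation of 1 it would be 100 percent. A value around 0.35 lands between the two: better than chance, but far from a guarantee.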
Brookings supports value-added methodology but stops short of a full-throated endorsement of using it for anything beyond giving teachers feedback. The authors do strongly stress that the use of value-added data should be approached from the perspective of what's best for the student (not the teacher). In their view, if a few average teachers are mistakenly rated as poor and subsequently dismissed, that's preferable to ineffective teachers being characterized as adequate.
When teacher evaluation that incorporates value-added data is compared against an abstract ideal, it can easily be found wanting in that it provides only a fuzzy signal. But when it is compared to performance assessment in other fields or to evaluations of teachers based on other sources of information, it looks respectable and appears to provide the best signal we’ve got.