This is not a blog about educational policy. However, findings in my field, educational psychology, often have a direct bearing on policy debates. In those cases, particularly when the consequences are great, it would be irresponsible for me not to speak out.

Value-added measures of teacher performance are being widely adopted across the country, with very little discussion of their validity. I believe that these measures, at least as conceived today, are invalid.

A measurement can be defined as taking some property in the world and representing it as a number. An invalid measure is one that does not accurately reflect the property it is supposed to represent.

In the past few weeks I have been analyzing data from a research project. The topic is not important for our discussion here; the methodology, however, is. The approach I am using is called a gain score analysis. Participants are assigned to one of two groups, and each group receives a different intervention. For each group we measure our outcome variable at baseline, that is, before treatment. After the intervention we measure the outcome variable again. The gain score is defined as the final measurement minus the baseline measurement, in other words, the magnitude of the change. By focusing on the magnitude of the change we don’t have to worry about the fact that the baseline scores were not identical. We then use a statistical test to see whether one group gained significantly more than the other.
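To make the arithmetic concrete, here is a minimal Python sketch of a gain score comparison. The scores are made up for illustration and have nothing to do with my actual study, and the function names `gain_scores` and `welch_t` are my own. I use Welch's t statistic as one plausible choice of test; it is an assumption, not necessarily the test a given study would use.

```python
from math import sqrt
from statistics import mean, variance


def gain_scores(baseline, final):
    """Gain score per participant: final measurement minus baseline."""
    return [f - b for b, f in zip(baseline, final)]


def welch_t(a, b):
    """Welch's t statistic for the difference between two group means."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Made-up scores for two small groups (illustrative only).
group_a_baseline = [10, 12, 11, 13]
group_a_final = [15, 18, 15, 19]
group_b_baseline = [14, 15, 13, 16]
group_b_final = [16, 18, 15, 18]

gains_a = gain_scores(group_a_baseline, group_a_final)
gains_b = gain_scores(group_b_baseline, group_b_final)

print("Mean gain, group A:", mean(gains_a))
print("Mean gain, group B:", mean(gains_b))
print("Welch t statistic:", welch_t(gains_a, gains_b))
```

Note that group B starts out higher at baseline, yet the comparison looks only at how much each group changed.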

A value-added measure of teaching is also a gain score analysis. Students’ performance is measured at the beginning of the year and again at year’s end. The difference is the gain score or, as it is called in education, the value added. The average gain score for a group of students is said to be the value added by the teacher.

What is wrong with this approach? After all, it seems to be identical to what my colleagues and I are doing in our research. Unfortunately, there is a crucial difference. In my study the participants were randomly assigned to the two groups. **A gain score analysis cannot be valid if the group assignments are not random.**

If students are not randomly assigned to schools and classrooms, and of course they are not, then value-added measures are invalid for comparisons between teachers.

We know that students learn at different rates. We know this because in research where teaching is held constant, such as in programmed instruction, students complete the material at different rates. Whatever the source of these differences in learning rate, it means that a teacher’s value-added score will, in part, be a function of student characteristics not under the teacher’s control. Thus, any policy based on value-added measures is invalid and, by extension, unfair.
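A small simulation can illustrate the point. This is only a sketch under assumptions I am inventing for the purpose: 200 hypothetical students whose learning rates are normally distributed, and two teachers who are exactly equally effective. The only thing that changes between the two comparisons is how students are assigned to classrooms.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical students: each has a personal learning rate the teacher
# does not control. Both teachers contribute exactly the same amount,
# so a student's gain depends only on that rate plus measurement noise.
students = []
for _ in range(200):
    rate = rng.gauss(10, 3)
    gain = rate + rng.gauss(0, 1)
    students.append((rate, gain))


def value_added_gap(ordered):
    """Teacher B's mean gain minus Teacher A's, splitting the list in half."""
    half = len(ordered) // 2
    teacher_a = [gain for _, gain in ordered[:half]]
    teacher_b = [gain for _, gain in ordered[half:]]
    return mean(teacher_b) - mean(teacher_a)


# Random assignment: shuffle, then split into two classrooms.
shuffled = students[:]
rng.shuffle(shuffled)
random_gap = value_added_gap(shuffled)

# Non-random assignment: slower learners go to Teacher A, faster to Teacher B.
tracked = sorted(students, key=lambda s: s[0])
tracked_gap = value_added_gap(tracked)

print("Gap under random assignment:    ", round(random_gap, 2))
print("Gap under non-random assignment:", round(tracked_gap, 2))
```

Under random assignment the gap between the two teachers should hover near zero. Under the sorted assignment Teacher B appears to add far more value, even though the simulated teachers are identical by construction. Real classroom assignment is rarely this extreme, but any systematic sorting pushes the scores in the same direction.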

I am not opposed to measurement in education. Indeed, I know that properly used measurement can benefit both students and teachers. But to base policy on a measurement that we know to be invalid is senseless.

