What's going on with the Oregon state exam?
Has Oregon been dumbing down its tests? That's what Greg Perry of Brainstorm Magazine thinks. He wanted to see if Oregon's taxpaying parents were getting their money's worth from the expensive, tailor-made Oregon State Assessment Tests (OSAT). He's the father of a fifth-grader at Franklin School, which is notable because it is one of the few schools in Oregon with student data on both the OSAT and the Terra Nova for 3rd, 5th and 8th graders. The Oregon test is normed to Oregon students, but the Terra Nova is nationally normed.
He's also an economist, so he likes to focus on empirical data. Here's what he found:
...our students' scores on Oregon's statewide assessment had a much different pattern [than the Terra Nova scores]. There was a sharp rise in scores starting in the late 90's that was not evident in our Terra Nova scores. What was going on? If scores on one test were going up quickly while scores on another did not, there had to be a reason...Was it because achievement levels were actually rising? Or had Oregon's tests gotten easier?...
Obviously, this is not a trivial question. If true, it would reflect pretty poorly on the state's claim that Vera Katz's School Reform Act was working. The Oregon Department of Education website tells us that achievement is way up: far more kids are meeting the reading and math testing standards than in the early nineties...I graphed the average state scores in reading and mathematics in third, fifth and eighth grade from 1991 to 2001. Eighth grade scores trended upward in a modest zigzag pattern, similar to the pattern exhibited by Oregon average scores on the SAT. Nothing out of the ordinary there.
Then a strange pattern emerged. The scores for third and fifth grades varied up and down from 1991-95, with a small upward trend. But beginning in 1996, the scores jumped up every single year, at an accelerating pace. An odd result, not consistent with the zigzag pattern one would expect...Had the tests gotten easier?
Mr. Perry was not successful in obtaining the past test items, so he went back to the data he already had in his possession:
Since we had several years of data in which students took both tests, if the OSAT was getting progressively easier, the inflated scores would be revealed by a multiple regression analysis. After pooling the data for both the Terra Nova and the OSAT for the years 1996-2001, with special variables representing each year, I ran the regressions. If there was any inflation in the scores, these variables would be both positive and statistically significant....
At the third grade level, the OSAT was apparently inflated by five points from 1997 to 2001. At the fifth grade level, the scores were inflated by almost 15 points from 1996 to 2001. To put these results in perspective, note that the OSAT mathematics average score statewide increased by seven points from 1996 to 2001. If the Franklin results are an accurate prediction of the results statewide, it suggests that the state scores actually declined by eight points from 1996 to 2001.
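The dummy-variable regression Perry describes can be sketched as a difference-in-differences: pool both tests' scores, add year effects to absorb any overall achievement trend, and then add OSAT-by-year interaction terms. A positive, growing coefficient on those interactions is the "inflation" signature he refers to. The data below are entirely hypothetical (simulated with a built-in drift), since his Franklin School data are not public:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical student-level data: each student has a Terra Nova score
# and an OSAT score. We deliberately build in OSAT drift after 1996.
years = np.repeat(np.arange(1996, 2002), 30)   # 30 students per year
n = years.size
terra = rng.normal(200.0, 10.0, n)             # Terra Nova scores
drift = 2.5 * (years - 1996)                   # assumed OSAT inflation
osat = terra + drift + rng.normal(0.0, 5.0, n)

# Pool both tests into one sample with an OSAT indicator.
score = np.concatenate([terra, osat])
is_osat = np.concatenate([np.zeros(n), np.ones(n)])
year = np.concatenate([years, years])

# Design matrix: intercept, OSAT indicator, year main effects (1997-2001,
# baseline 1996) to soak up genuine achievement trends, and OSAT x year
# interactions. Each interaction coefficient estimates how much the
# OSAT-minus-Terra-Nova gap widened in that year relative to 1996.
cols = [np.ones(2 * n), is_osat]
for y in range(1997, 2002):
    cols.append((year == y).astype(float))               # year effect
for y in range(1997, 2002):
    cols.append(((year == y) & (is_osat == 1)).astype(float))  # interaction
X = np.column_stack(cols)

beta, *_ = np.linalg.lstsq(X, score, rcond=None)
for y, b in zip(range(1997, 2002), beta[7:]):
    print(f"{y}: OSAT gap vs. 1996 baseline = {b:+.1f} points")
```

With real data one would also check the standard errors on those interaction terms (e.g. with a statistics package) before calling the gap significant; this sketch only recovers the point estimates.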
What might explain this result? One possible explanation is that teachers are teaching to the test, in this case the OSAT. However, the two tests track each other so closely that any effort by teachers to boost OSAT scores would almost certainly boost the Terra Nova scores. Another explanation is that the assessment experts at ODE have done a poor job of equating tests from year to year. This might explain the results for the reading assessment, because the differences do vary from year to year. However, for the mathematics assessment, the gap between the OSAT and Terra Nova widens each year for third and fifth grade students. This kind of trend seems deliberate, not random. Another explanation is that the results may be simply a fluke, a product of a very small data set in a special situation.
To me, the most plausible explanation is that Oregon's math assessment has simply gotten easier.
Mr. Perry then goes on to check the Oregon test scores against the SAT and NAEP scores. He discovered that, despite the gains on the in-state exam, Oregon's math gains on the NAEP and the SAT actually lag behind the national average. However, state exams and the SAT aren't necessarily intended to measure the same thing.
I'm impressed by his willingness to examine the data, something few testing critics (and even fewer parents) are willing to do (of course, Oregon didn't make it easy for him to obtain these data). The score jumps in and of themselves are surprising, especially when they aren't accompanied by similar jumps in Terra Nova scores. The Terra Nova exam is consistent from year to year, and the OSAT obviously is not. However, I'm not as willing as Mr. Perry to assume that the apparent difference in test forms is deliberate. Test equating is not an easy process, and it's possible that a change in the mathematics content altered the difficulty of the exam despite attempts to equate the forms. However, if the content was changed, the parents should have been notified.
I'm also impressed by a sidebar questioning the value of the OSAT, especially its subjectivity:
Subjectivity: In a quest for the holy grail of assessment--measuring nebulous skills such as "critical thinking" and "problem solving"--Oregon has developed a myriad of assessments such as writing sample tests, math problem-solving tests, and portfolio assessments. These faddish assessments are costly, cumbersome, lack reliability and validity, and take too much time away from instruction.
This is, in a nutshell, what I have always said in response to education reformers who unthinkingly recommend performance assessments as a useful alternative to standardized multiple-choice exams. This is not to say that such assessments cannot be useful - just that they are often more expensive, time-consuming, and unreliable than testing critics would have you believe. They are not a simple alternative to tests that seem too "sterile" or that don't seem to measure the "higher-order" skills.
Thanks to Joanne Jacobs for catching this first.