July 16, 2003

So you want to be an item writer

Are you out of work? Perhaps the standardized testing industry is for you. No shortage of jobs here, and one employment area that has exploded is item writing. Educational qualifications vary, you can do it part-time, and the work is well-paid and (I think) very interesting. As the NYT notes today, though, it's not as easy a task as it looks:

Writing standardized tests is like edging through a minefield of psychometric pitfalls and politically correct second-guessers, and it takes meticulous care to make sure that questions do not confuse students or bring misleading scores. These are cautionary lessons as the test publishing industry gears up to produce new exams on an industrial scale, the result of a federal law that requires the greatest expansion of standardized testing in American history...

That should mean a lot of work for those who specialize in writing test questions, often called "items"...

The word "items" seems so intuitive to me that I'm surprised the NYT puts it in quotes. Or does it only seem obvious to me because I've been studying educational testing for 11 years? Perhaps I've forgotten how odd the word sounds to others...

...no amount of wizardry can create a good test out of poorly written items, just as no chef can create a tasty meal from rotten food. And quality has emerged as a problem as the country's testing appetite has grown ravenous.

In May, the National Board on Educational Testing and Public Policy, a group affiliated with Boston College, issued a report documenting 50 high-profile testing mistakes that had occurred in 20 states from 1999 through 2002.

My comments on the report can be found here. I thought the situation was a bit exaggerated, but I agree in general that good items are now more necessary than ever, and some of the mistakes listed in the report could, I'm sure, be traced to poor item specifications or poor item bank assembly.

One hurdle is the bias and sensitivity review, in which representatives of various groups — women, blacks, Muslims, people with disabilities, others — critique the questions.

...author Diane Ravitch described how reviewers at Riverside Publishing deleted from a national assessment test a question that mentioned Mount Rushmore because they considered the monument upsetting to Indians, and rejected an essay on peanuts because some students might be allergic to them. Dr. Ravitch said the bias reviewers exercise a "regime of censorship."

But others defend the system. [Item writer] Ms. Oberley said the reviewers did point out legitimate problems. An example, she said, was a question she wrote to measure kindergarten students' comprehension of the word "driveway." It included sketches of a driveway leading to a suburban garage, of cars on an urban boulevard, and of others on a freeway.

"We have many gravel roads and few paved highways," an American Indian reviewer wrote. "Our children may think these are all driveways."

Ah, the issue of sanitizing test items. I'm not at all surprised that the NYT found someone willing to defend it, although the example given is pretty tame, and falls more into the category of legitimate cultural bias, rather than the victimology or historical revisionism issues that Ms. Ravitch condemns. Making allowances for Native American kids who might not be used to seeing paved roads is different from validating any dislike those same kids have for Mount Rushmore by removing a test item about it.

Posted by kswygert at July 16, 2003 05:37 PM