October 15, 2003

Answering Mathews' "Bias Question"

John Rosenberg bopped me on the head this afternoon with a reminder that the "Bias!" crowd is still at it. The latest installment? An article by Jay Mathews in The Atlantic online entitled "The Bias Question," in which Mr. Mathews swallows former ETS employee Roy Freedle's suggested SAT revisions hook, line, and sinker.

There's a lot of interesting backstory here, mostly about Dr. Freedle's research experience at ETS and his interest in linguistics, which formed the basis for his research and the Harvard Educational Review article, "Correcting the SAT's Ethnic and Social-Class Bias: A Method for Reestimating SAT Scores."

John dispatched this article rather quickly, by noting that Mathews had concluded with the following fallacy:

[According to test critics] The problem [with minority score gaps], in short, is with the test rather than with any reality the test measures.

Now, is it my imagination, or does John seem weary of the repetitive nature of these sorts of test "criticism"? Perhaps I only hear that weariness because I, too, am weary of having to state the same arguments over and over again, while the testing critics shut their eyes and ears. My weariness means that I simply don't want to do a long, in-depth criticism of the Mathews article. Much of what I would have to say, I said before when I criticized the original Freedle coverage in the Chronicle, and when I criticized the bias allegations by another researcher in a more recent Chronicle article.

I suggest you go read the Mathews article for yourself, but there are a few remarks by Mathews that I simply have to address:

Freedle's accusation of racial bias in the SAT is striking because it is one of the few ever to come from an experienced ETS professional...

Striking, perhaps, but why assume that because it is one of the few, it must be right? Or even that it should be given more than a passing glance? I sense here an assumption (which I see from many testing critics) that ETS is an evil place that deliberately manipulates tests so that minorities fail, and no one there is allowed to contradict the "party line." Therefore, if anyone who used to work at ETS makes a statement that contradicts claims by ETS or the College Board, then that person must be right....why? Because what ETS/CB claims cannot be true?

I mean, imagine this statement:

Dr. Smith's insistence that the germ theory of disease is incorrect is striking because it is one of the few ever to come from an experienced medical professional...

Would you conclude just from this statement that Dr. Smith must have stumbled upon something that has escaped all other doctors up until now? Or would you conclude that he's an eccentric and perhaps confused researcher determined to push an agenda that contradicts the established research?

You can take "eccentric" for what you will, but unless the assumption is that the medical community (or ETS) has conspired to "hide the truth", there's no reason to conclude that one researcher from "the inside" who insists that all the established research is wrong, and that we must now start doing everything a different way, is automatically believable.

Perhaps more important, it has caught the attention of the University of California (a powerful malcontent in the College Board family), which has ordered its own detailed analysis of the issue, due to be completed in 2004. Even if Freedle is ultimately proved wrong, his success at raising doubts about the SAT shows how loose a grip the test has on the political and scientific handholds that keep it upright.

That's funny. I would assume that if Freedle's research was proven to be wrong, that would be one more link in the chain of evidence that the SAT can be a useful tool for predicting first-year college performance for all test-takers. Again, the assumption here seems to be that ETS and the College Board support the SAT only through a conspiracy of hidden research, and the fact that this one person came to light must be proof that the whole house of cards is about to fall down.

I've previously addressed Freedle's suggestion that we tinker with the test solely to decrease the test-score gap without decreasing the education gap, and the College Board has addressed the fallacies of his research. But I still can't let this one portion of Freedle's research, as described by Mathews, go by:

Common words, Freedle explained, "often have many more semantic (dictionary) senses than rare words," so there's more of a chance that people's cultural and socio-economic backgrounds will affect their interpretations of those words. (In a 1990 study Freedle and Kostin reported that "fifteen high-frequency analogy words ... had an average of 5.2 dictionary entries, whereas rare analogy words ... had an average of only 2.0 dictionary entries.") Thus words that are frequently used in the middle-class neighborhoods of the SAT makers may have a different meaning in underprivileged minority neighborhoods. This, Freedle continued, could help explain why African-American students do worse on questions containing those common words than on questions that depend on the harder (but less ambiguous) words they study at school. He found that this effect was most pronounced on those questions—sentence completions, analogies—that provided little or no context.

This is Freedle's argument for why African-American test-takers should be helped on the exam by a revised scoring system which scores only the hard items, and not the easier ones. Leaving out the very large question of why items should be given at all if they're not going to be scored, I have to address the "educational relativism" that's holding up this theory.

So, high-frequency words are more likely to be used in different ways, by different people, in different environments. Fine. However, the dictionary lists word usage in specific orders for a reason, and just because a word can be used in different ways, it doesn't necessarily follow that students cannot be expected to know when a particular usage is correct in a particular situation.

When Mathews, and Freedle, say that "words...used in the middle-class neighborhoods of the SAT makers may have a different meaning in underprivileged minority neighborhoods," what they're both desperately trying to avoid saying is that minority students are getting these meanings wrong on tests. They're trying to avoid admitting that in the context of a test item, just as in the context of a portion of writing, words have right and wrong meanings, and there's no reason to give a pass to a student who doesn't learn how to use words in context, and no reason to give their schools an out for not teaching them that.

For example, let's use the noun "deck". The online Merriam-Webster gives the following definitions:

1 : a platform in a ship serving usually as a structural element and forming the floor for its compartments
2 : something resembling the deck of a ship: as a : a story or tier of a building b : the roadway of a bridge c : a flat floored roofless area adjoining a house d : the lid of the compartment at the rear of the body of an automobile; also : the compartment e : a layer of clouds
3 a : a pack of playing cards b : a packet of narcotics
4 : TAPE DECK
- on deck 1 : ready for duty 2 : next in line : next in turn

So, I say "deck," and you might think of the deck behind your house, or being on deck for the next task, or a pack of cards, or a packet of narcotics. But a test might contain this word in an analogy item, such as the one I just made up and don't claim is perfect:

"Deck is to ship as foundation is to (a) house (b) sky (c) bird (d) cow"

The deck of a ship is essentially the structural element which serves as the floor, so it's analogous to the foundation of a house. (Like I said, this isn't perfect. Don't write me emails telling me everything that's wrong with this item.)

This item has little context, but the only thing required of the student is that, at some point, the student has learned the primary meaning of the word. Nothing in Freedle's research convinces me that we cannot expect schools to teach all students the multiple meanings of words, or that we should therefore be willing to change the test scoring to make life easier for students who have not learned how to use words in correct ways.

Isn't Freedle's argument essentially that African-American and white students cannot be expected to learn the same definitions for a set of words within an educational environment, and therefore we (as testmakers) cannot insist that students understand all uses of a word and in which contexts they should be used? If our educational system is short-changing African-American students to the point that they never learn how to use common words in all the ways that they're used in proper written English, the test-score gap is one of the few meaningful indicators of this. Why aren't African-American students learning the same meanings of words as white students, and why should we modify the test to make accommodations for this, rather than demand that schools do their jobs and give all kids as extensive an education in reading comprehension and the English language as possible?

The fundamentally dishonest idea of closing the test-score gap at the end (test results) rather than at the beginning (the teaching of students) pervades this article, and Mathews' conclusion is breathtaking in its wrong-headedness:

Nearly all those involved—ETS and College Board officials, University of California researchers, high school guidance counselors and admissions officers from those schools that would be affected by a change in the SAT—are, like Freedle, practical people with a seemingly distant but still compelling goal. They want to remove barriers that limit young people's choices in life. All of them, Freedle included, acknowledge that many other things, more difficult than devising a scoring supplement to a multiple-choice test, will have to be done to make that happen.

I've emphasized the above because I've said this so many times before, and even though I'm sure I'll be saying it again soon, I want it to be clear. Tests are not barriers. Tests measure skills, and if students do not gain those skills, they are by definition barred from achievement whether the tests are in place or not.

If the items are re-weighted and re-scored so that the score gap closes, this hides the fact that African-American students are actually learning less in school. Freedle's suggestions do not ensure that everyone will go forth (with their "different" understandings of the English language) with the same chance of succeeding in school and in life simply because the "barrier" of the test has been removed.

The barrier is bad education, not the tests that measure educational achievement. Why do reasonable, logical writers like Mathews miss this point time and time again?

Posted by kswygert at October 15, 2003 03:43 PM