March 10, 2005

Statistics Term of the Day: Scales of Measurement

Today we'll cover something simple: Scales of measurement. They're not tricky, but they're important, especially when it comes to deciding what inferential statistics can be used, and what conclusions can be made.

And we'll mix some catblogging in here as well.

First, there's the nominal, or categorical, scale. This really isn't a quantitative scale at all, but a qualitative grouping. A survey item that asks a community, "What different kinds of cats do you own?" is nominal; the responses might be "tabby," "Siamese," "Maine Coon," etc. The correct descriptives here are counts and the mode; when we get to inferential statistics, you'll learn about non-parametric analyses such as chi-square that are suitable for categorical data (for example, a chi-square test of independence could help you answer the question, "Is type of cat owned independent of college major?").
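Here's a rough sketch of what those nominal-scale descriptives, and the chi-square statistic itself, look like in Python. The survey responses and the major-by-cat-type table are made up for illustration, and `chi_square_statistic` is just a hypothetical helper name, not a library function.

```python
from collections import Counter

# Hypothetical survey responses (invented for illustration).
cats = ["tabby", "Siamese", "tabby", "Maine Coon", "tabby", "Siamese"]

# Counts and the mode are the descriptives that make sense for nominal data.
counts = Counter(cats)
mode, mode_count = counts.most_common(1)[0]
print(counts)
print("mode:", mode)


def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are two college majors, columns are two cat types.
table = [[20, 10],
         [10, 20]]
print("chi-square:", chi_square_statistic(table))
```

With a 2 x 2 table there is 1 degree of freedom, so you'd compare the statistic against the usual critical value of about 3.84 for a .05 test.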

Next, there's the ordinal scale. Think "order" when you hear ordinal, because that's what this scale preserves. Class rank, movie ratings, the "AmIHotOrNot" ten-point attractiveness scale - all of these group observations and preserve the order of observations (a perfect 10 is cuter than a 6, a movie that gets 5 stars is better than one with 3 stars), but you don't know how much cuter, or better, observations with the higher values are. With ordinal scales, the mode and median are useful, along with the inter-quartile range.
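As a small sketch of those ordinal descriptives, here's one way to get the mode, median, and inter-quartile range with Python's standard `statistics` module. The star ratings are invented, and quartile conventions differ between packages, so treat the exact cut points as one reasonable choice rather than the only one.

```python
import statistics

# Hypothetical 5-star movie ratings (invented for illustration).
ratings = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5]

mode = statistics.mode(ratings)
median = statistics.median(ratings)

# quantiles(n=4) returns the three quartile cut points (Python 3.8+).
q1, q2, q3 = statistics.quantiles(ratings, n=4)
iqr = q3 - q1

print("mode:", mode)
print("median:", median)
print("inter-quartile range:", iqr)
```

Notice that nothing here adds or averages the ratings; everything depends only on their order.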

In the photo below (taken tonight at the intake center), the kitties are in position 1 (top), 2 (middle), and 3 (bottom), but just knowing their value on an ordinal scale doesn't tell you how far apart they are:

ordinalkitties.jpg

Next on the list there's the interval scale. This scale groups observations, preserves the order, and tells you how far apart each observation is. Each point on an interval scale represents the same magnitude on the trait being measured, no matter where on the scale you are. The classic example of a true interval scale is temperature in Fahrenheit. When it's 30 degrees out, it's 10 degrees warmer than when it's 20 degrees; when it's 80, it's 5 degrees cooler than when it's 85.

However, zero on the Fahrenheit scale is an arbitrary point rather than a true zero, which is what keeps it from being the next level of measurement - ratio. Ratio scales have a true zero point, so not only does one unit's difference mean the same thing across the scale, but you can also say that 4 units on a ratio scale is twice as high as 2 units. 30 degrees F is not twice as hot as 15 degrees F, but 300 kelvins IS twice as hot as 150 kelvins, because the Kelvin scale starts from absolute zero.
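To see why the Fahrenheit ratio is misleading, you can convert both temperatures to kelvins and take the ratio there. This is a minimal sketch; `fahrenheit_to_kelvin` is a helper name chosen for illustration, using the standard conversion formula.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert degrees Fahrenheit to kelvins."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15


# The naive ratio on the interval scale looks like "twice as hot"...
ratio_f = 30 / 15

# ...but on a scale with a true zero the two temperatures are only
# about 3 percent apart.
ratio_k = fahrenheit_to_kelvin(30) / fahrenheit_to_kelvin(15)

print("ratio in Fahrenheit:", ratio_f)
print("ratio in kelvins:", round(ratio_k, 3))

# Differences, on the other hand, survive the conversion: a 10-degree F gap
# is the same number of kelvins wherever it sits on the scale.
gap_low = fahrenheit_to_kelvin(30) - fahrenheit_to_kelvin(20)
gap_high = fahrenheit_to_kelvin(90) - fahrenheit_to_kelvin(80)
print("gaps match:", abs(gap_low - gap_high) < 1e-9)
```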

Many measurements made from direct observation, or in the hard sciences, are on the ratio scale. Time in finishing a race is ratio - the person who finished in 20 minutes took half as long as the person finishing in 40 minutes. If I have 100 bucks in my pocket and you have 150, I have 50 dollars less than you, and the person with $200 has twice as much as me (and let me tell you, I know all about the true zero point when it comes to income).

"Number of stripes on this kitty" is on the ratio scale. If he has 25 stripes, he has half as many stripes as another cat with 50:

ratiokitty.jpg

Note that each scale builds on the one before, and has the qualities of all previous scales. Ratio scales allow you to group observations, rank order, add or subtract scale values, and multiply and divide scale values.
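If it helps to see that "builds on the one before" idea spelled out, here's a toy sketch that accumulates the permitted operations as you move up the scales. The names `SCALES`, `NEW_OPERATIONS`, and `allowed_operations` are made up for this example.

```python
# Scales listed from weakest to strongest.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# What each scale adds on top of the ones before it.
NEW_OPERATIONS = {
    "nominal": "group observations",
    "ordinal": "rank order",
    "interval": "add or subtract scale values",
    "ratio": "multiply and divide scale values",
}


def allowed_operations(scale):
    """Return every operation permitted at this scale, including inherited ones."""
    position = SCALES.index(scale)
    return [NEW_OPERATIONS[name] for name in SCALES[: position + 1]]


for scale in SCALES:
    print(scale, "->", allowed_operations(scale))
```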

Psychological and educational measurements - such as IQ, SAT scores, or personality measures - are not ratio, and not really interval, although they are often treated as such. The issue rests on whether you can say that measures of latent traits really have equal intervals across the scale. Someone with an IQ score of 180 is smarter than a person with a 150 (so the order is preserved), but does that difference of 30 points mean the same thing when we're talking about two people who have scores of 120 and 90, respectively? What about an anxiety scale? Does a five-point difference at the bottom mean the same thing as a five-point difference at the top?

If we have these concerns, why do we often assume psychological/educational scales are interval? Basically, we do it so that we can use descriptive statistics like the mean and standard deviation, and use powerful parametric statistics to make inferences. However, we take our chances in doing this; the appropriateness of our analyses rests on the assumptions that we make about the underlying scale, and the more incorrect we are in our assumptions, the less confidence we'll have that our analyses - and our inferences - are correct.
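Here's a small made-up illustration of that risk. If a scale is really only ordinal, then any order-preserving rescaling of the scores is just as legitimate as the original numbers. In the sketch below, the two groups' anxiety scores are invented, and the square root simply stands in for one such rescaling. The comparison of means flips, while the comparison of medians does not.

```python
import math
import statistics

# Hypothetical anxiety scores for two groups (invented for illustration).
group_a = [1, 1, 36]
group_b = [9, 9, 9]

# An order-preserving rescaling: every score keeps its rank.
rescaled_a = [math.sqrt(x) for x in group_a]
rescaled_b = [math.sqrt(x) for x in group_b]

# On the raw scores, group A has the higher mean...
mean_says_a_higher = statistics.mean(group_a) > statistics.mean(group_b)
# ...but after rescaling, group B does.
mean_says_a_higher_rescaled = statistics.mean(rescaled_a) > statistics.mean(rescaled_b)

# The medians agree before and after: group B is higher both times.
median_says_a_higher = statistics.median(group_a) > statistics.median(group_b)
median_says_a_higher_rescaled = statistics.median(rescaled_a) > statistics.median(rescaled_b)

print("mean, raw scores, A higher?", mean_says_a_higher)
print("mean, rescaled, A higher?", mean_says_a_higher_rescaled)
print("median, raw scores, A higher?", median_says_a_higher)
print("median, rescaled, A higher?", median_says_a_higher_rescaled)
```

The example is deliberately extreme, but it shows the kind of conclusion that depends on the equal-interval assumption being true.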

Posted by kswygert at March 10, 2005 07:18 PM