Interpretation Problems of Age and Grade Equivalents

Because of the inherent psychometric problems associated with age and grade equivalents that seriously limit their reliability and validity, these scores should not be used for making diagnostic or placement decisions (Bracken, 1988; Reynolds, 1981).

The reliability of age- and grade-equivalent scores is limited by the relationship between the equivalents and the raw scores on which they are based. An age or grade equivalent is simply the age or grade level at which a given raw score is the median. Because the acquisition of skills measured by an instrument such as a vocabulary test occurs more rapidly during early ages, raw scores increase at a greater rate with younger examinees than with older examinees. Therefore, a similar change in the raw scores of younger examinees and of older examinees will be represented quite differently in age-equivalent scores.

For example, the age equivalent for a raw score of 50 on the PPVT-III is 4 years 0 months. The age equivalent for a raw score of 55 is 4 years 4 months. A change of 5 raw score points at this early age reflects a change of 4 months in terms of "age-equivalent" scores. However, the age equivalent for a raw score of 165 is 16 years 4 months, and for a raw score of 170 it is 18 years 2 months. At the later age, a raw score change of 5 points results in a difference of 22 months, almost 2 years, in terms of "age-equivalent" scores.
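The arithmetic in this example can be checked with a short Python sketch. It uses only the four raw-score and age-equivalent pairs quoted above; the function names are illustrative and do not come from any test manual.

```python
def to_months(years, months):
    """Convert an age given as years and months into total months."""
    return years * 12 + months

# (raw score, age equivalent in months) pairs quoted in the text for the PPVT-III
young_pair = [(50, to_months(4, 0)), (55, to_months(4, 4))]
older_pair = [(165, to_months(16, 4)), (170, to_months(18, 2))]

def months_per_raw_point(pair):
    """Change in age equivalent (months) per raw score point between two entries."""
    (raw_lo, age_lo), (raw_hi, age_hi) = pair
    return (age_hi - age_lo) / (raw_hi - raw_lo)

young_rate = months_per_raw_point(young_pair)  # 4 months over 5 points
older_rate = months_per_raw_point(older_pair)  # 22 months over 5 points

print(f"Younger examinee: {young_rate:.1f} months per raw score point")
print(f"Older examinee:   {older_rate:.1f} months per raw score point")
```

With these values, one raw score point moves the age equivalent by less than a month at age 4 but by more than 4 months at age 16, which is why a single item answered differently can shift an older examinee's age equivalent so sharply.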

A 6-month delay in age-equivalent terms may represent a greater difference in test performance for a young examinee than for an older one. For the younger examinee, the 6-month delay corresponds to a larger difference in raw score points (and in the skills being measured). The older examinee is at a higher developmental level and is acquiring skills at a slower pace as he or she approaches competency, so the same 6-month delay may be caused by a difference of only 1 or 2 raw score points. As the ceiling of the assessment is reached, smaller and smaller changes in raw scores translate into larger and larger changes in age-equivalent scores.

Therefore, the reliability of age-equivalent scores is much poorer for advanced test-takers (McCauley & Swisher, 1984). This is why many assessments do not report age or grade equivalents beyond a specified age or grade level. For example, in the OWLS Written Expression Scale, age-equivalent scores are not reported after age 12 and grade-equivalent scores are reported only up to grade 6. The acquisition of writing skills occurs most rapidly during the early years because writing mechanics are taught in the primary grades. Among examinees with advanced writing skills, differences in ability show up as only small changes in score points.

Standard scores are a more accurate representation of an examinee's ability because they are based not only on the mean at a given age level but also on the distribution of scores.
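As a rough illustration of how a standard score takes the spread of scores into account, the sketch below converts a raw score into a standard score from an age group's mean and standard deviation. The scale (mean 100, standard deviation 15) and all norm values are assumptions chosen for illustration, not figures from any published test. Published instruments typically derive standard scores from their own norm tables rather than from a simple formula like this one.

```python
def standard_score(raw, age_mean, age_sd, scale_mean=100.0, scale_sd=15.0):
    """Linear standard score: distance from the age-group mean in SD units,
    re-expressed on an assumed scale with mean 100 and SD 15."""
    z = (raw - age_mean) / age_sd
    return scale_mean + scale_sd * z

# Hypothetical norms (illustrative only): raw-score mean and SD for two age groups.
young_norms = {"age_mean": 55.0, "age_sd": 10.0}
older_norms = {"age_mean": 170.0, "age_sd": 5.0}

# A younger examinee 10 raw points below the mean and an older examinee 2 points below.
young_ss = standard_score(45, **young_norms)
older_ss = standard_score(168, **older_norms)

print(f"Younger examinee standard score: {young_ss:.0f}")
print(f"Older examinee standard score:   {older_ss:.0f}")
```

Because each raw score is judged against the distribution for the examinee's own age group, a standard score has the same meaning at every age, which an age equivalent does not.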

Standard scores can also be compared and summarized arithmetically. Age and grade equivalents are not on a ratio or interval scale of measurement, so they cannot be meaningfully added, subtracted, or averaged.


Bracken, B.A. (1988). Ten psychometric reasons why similar tests produce dissimilar results. Journal of School Psychology, 26, 155-166.

McCauley, R.J., & Swisher, L. (1984). Use and misuse of norm-referenced tests in clinical assessment: A hypothetical case. Journal of Speech and Hearing Disorders, 49, 338-348.

Plante, E. (1998). Criteria for SLI: The Stark and Tallal Legacy and Beyond. Journal of Speech, Language, and Hearing Research, 41, 951-957.

Reynolds, C.R. (1981). The fallacy of "two years below grade level for age" as a diagnostic criterion for reading disorders. Journal of School Psychology, 19(4), 350-358.