Focus on misdiagnoses of special education disabilities has centered predominantly on the overrepresentation of Black males in mental retardation classrooms. While it is unknown how many misdiagnoses occur each year, they have been systematically and historically documented across all ethnic minority populations (e.g., Hispanic, Native American, and Asian American) and special education categories (Reschly, 1981). Indeed, ethnic disproportionalities have been reported in the specific learning disability (Barona, Santos de Barona, & Faykus, 1993; Payette & Clarizio, 1994) and emotional disturbance (Harris-Murri, King, & Rostenberg, 2006; Yeh, Forness, Ho, McCabe, & Hough, 2004) categories, whereas in gifted and talented classrooms, ethnic minorities and females are underrepresented and have a lower retention rate than their white, male counterparts (Moore, Ford, & Milner, 2005).

Based on these findings, researchers in psychology and education have emphasized the importance of cultural sensitivity and awareness in professionals who refer, test, and diagnose disabled children, especially minorities. In particular, the ways in which standardized tests are used and interpreted have been criticized given the weight an IQ score has on every special education diagnosis (e.g., Reschly, 1981). Because of an IQ phenomenon known as the Flynn effect, however, each one of the over 2 million children who are evaluated every year on IQ tests as part of their special education evaluation is at risk for misdiagnosis, regardless of ethnicity and gender.


IQ tests are used in all special education diagnosis decisions, but they play a fundamental role in mental retardation (MR) and specific learning disability (SLD) diagnoses. Under federal guidelines, a child who displays “significantly subaverage general intellectual functioning, with deficits in adaptive behavior, and manifested during the developmental period (prior to age 18), that adversely affects a child's educational performance” will receive MR services (Assistance to States for the Education of Children with Disabilities, 2006, p. 46756). While each state has the flexibility to develop its own criteria (as long as they do not conflict with the criteria set forth by IDEA), “significantly subaverage general intellectual functioning” is usually determined by an IQ score of 70 points or below, as recommended by the American Psychiatric Association (APA, 1994) and the American Association on Intellectual and Developmental Disabilities (formerly known as the American Association on Mental Retardation; AAMR, 2002). Often, obtaining an IQ score below the cut-off of 70 points is enough to meet the criteria for MR.

Under the same guidelines, SLD is defined as “a disorder in one or more of the basic psychological processes involved in understanding or in using language, spoken or written, that may manifest itself in an imperfect ability to listen, think, speak, read, write, spell, or to do mathematical calculations” (Assistance to States for the Education of Children with Disabilities, 2006, p. 46757). There is much variability, however, across states in the ways in which SLD is diagnosed. A commonly used method is to determine if a student's IQ is significantly higher than his or her achievement test score, although this methodology is highly criticized among researchers and practitioners (e.g., Reschly & Hosp, 2004) and is no longer required in the Code of Federal Regulations. Regardless, a child must display at least average intellectual functioning (usually defined as an IQ of 85 points or higher) in order to qualify for SLD services. Thus, many educational diagnoses require scoring above or below a specific IQ score cut-off to qualify.

SLD is an umbrella term for a diverse array of cognitive disabilities that can range from brain injury to dyslexia. Accordingly, services can range from providing students with extra time on tests to hiring resource room specialists. While the latter services can be costly, they are implemented less frequently than the former services, which are substantially less expensive. Thus, while SLD is the most common special education diagnosis, accounting for almost half (48.5%) of the five million children in special education in 2002 (U.S. Department of Education, 2004), the costs of providing SLD services can vary widely.

MR, by contrast, is far less frequently diagnosed, accounting for less than 10% of the special education population (U.S. Department of Education, 2004). It entails, however, a more consistent and costly set of services than SLD does. Common services offered to MR students include modified regular assignments and more extensive educational interventions, such as one-to-one instruction in a self-contained classroom, instruction from specialists, and support from trained aides (Singer, Butler, Palfrey, & Walker, 1986).


The Flynn effect refers to the steady rise in IQ scores seen between 1945 and the early 2000s. It is named after James Flynn (1984, 1987), the political scientist credited with its most extensive documentation. Flynn and others (e.g., Te Nijenhuis & Van der Flier, 2007; Rodgers & Wanstrom, 2007) have observed this systematic IQ rise in 29 nations, spanning 5 continents, including both developed and developing nations. The estimated magnitude of these gains ranges from 5 to 25 points within a single generation, depending on the country and type of IQ test used. In the United States, this rise is estimated to be approximately 3 points a decade on the Wechsler Intelligence Scale for Children (WISC) and the Stanford-Binet measures.

The wide range in gains between (as well as within) countries can be attributed to the IQ test that is used. Gains are higher on measures that rely heavily on tests of fluid abilities, or on-the-spot reasoning and abstract abilities, than they are on tests of crystallized abilities, or accumulated knowledge. Tests of crystallized intelligence, such as the vocabulary and arithmetic subtests on the WISC, show very small gains (only 2 to 3 points) between 1947 and 2002, whereas subtests that measure abstract reasoning, visual cognition, on-the-spot reasoning, and working memory show gains of up to 25 points during the same time period (Flynn, 2007).

The Flynn effect is measured in terms of IQ gains that result from people doing significantly better on an old IQ test that was created as far back as 25 years before their time (and has not been changed in that interval) when compared to their performance on a more current test whose content has been updated. In other words, as IQ test norms get older, people perform better on them, raising the mean IQ by several points within a matter of years. Specifically, within 20 years, the average IQ in the U.S. population rose from 100 to 106. To compensate for this upward creep, IQ tests are re-normed, at which point the test is made harder. Re-normed tests reset the mean back to 100, which, in turn, hides the previous gains from the old norm. Therefore, the same person will score higher on an outdated IQ test than on a brand new IQ test, even if both tests are administered on the same day.
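The arithmetic behind this upward creep can be illustrated with a minimal Python sketch. It assumes a constant, linear rise of 3 points per decade (the U.S. estimate cited above); actual gains vary by test, subtest, and population, and the function names are hypothetical, introduced only for illustration.

```python
# Assumption: a constant rise of about 3 IQ points per decade (U.S. estimate).
POINTS_PER_DECADE = 3.0


def expected_inflation(norm_year, test_year, rate=POINTS_PER_DECADE):
    """Expected score inflation (in IQ points) from testing on aged norms."""
    norm_age = test_year - norm_year
    if norm_age < 0:
        raise ValueError("test_year must not precede norm_year")
    return rate * norm_age / 10.0


def adjusted_score(observed, norm_year, test_year, rate=POINTS_PER_DECADE):
    """Score the same performance would be expected to earn on fresh norms."""
    return observed - expected_inflation(norm_year, test_year, rate)


# Norms that are 20 years old inflate the mean from 100 to about 106.
print(expected_inflation(1980, 2000))   # 6.0
print(adjusted_score(72, 1980, 2000))   # 66.0
```

Under these assumptions, a child who scores 72 on norms that are 20 years old would be expected to score about 66 on a freshly normed test, which falls on the other side of the commonly used 70-point cut-off. The sketch is purely illustrative and is not a recommended clinical correction.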


Given the fluctuations in IQ created by the Flynn effect when old IQ tests are re-normed, coupled with the heavy use of IQ cut-off scores, the diagnoses of MR and SLD are problematic. As scores rise on aging norms and more students surpass the 70 IQ cut-off, fewer students are classified MR. Similarly, more students are classified SLD as more students surpass the 85 IQ cut-off and the discrepancy between IQ and achievement widens. Both of these trends occur regardless of the students' actual cognitive ability. When scores fall with the onset of new IQ norms, these diagnostic trends will reverse; suddenly, MR diagnoses will increase, while SLD diagnoses will decrease. Again, these changes will occur not because the child's cognitive ability has changed but because the child was tested on a new IQ norm.
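A rough sense of how much a fixed cut-off can shift with the age of the norms comes from a simple model. The sketch below assumes that IQ scores are normally distributed with a standard deviation of 15, that the population mean on a given norm drifts upward by 3 points per decade, and that 70 is the MR cut-off; the function name is hypothetical, and the output illustrates the model only, not any empirical finding.

```python
from statistics import NormalDist

# Assumptions: normal distribution, SD of 15, linear drift of 3 points/decade.
SD = 15.0
MR_CUTOFF = 70.0
POINTS_PER_DECADE = 3.0


def share_at_or_below_cutoff(norm_age_years, cutoff=MR_CUTOFF):
    """Modeled share of the population scoring at or below the cut-off
    when tested on norms that are `norm_age_years` old."""
    mean_on_aged_norms = 100.0 + POINTS_PER_DECADE * norm_age_years / 10.0
    return NormalDist(mu=mean_on_aged_norms, sigma=SD).cdf(cutoff)


fresh = share_at_or_below_cutoff(0)    # about 2.3% on brand new norms
aged = share_at_or_below_cutoff(20)    # about 0.8% on 20-year-old norms
print(f"{fresh:.2%} vs. {aged:.2%}")
```

In this model, the share of children who fall at or below the cut-off nearly triples when 20-year-old norms are replaced with new ones, even though no child's ability has changed.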

While Flynn's seminal analyses excluded individuals with cognitive disabilities, several studies have directly explored the Flynn effect among children in special education. Many researchers have reported that children receiving special education services lose, on average, some 5 to 9 IQ points when tested on a newly normed IQ test (e.g., Wechsler, 1991). More alarmingly, Kanaya, Scullin, and Ceci (2003), using a nationally diverse sample, found that over 30% of the students in their sample who tested above the MR cut-off score of 70 points on the WISC-R (Wechsler, 1974) tested below the cut-off when retested on the more recently normed WISC-III (Wechsler, 1991). This, in turn, resulted in a threefold increase in MR diagnoses, simply due to the test norms used and the year tested, even though the students' actual cognitive ability did not decline.

In addition, among children initially diagnosed with SLD on the WISC-R, Gaskill and Brantley (1996) found that more than 40% of the students in their sample no longer met the criteria for SLD when retested on the newer, harder WISC-III. This finding was replicated by Truscott and Frank (2001), who found that the number of SLD diagnoses in a school district decreased with the introduction of the new norm because students no longer had a significant discrepancy between their IQ scores and achievement test scores.

Thus, MR and SLD students showed a significant drop in measured IQ when retested on a newer norm, and this drop often led to a change in their diagnosis. More in-depth analyses also suggest there are individual differences in the Flynn effect. For example, Kanaya, Ceci, and Scullin (2005) discovered a larger Flynn effect among younger children tested on the original WISC norms (Wechsler, 1949), compared to older children on subsequent WISC-R norms. In addition, findings by Sanborn, Truscott, Phelps, and McDougal (2003) suggest that the Flynn effect is diminished in SLD children at lower IQ levels when compared to SLD children at higher IQ levels.


Due to the Flynn effect and the use of IQ cut-off scores in special education policies, children are diagnosed based on the year tested and test norms used rather than their actual cognitive ability. Therefore, children who are tested on old IQ norms and misdiagnosed due to their inflated scores will not qualify for or receive the resources of the appropriate education that is guaranteed to them by law. After the initial diagnosis, in order to accommodate changes that may occur in children's needs over time, federal guidelines require periodic re-evaluations for all children in special education. An IQ test is often administered as part of this re-evaluation to determine if the current educational program is meeting the children's needs and if services should be modified or discontinued. At this time, children run the risk of receiving a diagnosis based on a norm that has continued to age since their initial diagnosis, thus obtaining a further inflated score. Therefore, the services provided and the financial costs of providing them to disabled children can vary widely throughout the course of those children's education.

Consequently, the services and resources provided to misdiagnosed children will change independently of their actual cognitive ability and educational needs. In other words, schools will misallocate their financial resources, and the students may not receive the services they need. Such findings and implications call into question the use of IQ cut-off scores in special education diagnoses and suggest that researchers, teachers, administrators, and policy makers must recognize the role of the Flynn effect when determining the educational needs of children with disabilities.


American Association on Mental Retardation. (2002). Mental retardation: Definition, classification, and systems of supports (10th ed.). Annapolis, MD: Author.

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.

Assistance to States for the Education of Children with Disabilities. (2006). 71 Fed. Reg. 46753.

Barona, A., Santos de Barona, M., & Faykus, S. P. (1993). The simultaneous effects of sociocultural variables and WISC-R factors on MR, LD and non placement of ethnic minorities in special education. Education and Training in Mental Retardation, 28, 66–74.

Flynn, J. R. (1984). The mean IQ of Americans: Massive gains 1932 to 1978. Psychological Bulletin, 95, 29–51.

Flynn, J. R. (1987). Massive IQ gains in 14 nations: What IQ tests really measure. Psychological Bulletin, 101, 171–191.

Flynn, J. R. (2007). What is intelligence? New York: Cambridge University Press.

Gaskill III, F. W., & Brantley, J. C. (1996). Changes in ability and achievement scores over time: Implications for children classified as learning disabled. Journal of Psychoeducational Assessment, 14, 220–228.

Harris-Murri, N., King, K., & Rostenberg, D. (2006). Reducing disproportionate minority representation in special education programs for students with emotional disturbances: Toward a culturally responsive response to intervention model. Education and Treatment of Children, 29, 779–799.

Kanaya, T., Ceci, S. J., & Scullin, M. H. (2005). Age differences in secular IQ trends: An individual growth modeling approach. Intelligence, 33, 613–621.

Kanaya, T., Scullin, M. H., & Ceci, S. J. (2003). The Flynn effect and U.S. policies: The impact of rising IQ scores on American society via mental retardation diagnoses. American Psychologist, 58, 1–13.

Moore, III, J. L., Ford, D. Y., & Milner, H. R. (2005). Recruitment is not enough: Retaining African American students in gifted education. Gifted Child Quarterly, 49, 51–67.

Payette, K. A., & Clarizio, H. F. (1994). Discrepant team decisions: The effects of race, gender, achievement, and IQ on LD eligibility. Psychology in the Schools, 31, 40–48.

Reschly, D. J. (1981). Psychological testing in educational classification and placement. American Psychologist, 36, 1094–1102.

Reschly, D. J., & Hosp, J. L. (2004). State SLD identification policies and practices. Learning Disability Quarterly, 27, 197–213.

Rodgers, J. L., & Wanstrom, L. (2007). Identification of a Flynn effect in the NLSY: Moving from the center to the boundaries. Intelligence, 35, 187–196.

Sanborn, K. J., Truscott, S. D., Phelps, L., & McDougal, J. L. (2003). Does the Flynn effect differ by IQ level in samples of students classified as learning disabled? Journal of Psychoeducational Assessment, 21, 145–159.

Singer, J. D., Butler, J. A., Palfrey, J. S., & Walker, D. K. (1986). Characteristics of special education placements: Findings from probability samples in five metropolitan school districts. Journal of Special Education, 20, 319–337.

Te Nijenhuis, J., & Van der Flier, H. (2007). The secular rise in IQs in the Netherlands: Is the Flynn effect on g? Personality and Individual Differences, 43, 1259–1265.

Truscott, S. D., & Frank, A. J. (2001). Does the Flynn effect affect IQ scores of students classified as SLD? Journal of School Psychology, 39, 319–334.

U.S. Department of Education. (2004). Twenty-sixth annual report to Congress on the implementation of the Individuals with Disabilities Education Act. Washington, DC: Author.

Wechsler, D. (1949). The Wechsler Intelligence Scale for Children Manual. New York: Psychological Corporation.

Wechsler, D. (1974). The Wechsler Intelligence Scale for Children (Rev. manual). New York: Psychological Corporation.

Wechsler, D. (1991). The Wechsler Intelligence Scale for Children Manual (3rd ed.). New York: Psychological Corporation.

Yeh, M., Forness, S. R., Ho, J., McCabe, K., & Hough, R. L. (2004). Parental etiological explanations and disproportionate racial/ethnic representation in special education services for youths with emotional disturbance. Behavioral Disorders, 29, 348–358.