Intelligence Testing

Lina Lopez Chiappone

Some scholars maintain that the notion of intelligence is a social construct invented by the privileged classes; others maintain that intelligence reflects a biological reality and is a useful tool in life. How best to measure this ability, however, remains a topic of debate, particularly as it affects schoolchildren. Despite divergent opinions on intelligence and its definition, intelligence testing has traditionally been one of the primary tools for identifying mental retardation, learning disabilities, and giftedness in American schools. This entry examines the history of intelligence testing, particularly as it pertains to education, and briefly describes current directions.

Defining Intelligence

The nature of the human intellect has fascinated scholars for centuries. Plato, Aristotle, and other ancient Greeks discussed intelligence and tried to define it. In 1575, Spanish physician Juan Huarte defined intelligence as the ability to learn, exercise judgment, and be imaginative. In 1839, American physician Samuel George Morton championed craniometry, the measurement of skull and brain size to determine the intelligence of different races. Following this line of thinking, phrenology, which uses the shape of and bumps on the head to gauge mental faculties, was also used to determine intelligence. These early efforts, however crude, laid the foundation for the modern study of intelligence over the past 200 years. Nevertheless, a single definition of intelligence has eluded scholarly consensus.

In the eighteenth and nineteenth centuries, psychology began to emerge as a discipline separate from philosophy, mathematics, and biology. However, individuals from these diverse fields continued to influence psychological discourse and the study of intelligence. In 1859, Charles Darwin published On the Origin of Species. The book suggested that the capabilities of humans, like those of lower animals, could be understood through scientific investigation. Darwin’s cousin Sir Francis Galton, strongly influenced by this line of thinking, ran a laboratory in London from 1884 to 1890 where visitors could have themselves measured on a variety of psychophysical tasks that he believed formed the basis of intelligence and could therefore serve to measure it. The tests included deciding which of two weights was heavier and squeezing a hand grip as forcefully as one could. Despite these crude attempts at measurement, Galton is regarded as the father of the intelligence testing movement; he originated two important statistical concepts, regression to the mean and correlation, which aided the development of psychometrics.

The Galtonian tradition was taken to the United States by psychologist James McKeen Cattell, who headed the psychological laboratory at Columbia University. Cattell had studied with Wilhelm Wundt at Leipzig, where the first psychological laboratory was founded in 1879. In 1890, Cattell proposed that psychology could not attain the certainty and exactness of the physical sciences unless it rested on a foundation of experiment and measurement. He proposed fifty psychophysical tests very similar to Galton’s, including identification of colors and reaction time for sound. Cattell’s measures, like Galton’s, had little predictive value for intellectual functioning. Ironically, one of his own students, Clark Wissler, collected data showing that scores on Galtonian tasks were not good predictors of grades in college, or of much else. Cattell continued his psychometric research and, with Edward L. Thorndike, built the principal center in the United States for mental testing and measurement.

The more influential tradition of mental testing was developed by French psychologist Alfred Binet and his collaborator Theodore Simon in France in 1905 to help identify learning-impaired students. Even before Wissler’s research, Binet had rejected the Galtonian tradition, believing Galton’s tests measured trivial abilities. He proposed that intelligence tests should instead measure judgment, comprehension, and reasoning—the same kinds of skills measured on most intelligence tests today. Although crude by today’s standards, the Binet-Simon intelligence scale yielded results that generally accorded with common understandings of high and low intelligence, and it is considered to be the first modern intelligence test.

The Arrival of IQ

Binet’s early test was taken to the United States by Stanford University psychologist Lewis Terman, who revised and standardized it in 1916. In Terman’s Stanford-Binet Intelligence Scale, scoring was indicated by the intelligence quotient (IQ), a term proposed by German psychologist William Stern in 1912. IQ was calculated by dividing mental age (as determined by the test) by chronological age and multiplying by 100 to remove the decimal. Modern intelligence tests, including the current Stanford-Binet, no longer compute scores using the IQ formula. Instead, they give a score reflecting how far a person’s performance deviates from the average performance of others of the same age, with that average arbitrarily set at 100. By convention, many people still use the term IQ to refer to a score on an intelligence test.
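The ratio formula described above can be stated compactly; the ages in the worked example are purely illustrative:

```latex
\mathrm{IQ} = \frac{\text{mental age}}{\text{chronological age}} \times 100
% Illustrative example: a child with a mental age of 10
% and a chronological age of 8 would score
% IQ = (10 / 8) \times 100 = 125
```

By this definition a child performing exactly at age level scores 100, which is why that figure survived as the conventional average in modern deviation scoring.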

In the early 1920s, as IQ testing was gaining momentum, so was the eugenics movement. Sterilization laws were passed in sixteen American states between 1907 and 1917, with the elimination of mental retardation as their goal. U.S. Supreme Court justice Oliver Wendell Holmes, Jr., in an opinion upholding the constitutionality of such laws (Buck v. Bell, 1927), wrote, “Three generations of imbeciles are enough.” In 1912, Henry Goddard administered intelligence tests to immigrants at Ellis Island. A biologist was appointed to the House Committee on immigration, charged with keeping American intelligence high by implementing immigration policies based on such findings.

During World War I, a group of American psychologists led by Robert Yerkes developed intelligence tests for the Army. The Army Alpha exam for literate recruits and the Army Beta exam for illiterate or non-English-speaking ones were used to sort men for service. The examination of nearly 2 million recruits marked the world’s first mass administration of an intelligence test. The exams have been criticized for lacking standardization and for assigning recruits to the wrong exam. Despite these shortcomings, group intelligence tests gained in popularity and credibility. Many of the standardized tests in use today can be traced back to the Army tests, including the Scholastic Aptitude Test (SAT) and the Wechsler Intelligence Scale for Children (WISC-IV), the fourth revision of David Wechsler’s classic 1949 test and one of the most widely used psychometric tools.

Other views on intelligence have also influenced how the construct is measured. In 1904, British psychologist Charles Spearman found that people who did well on one mental test tended to do well on others, regardless of their content. He reasoned that the different tests must draw on the same global capacity, which he called g, for general intelligence; performance on any particular test, he argued, also depended on one or more specific factors (called s). Later factor analytic theories of intelligence, to which Thorndike, Thurstone, and others made significant contributions, held that intelligent behavior is multidimensional rather than a unitary trait.

The American obsession with testing continued to grow and took center stage in the 1950s, when the launch of Sputnik called the U.S. school curriculum into question, leading to more testing for accountability to prevent further “failure.” In 1969, educational psychologist Arthur Jensen of the University of California, Berkeley, wrote that programs like Head Start fail because many of the participants have fixed, low IQs. These comments, viewed as having racial overtones, started a new wave of controversy. Even today, controversy continues over IQ variations between groups based on race and socioeconomic status. Most scholars agree that there are IQ differences between these groups; the reasons for those differences, however, ignite debate in both academic and public arenas.

Controversies and Current Directions

In the 1981 bestseller The Mismeasure of Man, biologist Stephen Jay Gould was highly critical of “unscientific” intelligence tests, which, he argued, are often used to find that oppressed groups are inferior and deserve their status. Continuing the whirlwind of debate, in 1994, The Bell Curve, by Richard Herrnstein and Charles Murray, claimed that there are substantial individual and group differences in intelligence that are inherited, genetic differences and therefore not subject to easy environmental control. Herrnstein and Murray set out to prove that American society was becoming increasingly meritocratic, with wealth and other positive social outcomes being distributed more according to people’s intelligence and less according to their social backgrounds.

In 1995, the American Psychological Association published a report on the status of intelligence research, in part to address the controversy surrounding The Bell Curve. Its findings included that IQ scores do have predictive validity for individual differences in school achievement, as well as for adult occupational status, and that individual (though specifically not population) differences in intelligence are substantially influenced by genetics. The report also stated that the large differences between the IQ scores of Blacks and Whites could not be attributed to biases in test construction; it suggested explanations based on social status and cultural differences, while acknowledging that there was no empirical evidence to support them. There was also little direct evidence for a genetic component to racial differences in scores.

Intelligence testing has been accused of unfairly stratifying or tracking students and adults according to race, gender, class, and culture, and of failing to tap creativity and practical know-how. In response to this criticism, new tests and revised editions of long-standing tests have updated their content, format, and interpretation of data to reflect a diverse school-age demographic of varied cultural and linguistic backgrounds, and to reflect new notions of intelligence based on modern theories of brain function. Yale psychologist Robert Sternberg has developed a triarchic theory of intelligence comprising analytic, practical, and creative components. Harvard’s Howard Gardner believes that intelligence includes traditional components such as problem solving, but he rejects the notion of g. Instead, he argues for eight distinct intelligences: linguistic, musical, logical/mathematical, spatial, bodily/kinesthetic, intrapersonal, interpersonal, and naturalist.

Although Gardner and Sternberg provide a fresh and interesting lens on intelligence, their ideas have not translated easily into practical classroom measures. However, Sternberg and his collaborators are working with the College Board, which administers the SAT, to devise a supplement to the traditional test for college hopefuls. Scores from the Sternberg Triarchic Abilities Test are expected to better predict college success and to increase equity among ethnic groups in the admissions process.

Despite these changes, psychologists tend to agree that a single score on any one measure of intelligence is insufficient for determining appropriate placement and services for schoolchildren. The trend among school psychologists is to collect data on children’s strengths as well as weaknesses using a variety of assessment instruments, to gather anecdotal information from the school and home settings, and to use all of these data to devise a plan for early intervention that takes into account the whole child, not just a score on a test.