
History

 

This page explains the history of the IQ test, specifically the Stanford-Binet IQ test and its antecedents.  The sources I used for the information presented here are listed in my bibliography.

Historical Background

The Early IQ Test

The Modern IQ Test

Controversy

Historical Background

The history of modern intelligence testing begins with Francis Galton (1822 – 1911), an amateur English psychologist of the Victorian era.  Galton was convinced of the importance of heredity to intelligence, basing many of his ideas on the theories of his cousin, Charles Darwin, author of The Origin of Species.  He believed that the individual differences described by Darwin in his theory of evolution applied not only to physical traits, but also to a person’s mental abilities.  The main measure Galton used to study the inheritance of intelligence, however, was a person’s social prominence.  Galton also tried to measure intelligence with a series of tests of reaction time, discrimination of differences in pitch and weight, and other sensory abilities, calling his experiments anthropometrics.  Few psychologists actually thought of themselves as his followers, and he founded no school of psychology.  He was, however, a genius in his own way: he was the first to apply the bell curve to human ability (especially intelligence), he invented the concept of linear regression, and he devised the first self-questionnaire, which he gave to two hundred members of the Royal Society.  He also coined the term “nature versus nurture” (in reference to intelligence) and virtually founded the theory of eugenics.  The impact of Galton's theories and methods can still be felt in the field of psychological testing today.

Galton’s only major follower was James McKeen Cattell (1860 – 1944), professor of psychology at the University of Pennsylvania and later at Columbia University.  Cattell continued to use Galton’s anthropometric methods, but eventually was forced to conclude (using Galton’s own linear regression principles) that the tests had no real correlation to intelligence.  Later, in 1921, he founded a nonprofit institution called the Psychological Corporation to publish psychological tests.  The company remains a major publisher to this day, but has since become a private organization.

 The Early IQ Test

The most famous person in the history of intelligence testing was Alfred Binet (1857 – 1911), who invented the precursor to the modern intelligence test.  Binet lacked a formal education in psychology, but along with Théodore Simon he invented a series of tests designed to pick out mentally retarded children from their fellows so that they could be given special classes.  Binet and Simon invented these tests in 1905 while part of a commission to help solve problems in the school system of Paris, where education for children had recently been made mandatory.  The minister for education had specifically assembled the commission to determine which students were retarded and would need special schooling.  Binet believed that intelligence was expressed through judgment, attention, and reasoning skills, and he constructed his test based on these abilities.  The first tests were simply arranged in ascending order of difficulty, but the 1908 revision grouped the tests by age level.  A test belonged to a given age level if two-thirds to three-quarters of the children of that age could complete it, while a smaller percentage of younger children and a larger percentage of older children could do so.  In this way he invented the concept of mental age.  Unfortunately, the original sample group was quite small.  Binet revised his tests again in 1911.  In 1912 a German psychologist named William Stern suggested that instead of simply using a child’s mental age to determine if they were retarded, the child’s mental age should be divided by his actual (chronological) age, creating the concept of an intelligence quotient, or IQ.
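Stern's comparison of mental age to actual age is usually written as a ratio.  The short Python sketch below illustrates the arithmetic only.  The function name and the sample ages are made up for this page, and the multiplication by 100 is the convention that became standard with later tests such as the Stanford-Binet, not something Stern's original proposal is described as including above.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, scaled by 100.

    The factor of 100 is an assumed (later) convention; the ages passed in
    the examples below are hypothetical.
    """
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return 100.0 * mental_age / chronological_age


# A hypothetical 8-year-old who passes the tests typical of 10-year-olds:
print(ratio_iq(10, 8))  # 125.0
# A hypothetical 8-year-old who passes only the tests typical of 6-year-olds:
print(ratio_iq(6, 8))   # 75.0
```

A child whose mental age equals his actual age scores exactly 100 under this scheme, which is why 100 is treated as the average.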

At the same time that Binet was developing his first scale, an English psychologist, Charles Spearman (1863 – 1945), developed the concept of general intelligence, or the g factor.  Spearman theorized that all intelligence is related to this one general factor plus a number of specific factors.  He noted that when a large number of diverse tests are given to a representative sample, the results are positively correlated; that is, doing well on one test means the individual is more likely to do well on another, unrelated test.  Developing factor analysis to analyze his tests, Spearman determined that if diverse enough tests were given, the specific intelligences related to each test would tend to cancel out and reveal the general intelligence of the individual.  Spearman’s theory of general intelligence heavily influenced the Binet scale, which focuses on finding a single measurement of intelligence (in this case called IQ).
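The idea of positively correlated test results can be illustrated with a small Python sketch.  The three test names and all of the scores are invented purely for illustration, and the function computes an ordinary Pearson correlation coefficient; it is not a factor analysis and is not meant to reproduce the actual procedure used in the original research.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Made-up scores for the same five people on three unrelated-looking tests.
scores = {
    "vocabulary": [12, 15, 9, 20, 17],
    "arithmetic": [10, 14, 8, 18, 13],
    "puzzles": [7, 9, 6, 12, 10],
}

# Every pair of tests shows a positive correlation in this toy data set.
for a, b in combinations(scores, 2):
    print(a, b, round(pearson_r(scores[a], scores[b]), 2))
```

When every pair of tests correlates positively like this, a single underlying factor becomes a plausible explanation, which is the intuition behind g.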

Intelligence testing became most popular in the United States, where its leading proponent was Henry Goddard (1866 – 1957).  Goddard was the first to apply Binet’s test on a mass scale in the US, testing well over two thousand students.  Goddard was also an avid believer in hereditary intelligence and eugenics, and he actively campaigned for legislation at the state and federal levels to make eugenics a part of law.  His testimony, along with that of other psychologists, led to laws requiring sterilization of “defectives” in 27 states by 1931.  Congress also passed immigration laws to stop “idiots” and “lunatics” from entering America, and Goddard was asked to work with immigration officials to improve their testing methods.  Goddard worked at Ellis Island for over a year from 1913 to 1914, and deportations based on “idiocy” subsequently rose 500 percent.  His research, conducted on illiterates who did not speak English and were usually frightened and physically exhausted when the tests were administered, concluded that many immigrants from southern and eastern Europe were retarded.  These results influenced Congress in 1924 to set restrictive immigration quotas for these areas.

Lewis M. Terman (1877 – 1956), professor of psychology at Stanford University, noted that there were many problems with Binet’s tests, and sought to correct them.  His revised test, the Stanford-Binet scale, was published in 1916.  Terman changed the age level of some of the tests to more appropriately match the performance of test takers since Binet’s last revision, and added more tests to the upper and lower age limits of the scale, so that it could be used to more effectively test younger and older individuals.  Most importantly, he standardized the tests to a greater degree than any previous versions, using a much larger sample and creating well-defined categories based on the numerical results of the tests.

Psychological testing also played a part in America’s military during World War I.  A group of scientists, including Terman and Goddard and led by Robert Yerkes, created two sets of tests to administer to military personnel to weed out the mentally incompetent and help select officers.  The two sets of tests, the Army Alpha and the Army Beta (designed for illiterates), were the first group tests to be used and were administered to over a million men before the war ended.  Group tests similar to the Army Alpha were given to millions of American schoolchildren over the course of the next decade.

The Stanford-Binet scale was revised in 1937, and this time it had a standardization sample of over 3000 people, the largest of any individual test before that time.  The new revision markedly improved correlation to previous tests but produced unstable scores for younger individuals and those with high IQs.  Also, IQ scores at different ages could not be compared well with each other, because each age group had a different standard deviation.  Two years later David Wechsler published the Wechsler-Bellevue Intelligence Scale (W-B), which provided several scores relating to different aspects of a person’s intelligence, instead of just one.  The most innovative of these was the performance test, which did not make use of verbal responses.  The Wechsler IQ tests have since become some of the most popular options for IQ testing, partly because there are three different tests designed for different age ranges (young children 4 to 6 ½ years, children 6 to 16 years, and adults).

 The Modern IQ Test

 The 1960 revision of the Stanford–Binet scale fixed many of the technical problems of the 1937 version, including the problems with standard deviation.  Deviation IQ was introduced, which simply took each age level individually and set the average score to 100 and made the standard deviation the same for each group.  Thus, each individual age level was taken into consideration, and the IQs of individuals of different ages could be compared.  No test sample was used for this revision, as most of the test itself was the same as the 1937 scale, with most of the changes being introduced in the method of scoring and test arrangement.  In 1972, however, a new test sample of 2100 children was given the 1960 revised test, and for the first time non-whites were included in the sample.
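The deviation IQ idea described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the raw scores for each age group are invented, the function name is hypothetical, and the standard deviation of 16 points is an assumed default (the paragraph above says only that the standard deviation was made the same for every group).

```python
from statistics import mean, pstdev


def deviation_iq(raw_score, age_group_scores, target_mean=100.0, target_sd=16.0):
    """Rescale a raw score against its own age group's norms.

    target_sd=16.0 is an assumption for illustration; other tests use
    other values.  age_group_scores is a hypothetical norm sample.
    """
    group_mean = mean(age_group_scores)
    group_sd = pstdev(age_group_scores)
    if group_sd == 0:
        raise ValueError("age group scores must vary")
    z = (raw_score - group_mean) / group_sd  # distance from the group average
    return target_mean + target_sd * z


# Invented norm samples: older children earn higher raw scores overall.
eight_year_olds = [40, 60, 40, 60]
twelve_year_olds = [70, 90, 70, 90]

# Each child is one standard deviation above his own age group's average,
# so both receive the same deviation IQ despite different raw scores.
print(deviation_iq(60, eight_year_olds))   # 116.0
print(deviation_iq(90, twelve_year_olds))  # 116.0
```

Because every age group is rescaled to the same average and the same spread, a score of 116 means the same thing at age eight as it does at age twelve.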

In 1986 the Stanford-Binet test received its most recent revision to date.  This revision included some of the most drastic changes in the history of the Stanford-Binet test.  Some of the ideas of Wechsler were incorporated, most significantly the inclusion of performance tests.  The test itself was divided into fifteen separate sections, testing different aspects of intelligence, which then were arranged into three group factors: crystallized abilities (divided into verbal reasoning and quantitative reasoning sections) that reflected learning, fluid analytic abilities (determined with abstract or visual reasoning tests) that measured potential to learn, and short-term memory.  These three group factors were then combined to measure general intelligence (g).  Each of the individual parts of the test had its own age scale, and each test was designed to be adaptive to the individual, determining the upper and lower limit of potential on each test.  This made the test very complicated to administer, but inter-test correlation was very high.  The 1986 revision of the Stanford-Binet used a test sample of over 5000 students from across the United States, and factors such as region, community size, ethnicity, age, and gender were all taken into account.  The newest test also correlates well with other modern tests.

Controversy

Intelligence tests, and indeed all kinds of psychological testing, have been criticized since their inception.  This has prompted the many revisions over the years, resulting in fairer and better tests.  Validity and reliability are two of the most important aspects of any psychological test, and the Stanford-Binet is no exception.  Critics, whether professional psychologists trained in the use of statistics or not, are constantly examining the IQ test's results and the test itself to find problems and fix them, leading to constant suggestions (or demands) for refinements.

Binet’s original test and many of its successors were criticized for being too language dependent.  They depended not only on the language and fluency of the test taker, but also to some degree on experience and background as opposed to native ability.  The early American tests also used extremely specialized information, such as famous individuals in history, scientific information, and knowledge that was common to American culture.  This made the tests extremely difficult for immigrants and ethnic minorities.  The test samples were also based on a norm of middle class white Americans, making the test difficult for the poor and for ethnic minorities.  The original Stanford-Binet test, for example, was based entirely on native white Californians, who did not even represent a standard sample of the Caucasian population of America.  Criticism about these issues led to revisions not only in the test itself, but also in the samples used.

One of the essential points of contention in the controversy over IQ testing is the definition of intelligence.  Many of the historical individuals addressed previously had their own definitions of intelligence, and no two definitions were the same.  The situation remains the same today, with modern theories of intelligence taking into account criticisms of past ones.  The definition of intelligence also varies from culture to culture, with different ethnic and social groups placing emphasis on different aspects of intelligence.  The definition is important to testing because whatever a particular test maker defines to be intelligence will determine how the test they make will be structured and exactly what it will test and how it will be used.

The theory of the g factor has also been criticized harshly.  Wechsler was in part motivated to make his intelligence scale for the purpose of subdividing intelligence, and modern psychologists have suggested various breakdowns of intelligence into component parts (the most famous being Gardner’s multiple intelligences and Sternberg’s triarchic theory of intelligence).  Some critics feel that the g factor only measures the kind of intelligence used in academic settings, not an individual’s creative, artistic, or other types of intelligence.  However, most psychologists still believe in a general intelligence, although the idea of subdivisions is also widely popular.  A person may have different levels of ability in different areas of intelligence, but people who are more intelligent tend to be more intelligent in all areas, and vice versa.  This is why the newest edition of the Stanford-Binet test uses multiple tests to measure multiple aspects of intelligence, but still has a single score reflecting general intelligence.

Intelligence tests have been criticized almost from the start for the disparity of test scores between different ethnic groups.  Goddard’s testing of immigrants at Ellis Island and the results of the Army Alpha and Beta tests indicated that southern and eastern Europeans, as well as African Americans and other minorities, were inferior or deficient.  When Carl Brigham wrote A Study of American Intelligence in 1923 in support of these findings, criticism of all forms of intelligence testing increased.  Looking back on the era, many modern critics believe that these tests and the conditions they were administered under were extremely biased and unfair, and possibly led to an increase in racist attitudes in the United States.

In 1969 Arthur Jensen published an extremely controversial article supporting inherited racial differences in intelligence.  More recently, in 1994, Herrnstein and Murray published The Bell Curve, in which they examined socio-economic factors related to IQ.  They suggested that IQ is a critical factor in determining success in different aspects of life and suggested potential social policies based on their theories.  Their book remains controversial, and not all people agree that their methods were correct or that the research they used was accurate and up to date.  While the IQ controversy about ethnicity and class continues, it has been shown that different racial groups have significantly different average IQ scores, and that IQ seems to increase with the level of education of the test taker's parents.  However, many critics feel that this only shows an unfair bias in the IQ test.

Though it may be controversial, the Stanford-Binet test in its modern form has weeded out many of the problems of the previous versions and has been shown statistically to be more than adequately reliable.  As intelligence theory has grown, so has the nature of the IQ test, and the current tests are backed by years of refinement and accumulated data.  The Stanford-Binet test is not perfect and will no doubt continue to be refined, but it is one of the most accurate psychological tests in existence today.

 

For more information online about the people who have shaped both the IQ test and intelligence testing in general, go to History of Influences in the Development of Intelligence Theory & Testing.  That is also the site where I found all of the pictures, except the book cover for The Bell Curve, which I found on Amazon.com.

 

Created by Chris Riedel, January 2002

Designed for Mrs. Hannah's AP Psychology Class, third period, at Thomas Jefferson High School for Science and Technology