This page explains the history of the IQ test, specifically the Stanford-Binet IQ test and its antecedents. The sources I used for the information presented here are listed in my bibliography.
Psychological testing also played a part in America’s military. A group of scientists, including Terman and Goddard and led by Robert Yerkes, created two tests to administer to military personnel in order to weed out the mentally incompetent and help select officers. The two tests, the Army Alpha and the Army Beta (the latter designed for illiterates), were the first group tests to be used and were administered to over a million men before World War I ended. Group tests similar to the Army Alpha were given to millions of American schoolchildren over the course of the next decade.
The Stanford-Binet scale was revised in 1937, this time with a standardization sample of over 3000 people, the largest of any individual test up to that time. The new revision markedly improved correlation with previous tests but produced unstable scores for younger individuals and those with high IQs. IQ scores at different ages also could not be compared well with each other, because each age group had a different standard deviation. Two years later David Wechsler published the Wechsler-Bellevue Intelligence Scale (W-B), which provided several scores relating to different aspects of a person’s intelligence instead of just one. The most innovative of these was the performance test, which did not make use of verbal responses. The Wechsler IQ tests have since become some of the most popular options for IQ testing, partly because there are three different tests designed for different age ranges (young children 4 to 6½ years, children 6 to 16 years, and adults).
The 1960 revision of the Stanford-Binet scale fixed many of the technical problems of the 1937 version, including the problems with standard deviation. Deviation IQ was introduced, which took each age level individually, set its average score to 100, and made the standard deviation the same for every group. Thus, each individual age level was taken into consideration, and the IQs of individuals of different ages could be compared. No test sample was used for this revision, as most of the test itself was the same as the 1937 scale, with most of the changes being introduced in the method of scoring and test arrangement. In 1972, however, a new test sample of 2100 children was given the 1960 revised test, and for the first time non-whites were included in the sample.
In 1986 the Stanford-Binet test received its last revision to date. This revision included some of the most drastic changes in the history of the Stanford-Binet test. Some of the ideas of Wechsler were incorporated, most significantly the inclusion of performance tests. The test itself was divided into fifteen separate sections, testing different aspects of intelligence, which were then arranged into three group factors: crystallized abilities (divided into verbal reasoning and quantitative reasoning sections) that reflected learning, fluid analytic abilities (determined with abstract or visual reasoning tests) that measured potential to learn, and short-term memory. These three group factors were then related to measure general intelligence (g). Each of the individual parts of the test had its own age scale, and each test was designed to be adaptive to the individual, determining the upper and lower limit of potential on each test. This made the test very complicated to administer, but inter-test correlation was very high.
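The deviation IQ introduced in the 1960 revision can be illustrated with a little arithmetic. The following is only a minimal sketch in Python, not the actual Stanford-Binet scoring procedure: the function name `deviation_iq`, the raw scores, and the standard deviation of 16 are all assumptions made up for illustration, since this page does not give the real values.

```python
from statistics import mean, pstdev


def deviation_iq(raw_score, age_group_scores, target_mean=100.0, target_sd=16.0):
    """Rescale a raw score against its own age group.

    target_sd=16.0 is an assumed, illustrative value; the page above
    only says that every age group gets the same standard deviation.
    """
    group_mean = mean(age_group_scores)
    group_sd = pstdev(age_group_scores)  # population standard deviation
    if group_sd == 0:
        raise ValueError("age group scores must not all be identical")
    # How many standard deviations this score sits above or below
    # the average of its own age group.
    z_score = (raw_score - group_mean) / group_sd
    # Every age group is mapped onto the same mean and spread.
    return target_mean + target_sd * z_score


# Hypothetical raw scores for two age groups (invented data).
six_year_olds = [10, 12, 14, 16, 18]
ten_year_olds = [30, 35, 40, 45, 50]

print(deviation_iq(14, six_year_olds))  # an average six-year-old
print(deviation_iq(18, six_year_olds))  # the top six-year-old
print(deviation_iq(50, ten_year_olds))  # the top ten-year-old
```

In this made-up data the average six-year-old comes out at 100, and the top scorer in each group receives the same IQ even though the raw scores are very different, which is the sense in which IQs at different ages become comparable.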
The 1986 revision of the Stanford-Binet used a test sample of over 5000 students from across the United States, and factors such as region, community size, ethnicity, age, and gender were all taken into account. The newest test also correlates well with other modern tests.
Intelligence tests, and indeed all kinds of psychological testing, have been criticized since their inception. This has prompted the many revisions over the years, resulting in fairer and better tests. Validity and reliability are two of the most important aspects of any psychological test, and the Stanford-Binet is no exception. Critics, whether or not they are professional psychologists trained in the use of statistics, are constantly examining the IQ test's results and the test itself to find problems, leading to constant suggestions (or demands) for refinements.
Binet’s original test, like many of its successors, was criticized for being too language dependent. It depended not only on the language and fluency of the test taker, but also somewhat on experience and background as opposed to native ability. The early American tests also used extremely specialized information, such as famous individuals in history, scientific information, and knowledge that was common to American culture. This made the tests extremely difficult for immigrants and ethnic groups. The test samples were also based on a norm of middle-class white Americans, making the test difficult for the poor and for ethnic groups. The original Stanford-Binet test, for example, was based entirely on native white Californians, which did not even represent a standard sample of the Caucasian population of America. Criticism about these issues led to revisions not only in the test itself, but also in the samples used.
One of the essential points of contention in the controversy over IQ testing is the definition of intelligence. Many of the historical individuals addressed previously had their own definitions of intelligence, and no two definitions were the same. The situation remains the same today, with modern theories of intelligence taking into account criticisms of past ones. The definition of intelligence also varies from culture to culture, with different ethnic and social groups placing emphasis on different aspects of intelligence. The definition is important to testing because whatever a particular test maker defines as intelligence determines how the test is structured, exactly what it tests, and how it is used.
The theory of the g factor has also been criticized harshly. Wechsler was in part motivated to make his intelligence scale for the purpose of subdividing intelligence, and modern psychologists have suggested various breakdowns of intelligence into component parts (the most famous being Gardner’s multiple intelligences and Sternberg’s triarchic theory of intelligence). Some critics feel that the g factor only measures the kind of intelligence used in academic settings, not an individual’s creative, artistic, or other types of intelligence. However, most psychologists still believe in a general intelligence, although the idea of subdivisions is also widely popular. A person may have different levels of ability in different areas of intelligence, but people who are more intelligent tend to be more intelligent in all areas, and vice versa. This is why the newest edition of the Stanford-Binet test uses multiple tests to measure multiple aspects of intelligence, but still has a single score reflecting general intelligence.
Intelligence tests have been criticized almost from the start for the disparity of test scores between different ethnic groups. Goddard’s testing of immigrants at Ellis Island and the results of the Army Alpha and Beta tests were taken to indicate that southern and eastern Europeans, as well as African Americans and other minorities, were inferior or deficient. When Carl Brigham wrote A Study of American Intelligence in 1923 in support of these findings, criticism of all forms of intelligence testing increased. Looking back on the era, many modern critics believe that these tests and the conditions they were administered under were extremely biased and unfair, and possibly led to an increase in racist attitudes in the United States.
Though it may be controversial, the Stanford-Binet test in its modern form has weeded out many of the problems of the previous versions and has been shown statistically to be more than adequately reliable. As intelligence theory has grown, so has the nature of the IQ test, and the current tests are backed by years of refinement and accumulated data. The Stanford-Binet test is not perfect and will no doubt continue to be refined, but it is one of the most accurate psychological tests in existence today.
For more information online about the people who have shaped both the IQ test and intelligence testing in general, go to History of Influences in the Development of Intelligence Theory & Testing. That is also the site where I found all of the pictures, except the book cover for The Bell Curve, which I found on Amazon.com.
Created by Chris Riedel, January 2002. Designed for Mrs. Hannah's AP Psychology Class, third period, at Thomas Jefferson High School for Science and Technology.