“INTELLIGENCE. See MENTAL TESTS.” So learns the historian who consults the authoritative fifteen-volume Encyclopaedia of the Social Sciences (1930–35) to discover how American social scientists had come to think about intelligence in the early decades of the twentieth century. The new prevalence of statistics and the appetite for statistical communities were nowhere more vividly expressed than in the twentieth-century American elaboration of techniques for “measuring” intelligence.
As early as the 1870’s, the English scientist Francis Galton, preoccupied with human differences, had urged the new study of “anthropometry” and the making of anthropometric records. In 1884 he opened an anthropometric laboratory in South Kensington, London, where, for a fee, anyone could have certain physical measurements made, including tests of keenness of vision and hearing, and reaction time. The term “mental tests” was probably first used in 1890 by the American psychologist James McKeen Cattell to describe tests on one hundred University of Pennsylvania freshmen. These were tests of keenness of eyesight and of hearing, reaction time, after-images, color vision, perception of pitch and of weights, sensitivity to pain, perception of time, accuracy of movement, rate of perception and of movement, memory, and imagery.
Then, in 1891, Franz Boas, a young anthropologist recently arrived from Germany, under prodding from G. Stanley Hall, the psychologist-president of Clark University who was anxious to establish psychology as a useful science, took measurements of fifteen hundred Worcester schoolchildren, including tests of their memory. In the crowds at the World’s Columbian Exposition in Chicago in 1893 a University of Wisconsin psychologist, Joseph Jastrow, found large numbers of subjects willing, out of curiosity, to submit themselves to psychological tests. By 1895 the American Psychological Association had set up a committee to promote cooperation of psychological laboratories throughout the country “in the collection of mental and physical statistics.”
THE MENTAL TEST in the United States was a by-product, too, of two twentieth-century American institutions: mass education and the mass army. Both these expressions of a democratized society encouraged quantitative ways of thinking. The expansion of American public education and the multiplication of public high schools had created a vast new body of young Americans hoping to enter college. While entrance to high school was automatic, or even compulsory, entrance to nearly all colleges remained selective. While the enrollment in American public high schools increased more than tenfold between 1890 and 1922 (from some 200,000 to over 2 million), enrollment in American colleges and universities increased only about fourfold (from 150,000 to about 600,000). This created a demand for new techniques of selection at the very time when a science of statistics had come on the scene.
As early as 1877, President Charles W. Eliot of Harvard, who had advocated the free public high school as a way to democratize education, foresaw the need for a new system of examinations. Only by some common system of examination could the colleges select those who would profit most from higher education. In the United States, where the subjects and standards of locally controlled public education varied beyond the wildest nightmares of European educators, the problem was quite unlike that in England or France where applicants to college came from a standardized curriculum. Since the federal government itself had no responsibility for education, only standardized tests by a private central organization could do the job.
The scheme proposed in 1890 by Nicholas Murray Butler, Professor of Philosophy and Education at Columbia University, aimed incidentally to help shape school programs to college needs. “At that time,” Butler recalled, “the public secondary school had just entered upon its period of popularity and rapid growth. It was multiplying in cities and towns throughout the United States and quite naturally found itself everywhere in contact with the problem of admission to college. The colleges throughout the United States were going their several ways with sublime unconcern either for the policies of other colleges, for the needs of the secondary schools, or for the general public interest.” The recently formed College Entrance Examination Board in June 1901 gave its first examinations at sixty-seven places in the United States and two places in Europe, to 973 candidates representing 237 schools. The curricula of the new high schools had been inflated by courses for academic credit in art, bookkeeping, band, stenography, journalism, printing, Red Cross, home economics, and scores of crafts, hobbies, and sports. But the Board tested candidates in traditional academic subjects, such as English and chemistry, and graded strictly on the academic scale of 100, requiring a score of 60 to pass.
IN 1905 A FRENCH PSYCHOLOGIST, Alfred Binet, whom the Minister of Public Education in Paris had appointed to a commission on the education of retarded children, collaborated with another French psychologist, Théodore Simon, in devising a metric intelligence scale. With his novel assumption that, up to maturity, intelligence increased with age, Binet tested numerous children to establish an average level of test performance for each age. Then, by comparing any child’s own test performance with the average, Binet could distinguish a child’s “mental age” from his chronological age. Tests were standardized by assigning to each year-level the tests passed by 75 percent of the children of that age, and these tests were used for measuring variations in the intelligence of “normal” children. By 1908 the Binet-Simon tests had been translated into English and imported to the United States. Then, in 1912 a German psychologist, William Stern, proposed dividing a subject’s mental age by his actual chronological age to produce a useful “Intelligence Quotient.” Lewis Terman and others at Stanford University revised and adapted the Binet-Simon tests into the standard American “mental tests.” And the “I.Q.” became a popular handle on the whole subject.
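The arithmetic behind the “I.Q.” can be sketched in a few lines — a minimal illustration only, assuming Stern’s ratio together with the later convention (introduced in Terman’s Stanford revision) of multiplying by 100; the ages below are invented for the example.

```python
def intelligence_quotient(mental_age: float, chronological_age: float) -> int:
    """Stern's ratio of mental age to chronological age,
    scaled by 100 as in Terman's Stanford revision."""
    return round(100 * mental_age / chronological_age)

# A ten-year-old performing at the average level of twelve-year-olds:
print(intelligence_quotient(12, 10))  # 120
```

On this scheme a child whose test performance exactly matched the average for his own age scored 100, and Goddard’s “mental age of twelve” in an adult translated into the low quotients that so alarmed readers of the Army test results.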
At American entrance into World War I in 1917, the urgent need to classify recruits led to new techniques of testing. Two Army tests, both suitable for administering to large groups, were designed by a committee of the American Psychological Association to weed out the mentally incompetent at one end of the scale and to identify the brighter “officer material” at the other. The Army Alpha Test was for men literate in English, while the Army Beta Test (with directions given in pantomime) was for illiterates and for foreigners who did not understand English. By the end of January 1919, the tests had been taken by 1,726,000 men. Although the results were not officially published until 1921, the information that leaked out during 1919 caused alarm: 47.3 percent of the white draftees and 89 percent of the Negro draftees had a “mental age” of twelve years or under. By textbook definition an adult with a “mental age” of under twelve was “feeble-minded.” Was that half the American population?
The nation had only recently been aroused by respected psychologists to “the Menace of the Feeble-minded.” In fact, the most vocal propagandist against that menace, Henry H. Goddard, was himself on the committee that had designed the Army Tests. It was Goddard who had translated the Binet-Simon tests; and in 1911, after testing two thousand normal children, the whole school population of a New Jersey town, he had revised the Binet-Simon scale for American use. Goddard saw the mental tests as the gateway to utopia, opening new worlds of eugenics and social reform. Attacks on his tests, he observed, “only arouse a smile and a feeling akin to that which the physician would have for one who might launch a tirade against the value of the clinical thermometer.” As psychologist at the Vineland Training School, a New Jersey institution for the feeble-minded, he tested fifty-six “wayward girls,” aged fourteen to twenty, who were out on parole. And he found all but four to be “feeble-minded.” Then in a Newark juvenile detention home Goddard tested one hundred inmates chosen at random and found 66 percent to be “feeble-minded.” He jumped to the conclusion that feeble-mindedness was the main cause of crime.
Goddard then popularized his dogma by a sensational book, The Kallikak Family (1912), which described the far-reaching consequences of the casual union of a Revolutionary militiaman with a “feeble-minded” tavern girl. From their illegitimate son (popularly known as “Old Horror”), Goddard traced a line of degenerates, prostitutes, epileptics, alcoholics, and assorted criminals. While the public was fascinated, Goddard protested that he had only offered a “scientific” statistical summary of the clinical record. The hereditary connection of “feeble-mindedness” and crime, Goddard himself explained, had long since been suggested but had, until his own work, lacked proper scientific support. In 1877 an amateur criminologist, Richard L. Dugdale, had also told a lurid tale of hereditary delinquency. In The Jukes, A Study in Crime, Pauperism, Disease and Heredity, Dugdale claimed that a single tainted family had cost the state at least $1,308,000. In 1911 Dugdale’s manuscript was accidentally found, giving the real names of the pseudonymous “Jukes.” A researcher spent four years tracing their later careers and those of their descendants: The Jukes in 1915 concluded that half the Jukes “were and are” feeble-minded, that the family’s penchant for crime continued, and that all the Juke criminals were feeble-minded.
To such “scientific” documents as these, Goddard added still more statistics of his own showing the results of mental tests given to juvenile delinquents and adult criminals. So Goddard sparked a nationwide campaign to combat crime by controlling the feeble-minded. He received support from the Immigration Restriction League and other nativist groups, who were using the crude statistics found in the 45-volume congressional hearings on immigration to favor laws to exclude all hereditary “undesirables.”
Incidentally, about 1910 Goddard had invented and added the word “moron” (from Greek moros, stupid) to the language. As a result of his tests he recommended to the American Association for the Study of the Feebleminded a classification of adult feebleminded into “idiots” (through two years “mental age”), “imbeciles” (through seven years “mental age”), and “morons” (through twelve years “mental age”). While others debated the sharpness of Goddard’s definition, Goddard himself remained confident that his mental tests were measuring whatever needed to be measured, and that if the legislatures would only act, they could accomplish wonders of social antisepsis.
But what did it really mean, to say (as the Army tests had suggested) that half the American population was “feeble-minded”? Did this perhaps say less about the American people than about the usefulness of “mental tests”? Even before the Army tests, cautious scholars had been troubled by doubts. They suspected that the Crusade against the Feeble-minded, by concentrating on criminals and prostitutes, may have overlooked the large number of persons of similarly limited intelligence who had become respectable citizens. After the Army tests, as historian Mark Haller observes, a number of the crusaders recanted. “We have really slandered the feeble-minded,” a doctor at the Massachusetts reformatory observed in 1918. “Some of the sweetest and most beautiful characters I have ever known have been feeble-minded people.” Eventually Goddard himself wondered whether he might not have exaggerated the menace of the low I.Q.
NEVERTHELESS, AFTER WORLD WAR I the nation showed renewed enthusiasm for intelligence tests. Surviving the false alarm of the Menace of the Feeble-minded and the popular prejudice aroused by the Army tests, the new brands of intelligence tests prospered as never before. And now they reached out to classify the whole community.
After World War I the College Entrance Examination Board, undaunted by the popular ridicule of mental tests, took a new tack. Noting the narrow range of attributes which the Board had tested in the past, in 1924 the Secretary of the Board pointed the future direction:
Among the qualities in regard to which many colleges desire information and for which direct tests exist, although the Board has not yet undertaken to administer such tests, are the following–
(1) Ethical behavior
(2) Physical health
(3) Powers of observation
(4) Mental alertness
(5) Ability to participate successfully in cooperative efforts or team work
(6) Skill in laboratory work
(7) Facility in conversation in foreign languages
In 1926 the Board administered its first “Scholastic Aptitude Tests,” aimed at a test of “intelligence” that would be less tied to scholastic subject matters. During the next fifteen years the Board elaborated its system of examination in an effort to provide a more reliable instrument of prediction, on which applicants as well as colleges could rely. Then, in 1935, the Board revised its whole system of scaling and ceased grading on the traditional academic scale of 100. The new scale fixed the grade of the average college applicant at 500 and then distributed scores so that they ranged from about 200 to 800, in relation to the percentage of the random college-admission population above and below the average. Incidentally, these new tests had the advantage that instead of requiring costly trained readers, they could be scored mechanically.
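The 1935 rescaling can be illustrated with a brief sketch — a hypothetical reconstruction, not the Board’s actual procedure, assuming that raw scores were standardized to a mean of 500 with a spread of 100 points per standard deviation and clamped to the 200–800 range; the raw scores below are invented.

```python
from statistics import mean, stdev

def scale_scores(raw_scores):
    """Map raw scores onto a 200-800 scale centered at 500.

    Hypothetical reconstruction: standardize each score to a
    mean of 500 and a standard deviation of 100, then clamp
    the result to the 200-800 range.
    """
    mu, sigma = mean(raw_scores), stdev(raw_scores)
    scaled = []
    for raw in raw_scores:
        score = 500 + 100 * (raw - mu) / sigma
        scaled.append(int(min(800, max(200, round(score)))))
    return scaled

raw = [42, 55, 61, 70, 88]  # invented raw marks
print(scale_scores(raw))
```

Whatever the Board’s exact formula, the essential point survives in the sketch: a candidate’s reported score located him relative to the whole population of college applicants, not against a fixed passing mark of 60 on a scale of 100.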
By the outbreak of World War II the Board had developed techniques not merely for testing college-entrance subject matter, but for testing all sorts of aptitudes. On April 2, 1943, the Board administered its V-12 and A-12 tests for the armed forces to 316,000 young men at 13,000 testing centers. Experts agreed that this test of a homogeneous group was the largest and most important single exercise in the history of testing. When these tests proved successful, the Board went on to other tasks. Under a contract with the Bureau of Naval Personnel, the Board tested for 100 service jobs, which required the printing of 133 tests, answer sheets, and bulletins, to a total of 36 million pages of test materials. Then, at the end of the war, the Board administered nationwide tests for veterans returning to civilian life. Hundreds of thousands were seeking college admission under the G.I. Bill, or wanted their chance in programs sponsored by firms like Westinghouse or Pepsi-Cola, in the Foreign Service, at Annapolis or in the Coast Guard Academy.
Testing had taken on a new role that was nationwide and touched every aspect of American life. In 1947 the College Entrance Examination Board and other testing groups became part of a new Educational Testing Service. The ETS, pledged to develop “new services and new tests in areas where they are badly needed,” applied its testing know-how to classifying the nation’s personnel. “The vanishing continuity between school and college programs,” the Secretary of the Board observed in 1950, made it more necessary than ever to rely on aptitude tests, which now by themselves seemed able to predict college success. Popular pressures to equalize educational opportunity posed a new peacetime problem of numbers. In 1951 about 525,000 of each college-age group, the Board’s Secretary observed, “have an I.Q. of 110 or better and are, therefore, to be adjudged capable of doing adequate or superior college work.” But of these only 210,000 actually reached college, the remaining 315,000 failing to get there apparently “because of lack of money or lack of motivation.” While tests could presumably locate the qualified Americans who were not being educated up to their intellectual capacities, only a new nationwide system for financing higher education could get them to college. The twin objective was to avoid “wasted money” (from admission of students with lower than 110 I.Q.) and “wasted talent” (from failure to admit those with an I.Q. of 115 or better).
AFTER 1958, IN WHAT intelligence testers came to call the Sputnik Era, there was a new fear of academic waste, a frenetic quest for ways of finding talent and a short-lived new enthusiasm for academic excellence. Could Americans come up to the Russian standard? There were proposals for “inexpensive preliminary tests” on the Scholastic Aptitude pattern, proposals to create “a group of colleges which will be geared to student groups with ability levels from I.Q. 100 to I.Q. 115,” and a new emphasis on “college guidance” (which by 1960 had become a distinct profession with its own association and its own journal). In 1960, after years of controversy, it was decided to release the test scores to the candidates. This was, of course, a significant new step in reinforcing public awareness of statistical communities of the mind. The president of the College Entrance Examination Board, attacking the “taboo of silence,” boasted now that the publicity for test scores was “a triumph of morality.” He reported to an audience at Columbia Teachers College:
There was great fear that students would have their values warped by learning their own scores, but I have learned from hearing my own children’s conversation that SAT scores have now become one of the peer group measuring devices. One unfortunate may be dismissed with the phrase, “That jerk–he only made 420!” The bright, steady student is appreciated with his high 600, and the unsuspected genius with his 700 is held in awe. This is not exactly the use of College Board scores we had in mind when we decided to authorize their distribution, but it’s possible to think of many worse, so perhaps we had better not complain.
As the century wore on, objective tests, and the “quotients” they provided, became raw materials for the newly specialized profession of vocational counseling and guidance, and for personnel management. And they were used increasingly in the effort to make “merit” the criterion for jobs in the Civil Service. “Intelligence tests, and the related aptitude tests,” an expert observed in 1971, “have more and more become society’s instrument for the selection of human resources. Not only for the military, but for schools from secondary to professional, for industry, and for civil service, objective tests have cut away the traditional grounds for selection–family, social class, and, most important, money.” An increasing tendency to assimilate the nation’s population (which now had been quantitatively assessed) to the nation’s other matériel was expressed in a new tendency to describe people not as people but as “human resources.” Textbooks on personnel management began to call themselves guides to “human resources administration.”
In the late 1960’s and ‘70’s, as equalizing movements gained momentum, the quantitative approach to intelligence was attacked from another, unexpected quarter. Jefferson and his disciples had enshrined the ideal of a “natural aristocracy” in their early democratic credo. The greatness of the new nation, they said, would come from the new freedom for every man to develop his native abilities. Here, for the first time, society would have access to undeveloped “human resources” which Old World aristocracy, with its hereditary distinctions, had wasted. The new egalitarianism attacked “intelligence tests” not so much because they were culture-loaded or failed to rank people accurately according to their intelligence, as because of what they claimed to measure. The new thrust of the attack was on intelligence itself. In 1972 “The Case Against I.Q. Tests” was seriously argued by opponents of tests precisely because mental tests did measure “intelligence.” They feared that any society that preferred its more intelligent citizens would lack the utopian egalitarian virtues. “Contemporary American society uses intelligence as one of the bases for ranking its members,” a querulous critic objected. “We celebrate intelligence the way the Islamic Moroccans celebrate the warrior-saint.” The United States Supreme Court enjoined a business firm from giving intelligence tests to its potential employees because they might be used to discriminate against Negroes. And in the temporarily fashionable enthusiasm for absolute equality, the science of mental testing became one of the most controversial of the social sciences.