Language and Archaeology


The Promise and Politics of the Mother Tongue


When you look in the mirror you see not just your face but a museum. Although your face, in one sense, is your own, it is composed of a collage of features you have inherited from your parents, grandparents, great-grandparents, and so on. The lips and eyes that either bother or please you are not yours alone but are also features of your ancestors, long dead perhaps as individuals but still very much alive as fragments in you. Even complex qualities such as your sense of balance, musical abilities, shyness in crowds, or susceptibility to sickness have been lived before. We carry the past around with us all the time, and not just in our bodies. It lives also in our customs, including the way we speak. The past is a set of invisible lenses we wear constantly, and through these we perceive the world and the world perceives us. We stand always on the shoulders of our ancestors, whether or not we look down to acknowledge them.

It is disconcerting to realize how few of our ancestors most of us can recognize or even name. You have four great-grandmothers, women sufficiently close to you genetically that you see elements of their faces, and skin, and hair each time you see your reflection. Each had a maiden name she heard spoken thousands of times, and yet you probably cannot recall any one of their maiden names. If we are lucky, we may find their birth names in genealogies or documents, although war, migration, and destroyed records have made that impossible for many Americans. Our four great-grandmothers had full lives and families, and bequeathed to us many of our most personal qualities, but we have lost these ancestors so completely that we cannot even name them. How many of us can imagine being so utterly forgotten just three generations from now by our own descendants that they remember nothing of us, not even our names?

In traditional societies, where life is still structured around family, extended kin, and the village, people often are more conscious of the debts they owe their ancestors, even of the power of their ghosts and spirits. Zafimaniry women in rural Madagascar weave complicated patterns on their hats, which they learned from their mothers and aunts. The patterns differ significantly between villages. The women in one village told the anthropologist Maurice Bloch that the designs were “pearls from the ancestors.” Even ordinary Zafimaniry houses are seen as temples to the spirits of the people who made them.1 This constant acknowledgment of the power of those who lived before is not part of the thinking of most modern, consumer cultures. We live in a world that depends for its economic survival on the constant adoption and consumption of new things. Archaeology, history, genealogy, and prayer are the overflowing drawers into which we throw our thoughts of earlier generations.

Archaeology is one way to acknowledge the humanity and importance of the people who lived before us and, obliquely, of ourselves. It is the only discipline that investigates the daily texture of past lives not described in writing, indeed the great majority of the lives humans have lived. Archaeologists have wrested surprisingly intimate details out of the silent remains of the preliterate past, but there are limits to what we can know about people who have left no written accounts of their opinions, their conversations, or their names.

Is there a way to overcome those limits and recover the values and beliefs that were central to how prehistoric people really lived their lives? Did they leave clues in some other medium? Many linguists believe they did, and that the medium is the very language we use every day. Our language contains a great many fossils that are the remnants of surprisingly ancient speakers. Our teachers tell us that these linguistic fossils are “irregular” forms, and we just learn them without thinking. We all know that a past tense is usually constructed by adding -t or -ed to the verb (kick-kicked, miss-missed) and that some verbs require a change in the vowel in the middle of the stem (run-ran, sing-sang). We are generally not told, however, that this vowel change was the older, original way of making a past tense. In fact, changing a vowel in the verb stem was the usual way to form a past tense probably about five thousand years ago. Still, this does not tell us much about what people were thinking then.

Are the words we use today actually fossils of people’s vocabulary of about five thousand years ago? A vocabulary list would shine a bright light on many obscure parts of the past. As the linguist Edward Sapir observed, “The complete vocabulary of a language may indeed be looked upon as a complex inventory of all the ideas, interests, and occupations that take up the attention of the community.”2 In fact, a substantial vocabulary list has been reconstructed for one of the languages spoken about five thousand years ago. That language is the ancestor of modern English as well as many other modern and ancient languages. All the languages that are descended from this same mother tongue belong to one family, that of the Indo-European languages. Today Indo-European languages are spoken by about three billion people—more than speak the languages of any other language family. The vocabulary of the mother tongue, called “Proto-Indo-European”, has been studied for about two hundred years, and in those two centuries fierce disagreements have continued about almost every aspect of Indo-European studies.

But disagreement produces light as well as heat. This book argues that it is now possible to solve the central puzzle surrounding Proto-Indo-European, namely, who spoke it, where was it spoken, and when. Generations of archaeologists and linguists have argued bitterly about the “homeland” question. Many doubt the wisdom of even pursuing it. In the past, nationalists and dictators have insisted that the homeland was in their country and belonged to their own superior “race.” But today Indo-European linguists are improving their methods and making new discoveries. They have reconstructed the basic forms and meanings of thousands of words from the Proto-Indo-European vocabulary—itself an astonishing feat. Those words can be analyzed to describe the thoughts, values, concerns, family relations, and religious beliefs of the people who spoke them. But first we have to figure out where and when they lived. If we can combine the Proto-Indo-European vocabulary with a specific set of archaeological remains, it might be possible to move beyond the usual limitations of archaeological knowledge and achieve a much richer knowledge of these particular ancestors.

I believe with many others that the Proto-Indo-European homeland was located in the steppes north of the Black and Caspian Seas in what is today southern Ukraine and Russia. The case for a steppe homeland is stronger today than in the past partly because of dramatic new archaeological discoveries in the steppes. To understand the significance of an Indo-European homeland in the steppes requires a leap into the complicated and fascinating world of steppe archaeology. Steppe means “wasteland” in the language of the Russian agricultural state. The steppes resembled the prairies of North America—a monotonous sea of grass framed under a huge, dramatic sky. A continuous belt of steppes extends from eastern Europe on the west (the belt ends between Odessa and Bucharest) to the Great Wall of China on the east, an arid corridor running seven thousand kilometers across the center of the Eurasian continent. This enormous grassland was an effective barrier to the transmission of ideas and technologies for thousands of years. Like the North American prairie, it was an unfriendly environment for people traveling on foot. And just as in North America, the key that opened the grasslands was the horse, combined in the Eurasian steppes with domesticated grazing animals—sheep and cattle—to process the grass and turn it into useful products for humans. Eventually people who rode horses and herded cattle and sheep acquired the wheel, and were then able to follow their herds almost anywhere, using heavy wagons to carry their tents and supplies. The isolated prehistoric societies of China and Europe became dimly aware of the possibility of one another’s existence only after the horse was domesticated and the covered wagon invented. Together, these two innovations in transportation made life predictable and productive for the people of the Eurasian steppes. 
The opening of the steppe—its transformation from a hostile ecological barrier to a corridor of transcontinental communication—forever changed the dynamics of Eurasian historical development, and, this author contends, played an important role in the first expansion of the Indo-European languages.


The Indo-European problem was formulated in one famous sentence by Sir William Jones, a British judge in India, in 1786. Jones was already widely known before he made his discovery. Fifteen years earlier, in 1771, his Grammar of the Persian Language was the first English guide to the language of the Persian kings, and it earned him, at the age of twenty-five, the reputation as one of the most respected linguists in Europe. His translations of medieval Persian poems inspired Byron, Shelley, and the European Romantic movement. He rose from a respected barrister in Wales to a correspondent, tutor, and friend of some of the leading men of the kingdom. At age thirty-seven he was appointed one of the three justices of the first Supreme Court of Bengal. His arrival in Calcutta, a mythically alien place for an Englishman of his age, was the opening move in the imposition of royal government over a vital yet irresponsible merchant’s colony. Jones was to regulate both the excesses of the English merchants and the rights and duties of the Indians. But although the English merchants at least recognized his legal authority, the Indians obeyed an already functioning and ancient system of Hindu law, which was regularly cited in court by Hindu legal scholars, or pandits (the source of our term pundit). English judges could not determine if the laws the pandits cited really existed. Sanskrit was the ancient language of the Hindu legal texts, as Latin was for English law. If the two legal systems were to be integrated, one of the new Supreme Court justices had to learn Sanskrit. That was Jones.

He went to the ancient Hindu university at Nadiya, bought a vacation cottage, found a respected and willing pandit (Rāmalocana) on the faculty, and immersed himself in Hindu texts. Among these were the Vedas, the ancient religious compositions that lay at the root of Hindu religion. The Rig Veda, the oldest of the Vedic texts, had been composed long before the Buddha’s lifetime and was more than two thousand years old, but no one knew its age exactly. As Jones pored over Sanskrit texts his mind made comparisons not just with Persian and English but also with Latin and Greek, the mainstays of an eighteenth-century university education; with Gothic, the oldest literary form of German, which he had also learned; and with Welsh, a Celtic tongue and his boyhood language which he had not forgotten. In 1786, three years after his arrival in Calcutta, Jones came to a startling conclusion, announced in his third annual discourse to the Asiatic Society of Bengal, which he had founded when he first arrived. The key sentence is now quoted in every introductory textbook of historical linguistics (punctuation mine):

The Sanskrit language, whatever be its antiquity, is of a wonderful structure: more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either; yet bearing to both of them a stronger affinity, both in the roots of verbs and in the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists.

Jones had concluded that the Sanskrit language originated from the same source as Greek and Latin, the classical languages of European civilization. He added that Persian, Celtic, and German probably belonged to the same family. European scholars were astounded. The occupants of India, long regarded as the epitome of Asian exotics, turned out to be long-lost cousins. If Greek, Latin, and Sanskrit were relatives, descended from the same ancient parent language, what was that language? Where had it been spoken? And by whom? By what historical circumstances did it generate daughter tongues that became the dominant languages spoken from Scotland to India?

These questions resonated particularly deeply in Germany, where popular interest in the history of the German language and the roots of German traditions was growing into the Romantic movement. The Romantics wanted to discard the cold, artificial logic of the Enlightenment to return to the roots of a simple and authentic life based in direct experience and community. Thomas Mann once said of a Romantic philosopher (Schlegel) that his thought was contaminated too much by reason, and that he was therefore a poor Romantic. It was ironic that William Jones helped to inspire this movement, because his own philosophy was quite different: “The race of man… cannot long be happy without virtue, nor actively virtuous without freedom, nor securely free without rational knowledge.”3 But Jones had energized the study of ancient languages, and ancient language played a central role in Romantic theories of authentic experience. In the 1780s J. G. Herder proposed a theory later developed by von Humboldt and elaborated in the twentieth century by Wittgenstein, that language creates the categories and distinctions through which humans give meaning to the world. Each particular language, therefore, generates and is enmeshed in a closed social community, or “folk,” that is at its core meaningless to an outsider. Language was seen by Herder and von Humboldt as a vessel that molded community and national identities. The brothers Grimm went out to collect “authentic” German folk tales while at the same time studying the German language, pursuing the Romantic conviction that language and folk culture were deeply related. In this setting the mysterious mother tongue, Proto-Indo-European, was regarded not just as a language but as a crucible in which Western civilization had its earliest beginnings.

After the 1859 publication of Charles Darwin’s The Origin of Species, the Romantic conviction that language was a defining factor in national identity was combined with new ideas about evolution and biology. Natural selection provided a scientific theory that was hijacked by nationalists and used to rationalize why some races or “folks” ruled others—some were more “fit” than others. Darwin himself never applied his theories of fitness and natural selection to such vague entities as races or languages, but this did not prevent unscientific opportunists from suggesting that the less “fit” races could be seen as a source of genetic weakness, a reservoir of barbarism that might contaminate and dilute the superior qualities of the races that were more “fit.” This toxic mixture of pseudo-science and Romanticism soon produced its own new ideologies. Language, culture, and a Darwinian interpretation of race were bundled together to explain the superior biological–spiritual–linguistic essence of the northern Europeans who conducted these self-congratulatory studies. Their writings and lectures encouraged people to think of themselves as members of long-established, biological–linguistic nations, and thus were promoted widely in the new national school systems and national newspapers of the emerging nation-states of Europe. The policies that forced the Welsh (including Sir William Jones) to speak English, and the Bretons to speak French, were rooted in politicians’ need for an ancient and “pure” national heritage for each new state. The ancient speakers of Proto-Indo-European soon were molded into the distant progenitors of such racial–linguistic–national stereotypes.4

Proto-Indo-European, the linguistic problem, became “the Proto-Indo-Europeans,” a biological population with its own mentality and personality: “a slim, tall, light-complexioned, blonde race, superior to all other peoples, calm and firm in character, constantly striving, intellectually brilliant, with an almost ideal attitude towards the world and life in general”.5 The name Aryan began to be applied to them, because the authors of the oldest religious texts in Sanskrit and Persian, the Rig Veda and Avesta, called themselves Aryans. These Aryans lived in Iran and eastward into Afghanistan–Pakistan–India. The term Aryan should be confined only to this Indo-Iranian branch of the Indo-European family. But the Vedas were a newly discovered source of mystical fascination in the nineteenth century, and in Victorian parlors the name Aryan soon spread beyond its proper linguistic and geographic confines. Madison Grant’s The Passing of the Great Race (1916), a best-seller in the U.S., was a virulent warning against the thinning of superior American “Aryan” blood (by which he meant the British–Scots–Irish–German settlers of the original thirteen colonies) through interbreeding with immigrant “inferior races,” which for him included Poles, Czechs, and Italians as well as Jews—all of whom spoke Indo-European languages (Yiddish is a Germanic language in its basic grammar and morphology).6

The gap through which the word Aryan escaped from Iran and the Indian subcontinent was provided by the Rig Veda itself: some scholars found passages in the Rig Veda that seemed to describe the Vedic Aryans as invaders who had conquered their way into the Punjab.7 But from where? A feverish search for the “Aryan homeland” began. Sir William Jones placed it in Iran. The Himalayan Mountains were a popular choice in the early nineteenth century, but other locations soon became the subject of animated debates. Amateurs and experts alike joined the search, many hoping to prove that their own nation had given birth to the Aryans. In the second decade of the twentieth century the German scholar Gustav Kossinna attempted to demonstrate on archaeological grounds that the Aryan homeland lay in northern Europe—in fact, in Germany. Kossinna illustrated the prehistoric migrations of the “Indo-Germanic” Aryans with neat black arrows that swept east, west, and south from his presumed Aryan homeland. Armies followed the pen of the prehistorian less than thirty years later.8

The problem of Indo-European origins was politicized almost from the beginning. It became enmeshed in nationalist and chauvinist causes, nurtured the murderous fantasy of Aryan racial superiority, and was actually pursued in archaeological excavations funded by the Nazi SS. Today the Indo-European past continues to be manipulated by causes and cults. In the books of the Goddess movement (Marija Gimbutas’s Civilization of the Goddess, Riane Eisler’s The Chalice and the Blade) the ancient “Indo-Europeans” are cast in archaeological dramas not as blonde heroes but as patriarchal, warlike invaders who destroyed a utopian prehistoric world of feminine peace and beauty. In Russia some modern nationalist political groups and neo-Pagan movements claim a direct linkage between themselves, as Slavs, and the ancient “Aryans.” In the United States white supremacist groups refer to themselves as Aryans. There actually were Aryans in history—the composers of the Rig Veda and the Avesta—but they were Bronze Age tribal people who lived in Iran, Afghanistan, and the northern Indian subcontinent. It is highly doubtful that they were blonde or blue-eyed, and they had no connection with the competing racial fantasies of modern bigots.9

The mistakes that led an obscure linguistic mystery to erupt into racial genocide were distressingly simple and therefore can be avoided by anyone who cares to avoid them. They were the equation of race with language, and the assignment of superiority to some language-and-race groups. Prominent linguists have always pleaded against both these ideas. While Martin Heidegger argued that some languages—German and Greek—were unique vessels for a superior kind of thought, the linguistic anthropologist Franz Boas protested that no language could be said to be superior to any other on the basis of objective criteria. As early as 1872 the great linguist Max Müller observed that the notion of an Aryan skull was not just unscientific but anti-scientific; languages are not white-skinned or long-headed. But then how can the Sanskrit language be connected with a skull type? And how did the Aryans themselves define “Aryan”? According to their own texts, they conceived of “Aryan-ness” as a religious–linguistic category. Some Sanskrit-speaking chiefs, and even poets in the Rig Veda, had names such as Balbūtha and Bṛbu that were foreign to the Sanskrit language. These people were of non-Aryan origin and yet were leaders among the Aryans. So even the Aryans of the Rig Veda were not genetically “pure”—whatever that means. The Rig Veda was a ritual canon, not a racial manifesto. If you sacrificed in the right way to the right gods, which required performing the great traditional prayers in the traditional language, you were an Aryan; otherwise you were not. The Rig Veda made the ritual and linguistic barrier clear, but it did not require or even contemplate racial purity.10

Any attempt to solve the Indo-European problem has to begin with the realization that the term Proto-Indo-European refers to a language community, and then work outward. Race really cannot be linked in any predictable way with language, so we cannot work from language to race or from race to language. Race is poorly defined; the boundaries between races are defined differently by different groups of people, and, since these definitions are cultural, scientists cannot describe a “true” boundary between any two races. Also, archaeologists have their own, quite different definitions of race, based on traits of the skull and teeth that often are invisible in a living person. However race is defined, languages are not normally sorted by race—all racial groups speak a variety of different languages. So skull shapes are almost irrelevant to linguistic problems. Languages and genes are correlated only in exceptional circumstances, usually at clear geographic barriers such as significant mountain ranges or seas—and often not even there.11 A migrating population did not have to be genetically homogeneous even if it did recruit almost exclusively from a single dialect group. Anyone who assumes a simple connection between language and genes, without citing geographic isolation or other special circumstances, is wrong at the outset.


The only aspect of the Indo-European problem that has been answered to most people’s satisfaction is how to define the language family, how to determine which languages belong to the Indo-European family and which do not. The discipline of linguistics was created in the nineteenth century by people trying to solve this problem. Their principal interests were comparative grammar, sound systems, and syntax, which provided the basis for classifying languages, grouping them into types, and otherwise defining the relationships between the tongues of humanity. No one had done this before. They divided the Indo-European language family into twelve major branches, distinguished by innovations in phonology or pronunciation and in morphology or word form that appeared at the root of each branch and were maintained in all the languages of that branch (figure 1.1). The twelve branches of Indo-European included most of the languages of Europe (but not Basque, Finnish, Estonian, or Magyar); the Persian language of Iran; Sanskrit and its many modern daughters (most important, Hindi and Urdu); and a number of extinct languages including Hittite in Anatolia (modern Turkey) and Tocharian in the deserts of Xinjiang (northwestern China) (figure 1.2). Modern English, like Yiddish and Swedish, is assigned to the Germanic branch. The analytic methods invented by nineteenth-century philologists are today used to describe, classify, and explain language variation worldwide.

Figure 1.1 The twelve branches of the Indo-European language family. Baltic and Slavic are sometimes combined into one branch, like Indo-Iranian, and Phrygian is sometimes set aside because we know so little about it, like Illyrian and Thracian. With those two changes the number of branches would be ten, an acceptable alternative. A tree diagram is meant to be a sketch of broad relationships; it does not represent a complete history.

Historical linguistics gave us not just static classifications but also the ability to reconstruct at least parts of extinct languages for which no written evidence survives. The methods that made this possible rely on regularities in the way sounds change inside the human mouth. If you collect Indo-European words for hundred from different branches of the language family and compare them, you can apply the myriad rules of sound change to see if all of them can be derived by regular changes from a single hypothetical ancestral word at the root of all the branches. The proof that Latin kentum (hundred) in the Italic branch and Lithuanian shimtas (hundred) in the Baltic branch are genetically related cognates is the construction of the ancestral root *k’m̥tom-. The daughter forms are compared sound by sound, going through each sound in each word in each branch, to see if they can converge on one unique sequence of sounds that could have evolved into all of them by known rules. (I explain how this is done in the next chapter.) That root sequence of sounds, if it can be found, is the proof that the terms being compared are genetically related cognates. A reconstructed root is the residue of a successful comparison.

Figure 1.2 The approximate geographic locations of the major Indo-European branches at about 400 BCE.

Linguists have reconstructed the sounds of more than fifteen hundred Proto-Indo-European roots.12 The reconstructions vary in reliability, because they depend on the surviving linguistic evidence. On the other hand, archaeological excavations have revealed inscriptions in Hittite, Mycenaean Greek, and archaic German that contained words, never seen before, displaying precisely the sounds previously reconstructed by comparative linguists. That linguists accurately predicted the sounds and letters later found in ancient inscriptions confirms that their reconstructions are not entirely theoretical. If we cannot regard reconstructed Proto-Indo-European as literally “real,” it is at least a close approximation of a prehistoric reality.

The recovery of even fragments of the Proto-Indo-European language is a remarkable accomplishment, considering that it was spoken by nonliterate people many thousands of years ago and never was written down. Although the grammar and morphology of Proto-Indo-European are most important in typological studies, it is the reconstructed vocabulary, or lexicon, that holds out the most promise for archaeologists. The reconstructed lexicon is a window onto the environment, social life, and beliefs of the speakers of Proto-Indo-European.

For example, reasonably solid lexical reconstructions indicate that Proto-Indo-European contained words for otter, beaver, wolf, lynx, elk, red deer, horse, mouse, hare, and hedgehog, among wild animals; goose, crane, duck, and eagle, among birds; bee and honey; and cattle (also cow, ox, and steer), sheep (also wool and weaving), pig (also boar, sow, and piglet), and dog among the domestic animals. The horse was certainly known to the speakers of Proto-Indo-European, but the lexical evidence alone is insufficient to determine if it was domesticated. All this lexical evidence might also be attested in, and compared against, archaeological remains to reconstruct the environment, economy, and ecology of the Proto-Indo-European world.

But the proto-lexicon contains much more, including clusters of words, suggesting that the speakers of PIE inherited their rights and duties through the father’s bloodline only (patrilineal descent); probably lived with the husband’s family after marriage (patrilocal residence); recognized the authority of chiefs who acted as patrons and givers of hospitality for their clients; likely had formally instituted warrior bands; practiced ritual sacrifices of cattle and horses; drove wagons; recognized a male sky deity; probably avoided speaking the name of the bear for ritual reasons; and recognized two senses of the sacred (“that which is imbued with holiness” and “that which is forbidden”). Many of these practices and beliefs are simply unrecoverable through archaeology. The proto-lexicon offers the hope of recovering some of the details of daily ritual and custom that archaeological evidence alone usually fails to deliver. That is what makes the solution of the Indo-European problem important for archaeologists, and for all of us who are interested in knowing our ancestors a little better.


Linguists have been working on cultural-lexical reconstructions of Proto-Indo-European for almost two hundred years. Archaeologists have argued about the archaeological identity of the Proto-Indo-European language community for at least a century, probably with less progress than the linguists. The problem of Indo-European origins has been intertwined with European intellectual and political history for considerably more than a century. Why hasn’t a broadly acceptable union between archaeological and linguistic evidence been achieved?

Six major problems stand in the way. One is that the recent intellectual climate in Western academia has led many serious people to question the entire idea of proto-languages. The modern world has witnessed increasing cultural fusion in music (Ladysmith Black Mambazo and Paul Simon, Pavarotti and Sting), in art (Post-Modern eclecticism), in information services (News-Gossip), in the mixing of populations (international migration is at an all-time high), and in language (most of the people in the world are now bilingual or trilingual). As interest in the phenomenon of cultural convergence increased during the 1980s, thoughtful academics began to reconsider languages and cultures that had once been interpreted as individual, distinct entities. Even standard languages began to be seen as creoles, mixed tongues with multiple origins. In Indo-European studies this movement sowed doubt about the very concept of language families and the branching tree models that illustrated them, and some declared the search for any proto-language a delusion. Many ascribed the similarities between the Indo-European languages to convergence between neighboring languages that had distinct historical origins, implying that there never was a single proto-language.13

Much of this was creative but vague speculation. Linguists have now established that the similarities between the Indo-European languages are not the kinds of similarities produced by creolization and convergence. None of the Indo-European languages looks at all like a creole. The Indo-European languages must have replaced non–Indo-European languages rather than creolizing with them. Of course, there was inter-language borrowing, but it did not reach the extreme level of mixing and structural simplification seen in all creoles. The similarities that Sir William Jones noted among the Indo-European languages can only have been produced by descent from a common proto-language. On that point most linguists agree.

So we should be able to use the reconstructed Proto-Indo-European vocabulary as a source of clues about where it was spoken and when. But then the second problem arises: many archaeologists, apparently, do not believe that it is possible to reliably reconstruct any portion of the Proto-Indo-European lexicon. They do not accept the reconstructed vocabulary as real. This removes the principal reason for pursuing Indo-European origins and one of the most valuable tools in the search. In the next chapter I offer a defense of comparative linguistics, a brief explanation of how it works, and a guide to interpreting the reconstructed vocabulary.

The third problem is that archaeologists cannot agree about the antiquity of Proto-Indo-European. Some say it was spoken in 8000 BCE, others say as late as 2000 BCE, and still others regard it as an abstract idea that exists only in linguists’ heads and therefore cannot be assigned to any one time. This makes it impossible, of course, to focus on a specific era. But the principal reason for this state of chronic disagreement is that most archaeologists do not pay much attention to linguistics. Some have proposed solutions that are contradicted by large bodies of linguistic evidence. By solving the second problem, regarding the question of reliability and reality, we will advance significantly toward solving problem number 3—the question of when—which occupies chapter 3 and chapter 4.

The fourth problem is that archaeological methods are underdeveloped in precisely those areas that are most critical for Indo-European origin studies. Most archaeologists believe it is impossible to equate prehistoric language groups with archaeological artifacts, as language is not reflected in any consistent way in material culture. People who speak different languages might use similar houses or pots, and people who speak the same language can make pots or houses in different ways. But it seems to me that language and culture are predictably correlated under some circumstances. Where we see a very clear material-culture frontier—not just different pots but also different houses, graves, cemeteries, town patterns, icons, diets, and dress designs—that persists for centuries or millennia, it tends also to be a linguistic frontier. This does not happen everywhere. In fact, such ethnolinguistic frontiers seem to occur rarely. But where a robust material-culture frontier does persist for hundreds, even thousands of years, language tends to be correlated with it. This insight permits us to identify at least some linguistic frontiers on a map of purely archaeological cultures, which is a critical step in finding the Proto-Indo-European homeland.

Another weak aspect of contemporary archaeological theory is that archaeologists generally do not understand migration very well, and migration is an important vector of language change—certainly not the only cause but an important one. Migration was used by archaeologists before World War II as a simple explanation for any kind of change observed in prehistoric cultures: if pot type A in level one was replaced by pot type B in level two, then it was a migration of B-people that had caused the change. That simple assumption was proven to be grossly inadequate by a later generation of archaeologists who recognized the myriad internal catalysts of change. Shifts in artifact types were shown to be caused by changes in the size and complexity of social gatherings, shifts in economics, reorganization in the way crafts were managed, changes in the social function of crafts, innovations in technology, the introduction of new trade and exchange commodities, and so on. “Pots are not people” is a rule taught to every Western archaeology student since the 1960s. Migration disappeared entirely from the explanatory toolkit of Western archaeologists in the 1970s and 1980s. But migration is a hugely important human behavior, and you cannot understand the Indo-European problem if you ignore migration or pretend it was unimportant in the past. I have tried to use modern migration theory to understand prehistoric migrations and their probable role in language change, problems discussed in chapter 6.

Problem 5 relates to the specific homeland I defend in this book, located in the steppe grasslands of Russia and Ukraine. The recent prehistoric archaeology of the steppes has been published in obscure journals and books, in languages understood by relatively few Western archaeologists, and in a narrative form that often reminds Western archaeologists of the old “pots are people” archaeology of fifty years ago. I have tried to understand this literature for twenty-five years with limited success, but I can say that Soviet and post-Soviet archaeology is not a simple repetition of any phase of Western archaeology; it has its own unique history and guiding assumptions. In the second half of this book I present a selective and unavoidably imperfect synthesis of archaeology from the Neolithic, Copper, and Bronze Ages in the steppe zone of Russia, Ukraine, and Kazakhstan, bearing directly on the nature and identity of early speakers of Indo-European languages.

Horses gallop onstage to introduce the final, sixth problem. Scholars noticed more than a hundred years ago that the oldest well-documented Indo-European languages—Imperial Hittite, Mycenaean Greek, and the most ancient form of Sanskrit, or Old Indic—were spoken by militaristic societies that seemed to erupt into the ancient world driving chariots pulled by swift horses. Maybe Indo-European speakers invented the chariot. Maybe they were the first to domesticate horses. Could this explain the initial spread of the Indo-European languages? For about a thousand years, between 1700 and 700 BCE, chariots were the favored weapons of pharaohs and kings throughout the ancient world, from Greece to China. Large numbers of chariots, in the dozens or even hundreds, are mentioned in palace inventories of military equipment, in descriptions of battles, and in proud boasts of loot taken in warfare. After 800 BCE chariots were gradually abandoned as they became vulnerable to a new kind of warfare conducted by disciplined troops of mounted archers, the earliest cavalry. If Indo-European speakers were the first to have chariots, this could explain their early expansion; if they were the first to domesticate horses, then this could explain the central role horses played as symbols of strength and power in the rituals of the Old Indic Aryans, Greeks, Hittites, and other Indo-European speakers.

But until recently it has been difficult or impossible to determine when and where horses were domesticated. Early horse domestication left very few marks on the equine skeleton, and all we have left of ancient horses is their bones. For more than ten years I have worked on this problem with my research partner, and also my wife, Dorcas Brown, and we believe we now know where and when people began to keep herds of tamed horses. We also think that horseback riding began in the steppes long before chariots were invented, in spite of the fact that chariotry preceded cavalry in the warfare of the organized states and kingdoms of the ancient world.


The people who spoke the Proto-Indo-European language lived at a critical time in a strategic place. They were positioned to benefit from innovations in transport, most important of these the beginning of horseback riding and the invention of wheeled vehicles. They were in no way superior to their neighbors; indeed, the surviving evidence suggests that their economy, domestic technology, and social organization were simpler than those of their western and southern neighbors. The expansion of their language was not a single event, nor did it have only one cause.

Nevertheless, that language did expand and diversify, and its daughters—including English—continue to expand today. Many other language families have become extinct as Indo-European languages spread. It is possible that the resultant loss of linguistic diversity has narrowed and channeled habits of perception in the modern world. For example, all Indo-European languages force the speaker to pay attention to tense and number when talking about an action: you must specify whether the action is past, present, or future; and you must specify whether the actor is singular or plural. It is impossible to use an Indo-European verb without deciding on these categories. Consequently speakers of Indo-European languages habitually frame all events in terms of when they occurred and whether they involved multiple actors. Many other language families do not require the speaker to address these categories when speaking of an action, so tense and number can remain unspecified.

On the other hand, other language families require that other aspects of reality be constantly used and recognized. For example, when describing an event or condition in Hopi you must use grammatical markers that specify whether you witnessed the event yourself, heard about it from someone else, or consider it to be an unchanging truth. Hopi speakers are forced by Hopi grammar to habitually frame all descriptions of reality in terms of the source and reliability of their information. The constant and automatic use of such categories generates habits in the perception and framing of the world that probably differ between people who use fundamentally different grammars.14 In that sense, the spread of Indo-European grammars has perhaps reduced the diversity of human perceptual habits. It might also have caused this author, as I write this book, to frame my observations in a way that repeats the perceptual habits and categories of a small group of people who lived in the western Eurasian steppes more than five thousand years ago.
