The Gestural Origin of Language

Michael C. Corballis. American Scientist. Volume 87, Issue 2. Mar/Apr 1999.

In 1934 the behavioral psychologist B. F. Skinner found himself seated at the dinner table with the eminent philosopher Alfred North Whitehead, and proceeded to explain to Whitehead what behaviorism was all about. Obliged to offer a challenge, Whitehead uttered the sentence “No black scorpion is falling upon this table” and then asked Skinner to explain why he might have said that. Skinner attempted a reply more than 20 years later in an appendix to his 1957 book Verbal Behavior. He proposed that Whitehead was unconsciously expressing a fear of behaviorism, likening it to a black scorpion that he would not allow to intrude into his philosophy. The skeptical reader may be forgiven for concluding that the reply owed more to psychoanalysis than to behavioral principles.

Be that as it may, Whitehead had articulated one of the properties of language that seems to distinguish it from all other forms of communication, its generativity. Whereas other forms of communication among animals seem to be limited to a relatively small number of signals, and restricted to limited contexts, there is essentially no limit to the number of ideas or propositions that we can convey using sentences. We can immediately understand sentences made up of words that we have never heard in combination before, as Whitehead’s sentence illustrates. Language also allows us to escape from the immediate present and to refer to events in other places and at other times. We can use language to fantasize, to describe events that have never existed and never will. This remarkable flexibility is achieved at least in part through the human invention of grammar, a recursive set of rules that allows us to generate sentences of any desired complexity. The eminent linguist Noam Chomsky has attributed this to a unique human endowment that he calls universal grammar. All human languages, he suggests, are variants on this fundamental endowment.

There has nevertheless been considerable progress in teaching something resembling language to captive apes. For example, a young pygmy chimpanzee called Kanzi, studied by Sue Savage-Rumbaugh of Georgia State University, has shown an impressive ability to use symbols on a computerized keyboard in a language-like way and even to understand moderately complex commands spoken in English. Nevertheless, the “utterances” that Kanzi produces typically consist of no more than two or three symbols strung together, sometimes in novel combinations, demonstrating a grammatical capacity that approximates that of a two-year-old human. Children go on to acquire a sophisticated, recursive grammar that is far beyond anything Kanzi, or any other ape, has mastered. There is little doubt that Kanzi and other great apes (and perhaps other species such as dolphins) can use symbols to represent actions and objects in the real world, but they lack nearly all of the other ingredients of true language. As Steven Pinker, a psychologist at the Massachusetts Institute of Technology, remarks in his 1994 book The Language Instinct, the apes just don’t “get it.”

Since the common ancestor of human beings and chimpanzees lived some 5 million years ago, it is a reasonable inference that grammatical language must have evolved in the hominid line at some point following the split from the line that led to the modern chimpanzee. There has been much disagreement as to when this might have happened. Some linguists, such as Derek Bickerton of the University of Hawaii, have supposed that it is impossible to conceive of grammar as having been formed incrementally; it must therefore have evolved as a single catastrophic event, probably late in hominid evolution. Indeed, Bickerton and others have suggested that it may have coincided with the emergence of Homo sapiens in Africa a mere 150,000 or so years ago. It might explain why H. sapiens came to dominate and ultimately replace all other hominid species, such as the Neanderthals in Europe or Homo erectus in Southeast Asia. Philip Lieberman of Brown University has also argued, on the basis of fossil evidence, that the vocal apparatus necessary to support articulate speech did not emerge until late in hominid evolution, and that even the Neanderthals, who survived until about 30,000 years ago, would have been severely challenged vocally. In his 1998 book Eve Spoke: Human Language and Human Evolution, he too argues that it was language that distinguished our own species from all other hominids. Attempts to trace present-day languages back to the original mother tongue (also known as Proto-World) also suggest a recent origin rather than one that antedates H. sapiens.

The position that language is a recent invention raises the question of whether the vocalizations of living primates (the alarm calls of monkeys and the hooting of apes, for example) are in some form related to human spoken language. Presumably our ancestors were also capable of such vocalizations, so why couldn’t these calls have evolved into what we recognize as language? The strongest argument against this scenario is that human language and primate vocalizations are fundamentally very different phenomena. As Chomsky observed in his 1966 book Cartesian Linguistics, human speech is unbounded in its capacity to express thought and in its freedom from stimulus control, whereas animal communication systems consist of either a fixed number of signals or a fixed number of “linguistic dimensions,” each associated with a nonlinguistic dimension. Peter MacNeilage of the University of Texas at Austin has also noted that primate vocalizations are “holistic,” containing a message in themselves, whereas human vocalizations can be combined in novel ways to create a message. In my view, it seems more likely that the call-like vocalizations of our ancestors have persisted in the emotional cries of modern human beings, such as crying, laughing and screaming, rather than in speech.

Yet it is difficult to accept that an accomplishment as complex as human language could have evolved as an all-or-none event (a “big bang,” as it were) late in the evolution of our species. Steven Pinker and his colleague Paul Bloom, now at the University of Arizona, argue that it must have evolved gradually, shaped by natural selection. Some primatologists, such as Richard W. Byrne of the University of St. Andrews, argue that the cognitive prerequisites of language (such as the ability to adopt the mental perspective of another individual) are present in the great apes, and they therefore antedated the split of our hominid ancestors from the chimpanzee line, probably by several million years.

How are we to reconcile these alternative perspectives? At least a partial answer is that language emerged not from vocalization, but from manual gestures, and switched to a vocal mode relatively recently in hominid evolution, perhaps with the emergence of H. sapiens. This idea was suggested by the 18th-century French philosopher Etienne Condillac and revived in the 1970s by the American anthropologist Gordon W. Hewes. It has not found general favor among linguists or anthropologists, perhaps because it lacks parsimony and because there is no direct evidence that any of our hominid ancestors gestured rather than spoke. Even so, argument in its favor has continued to grow.

Primate Preadaptations?

Consider first some facts about primate evolution. Primates are largely visual animals, perhaps as a result of a common adaptation to visual predation. In people as in monkeys, vision is much more highly developed than any other sense modality, including hearing. Further, with the exception of humans, primates have much better cortical control over movements of the hands than over vocalization, which is largely restricted to emotionally based sounds controlled by subcortical structures. This means that the early hominids would have been much better preadapted for expressive, voluntary communication using the hands, and it perhaps explains why attempts to teach chimpanzees a version of sign language have been much more successful than attempts to teach them anything resembling human vocal language. In one early study, for instance, a chimpanzee raised in a human family could learn to speak only three or four words, whereas gorillas (such as Koko at the Gorilla Foundation) and chimpanzees (such as Washoe and Tatu, now at the Chimpanzee and Human Communication Institute) have been successfully taught several hundred manual signs to represent different objects or actions.

Reciprocity of gesture, as a precursor to language, may go back even further, to our common origins with apes and monkeys perhaps 25 or 30 million years ago. Giacomo Rizzolatti and his colleagues at the University of Parma in Italy have recorded from single neurons (in the premotor cortex of monkeys) that are active when the monkeys make particular gestures of reaching. Some of these neurons, which Rizzolatti and his colleagues have dubbed mirror neurons, are also active when the monkeys watch a person (or, presumably, another monkey) making the same gesture. These cells are in an area of the monkey cortex that appears to be homologous to Broca’s area in the human brain, which is critically involved in the programming of human speech. Mirror neurons presumably have more to do with giving and receiving items of food than with language, but Rizzolatti and Michael A. Arbib, a computer scientist at the University of Southern California, have proposed that they are preadaptations for the subsequent elaboration of language. They may also be precursors of the ability to take the mental perspective of others, which Byrne and others have regarded as a necessary prerequisite for language.

Freedom of Hand

Nonhuman primates are nevertheless restricted in the use of the hands for communication, since the hands and arms are also critically involved in postural support and locomotion. Most primates are adapted to life in the trees, using their arms for holding onto branches and for swinging from branch to branch. The larger-bodied apes are more terrestrial but move in four-legged fashion over open terrain. Chimpanzees and gorillas, our closest relatives among the primates, adopt a form of locomotion known as knuckle walking, in which the upper body is supported by the knuckles. In contrast, a primary characteristic of the hominid line (going back at least 4 million years) is bipedalism, involving an upright stance in which the hands and arms are largely freed from any involvement in posture or locomotion. This would surely have given a significant boost to their use for a variety of other activities, including expressive communication. Just why bipedalism evolved has been a matter of much speculation. Among the suggested advantages that may have led to its selection are the freeing of the hands for tool use or for carrying things, but expressive communication may have also played a pivotal role.

The split between the hominids and the great apes may have been forged by the formation of the Great Rift Valley in Africa. The apes that were to become hominids were largely confined to the east of this valley. The recent discovery of a 3.5-million-year-old australopithecine fossil in Chad, which is well to the west of the valley, has raised some doubts about the so-called “East Side Story.” Nevertheless, all other hominid fossils dated from just over 4 million to just under 2 million years ago have been found to the east, where forested areas gave way to open savanna-like territory. In this environment the early hominids would have been especially vulnerable to attack from much more specialized and effective hunters and killers, the precursors of modern tigers, lions and hyenas. This may have led to selection for enhanced social cooperation and cohesiveness, in which efficient communication would have been especially important.

In such an environment, gestural communication would be much more effective than vocal communication. First, it is silent, so there is little risk of alerting predators or prey to one’s presence. Gesture allows stealth. Second, it is fundamentally spatial, and much of the information to be communicated would be spatial, such as the whereabouts of dangerous predators, easy prey or carcasses to be scavenged. It may be that pointing was among the earliest communicative gestures on the savanna. Indeed, young children learn to point very early on in development, whereas other primates never point. Merlin Donald of Queen’s University in Kingston, Ontario, has suggested that early communication was based on mime, involving the whole body rather than just the hands and arms, and lives on in modern dance and “body language.” Donald’s view is that this form of communication is distinct from language, whereas my own view is that it was a precursor. Whatever the case, there is little question that gestural language is more iconic than vocal language. That is, gestural language more directly captures the actual shapes of things and their dispositions in space. Given the preadaptation to voluntary control of the upper limbs and to the mapping of manual action onto the perception of action, it would have been natural for our early ancestors to develop gesture rather than vocalization for the purposes of intentional communication.

If the earliest language were indeed gestural, this would help to explain one of the mysteries of the evolution of speech: how words came to represent objects and events in arbitrary fashion. Words are abstract rather than iconic. With very few exceptions, such as onomatopoeic words like “buzz” or “shriek” or Tennyson’s “murmuring of innumerable bees,” there is nothing in the actual sound of a word that gives a clue as to its meaning. It has been argued that the earliest words did in fact mimic their referents, a notion pejoratively dubbed the “bow-wow theory” by the 19th-century Oxford philologist Max Müller. But this is regarded as rather implausible, not least because spoken language is unidimensional, structured in time and not space, whereas critical events in our world are four-dimensional, structured in time and space. This restriction does not apply to manual gestures, which might well have emerged from early attempts to physically mimic the physical world. But what may have begun as an iconic system could plausibly have evolved more abstract properties over time, and at some point arbitrary patterns of sound may have been linked to gestures that may themselves have become abstract symbols rather than icons.

Gestural Language Today

Gestures are not simply figments from some imaginary past. David McNeill of the University of Chicago has shown that manual gestures are intricately interwoven into our present-day speech patterns. Speech carries the burden of grammar and most of the load of symbolic representation so that we can listen to taped lectures or listen to the radio with little loss of information. Nevertheless, gesture supplies a visual, iconic component that can provide extra information or circumvent prolonged explanation. Ask someone to tell you what a spiral is or to tell you the size of the fish they claim to have caught. Moreover, people naturally resort to manual gestures when trying to communicate with people who speak a different language. Susan Goldin-Meadow and colleagues at the University of Chicago have shown that gesture quickly takes on a grammatical role if people are prevented from speaking.

Gestural languages have been observed in religious communities pledged to vows of silence as well as in other situations where speech is difficult or forbidden. Among the most intricate of living manual languages are those invented by Aboriginal Australians. These do not provide direct evidence that gestural language preceded vocal language since they are in fact based on vocal language, but they can function autonomously and are fully grammatical. They appear to have originated in the North Central Desert of Australia and spread from there. They are used in part to overcome speech taboos, which are observed by women in the North Central Desert following the death of a close relative, and are also imposed on male novices in initiation. Sign languages have also been widely used by the Plains Indians of North America, where they seem to have served mainly to allow tribes who spoke different languages to communicate with one another.

The most abundant and extensively studied manual languages are the sign languages invented by the deaf. Sign languages were not recognized as legitimate languages until the late 18th century, beginning in France. In 1864 the U.S. Congress passed legislation recognizing a national deaf mute college that was later named Gallaudet College (now Gallaudet University) after its first principal. In the late 19th century, however, there was a strong reaction against the use of sign language, and often fruitless efforts were made to teach the deaf to speak. At the International Congress of Educators of the Deaf, held in Milan in 1880, there was a vote in favor of “oralism” (teaching the deaf to lipread or make use of whatever residual hearing they may have), and sign language was declared officially prohibited. This attitude persisted until recently, with the consequence that deaf children typically left school with drastically reduced literary and pedagogical skills.

The tide began to turn again in the late 1950s, largely through the efforts of William C. Stokoe, a professor (now retired) at Gallaudet University. Even though sign language was not properly recognized even there, Stokoe saw that the students nevertheless used it abundantly and that it had the characteristics of a true language. Stokoe and others have amply confirmed this in research, and American Sign Language (ASL) is now the recognized language at Gallaudet University. Students are taught all the usual subjects (mathematics, chemistry, philosophy and even poetry) without a word being spoken. There are countless different sign languages invented by deaf people all over the world, and there is little doubt that they are genuine languages with fully developed grammars. The spontaneous emergence of sign languages among deaf communities everywhere confirms that gestural communication is as “natural” to the human condition as is spoken language. Indeed, children exposed from an early age only to sign language go through the same basic stages of acquisition as children learning to speak, including a stage when they “babble” silently in sign!

If there is truth to Chomsky’s notion of “universal grammar,” it seems to apply as much to sign language as to spoken language. In a study recently conducted by Susan Goldin-Meadow and Carolyn Mylander of the University of Chicago, eight deaf children born to hearing parents were found to have created sign languages that were much more sophisticated than the primitive gestures used by their parents to teach them. Four of the children were raised in the United States and four in China, yet there was much more in common between the children’s sign languages across the two cultures than between the signing of the children and their parents. The children of both countries spontaneously produced complex sentences (expressing more than one proposition), and they ordered their gestures in a similar fashion. The children also spontaneously adopted an ergative structure to their signing, in which intransitive actors are distinguished from transitive ones. For example, the word “mouse” is an intransitive actor in the sentence “The mouse goes to the hole,” whereas it is a transitive actor in the sentence “The mouse eats the cheese.” In an ergative language, the sign for mouse would be different in the two contexts. The Chinese and English languages make no such distinctions. Such studies give quite strong support to the idea that there is an inborn component to children’s development of language, whether spoken or signed.

The innate capacity of human beings to communicate by gesture is also evident in a study of the congenitally blind, people who could not possibly have acquired the habit by observing others. Goldin-Meadow, this time with her colleague Jana Iverson at Indiana University, observed that 12 blind speakers gestured as they spoke at the same rate as a group of sighted people, conveying the same information and using the same range of gesture forms! (For example, a tilted C-shaped hand in the air was used to indicate that a liquid had been poured from a container.) Remarkably, the blind people would gesture while they spoke regardless of whether the listener was sighted or not, suggesting that gestures are tightly coupled to the act of speaking. Such coupling has its origins in the brain.

Gesture on the Brain

Like spoken language, sign language among deaf people appears to depend critically on the left side of the brain. Indeed, left-sided brain damage can produce deficits in signing that parallel the deficits in spoken language. For example, anterior lesions in the vicinity of Broca’s area produce deficits in expressive signing, whereas more posterior lesions result in deficits in the comprehension of signing. In a study of brain activity as measured by functional magnetic resonance imaging, Helen J. Neville and her colleagues of the University of Oregon have confirmed that both Broca’s and Wernicke’s areas, the two main language-mediating areas in the left side of the brain, are activated in deaf signers while they watch sentences in ASL. This activity was similar to that observed in hearing people when they listened to spoken sentences. Unexpectedly, the deaf showed much more activity in the right side of the brain than did the hearing people, perhaps reflecting the more prominent spatial component (which is known to be predominantly a right-hemisphere function).

A gestural origin for language may well explain the close association between handedness and the cerebral asymmetry for language. Nearly all right-handers are dominant in the left cerebral hemisphere for speech. Among left-handers the relation is less clear, with perhaps 60 percent being left dominant for speech and the remainder about equally divided between those who are right dominant and those with bilateral representation in the brain. Doreen Kimura, now at Simon Fraser University, has also noted that right-handers tend to gesture with their right hands when they speak, whereas left-handers tend to gesture with both hands. A possible explanation for this pattern of association, suggested by genetic models of handedness proposed independently by Marian Annett of the University of Leicester and Christopher McManus of University College London, is that there may be an allele (a variant of a gene) predisposing left-brain dominance for both speech and hand control. In those lacking this allele, handedness becomes a matter of chance, and handedness and cerebral dominance for speech are decoupled. Selection of the left-brain dominance gene may well have occurred during a period in hominid evolution when vocalization began to emerge as an accompaniment to gesture, eventually to replace it as the dominant mode.

Of course, there are other interpretations of the anatomical relation between hand gestures and the language areas in the brain. Elizabeth Bates of the University of California at San Diego suggests that language is a parasitic system that is overlaid on areas of the brain that originally evolved to do more basic kinds of sensorimotor work. Indeed, the areas that serve language continue to do the same non-linguistic work today: “They have not given up their day job,” according to Bates. This would include the motor areas of the frontal cortex, together with sensory regions that mediate both the perception of sound and the multiple experiences that go into what we call “meaning.” From this sensorimotor point of view, language and gesture are planned and executed together all the time because they are running off of the same neural systems, and the planning of language inevitably “leaks” over into gesture, which is a by-product.

In this view, hand gestures are something of a “fifth wheel” in the evolution of language, only along for the ride. I would argue, however, that the richness of human sign languages and hand gestures belies such a superfluous evolutionary origin for these phenomena. Gestures are not haphazardly associated with speech; they can convey information in a systematic way. Indeed, in sign languages, gestures carry information completely free of the spoken word. Rather than being a “fifth wheel,” perhaps gestures are the remains of the “unicycle” on which language first evolved.

From Hand to Mouth

If language originated in manual gestures, why do modern-day human beings speak? Although the early hominids would have been much better preadapted to manual communication, and silent signs may have been preferred on the savanna, there were surely eventual advantages to switching to vocalization. For one thing, speech can be carried on in the dark, or when obstacles prevent communicating parties from viewing one another, or over relatively long distances. Goldin-Meadow and her colleagues have made the further point that if hands and voice are to share the burden of communication, it is more efficient for speech to carry the grammatical component, leaving the iconic component to the hands, than to have the hands carry syntax as well as meaning. More importantly, perhaps, speech would have freed the hands yet again, allowing our ancestors to verbally instruct others in manual arts, such as the use and manufacture of tools, while at the same time demonstrating them.

It is unlikely that the switch from gestural to spoken language was sudden. Vocalized grunts and squeals would surely have punctuated early gestural language, just as gestures embellish modern vocal language (especially, one is tempted to observe, among the Italians). But vocal communication would have required extensive alterations to the vocal tract, as well as a shift from predominantly subcortical to cortical control over vocalization. As Philip Lieberman has argued, these alterations were probably not complete until relatively late in hominid evolution, perhaps only with the emergence of H. sapiens some 100,000 to 150,000 years ago. Moreover, spoken language requires more precise programming of motor sequences, since the message depends exclusively on temporal sequencing. In sign language, by contrast, grammatical information is often conveyed in parallel with other aspects of the message. For example, a sentence may be converted from a statement to a question by raising the eyebrows while the sentence is signed or into a negative by shaking the head. The pressure for accurate timing in the speech signal may well have sharpened selection for unilateral control, since any relaying of neural information between the two sides of the brain would result in some loss of temporal precision.

According to this scenario it was not language that uniquely characterized H. sapiens, but rather the shift to a form of expression in which vocalization could carry the primary communicative burden, including the all-important grammatical component. It may indeed have been “Eve” who spoke, but her predecessors probably communicated effectively with a combination of gestures and vocalizations.

Language may well have begun to evolve as a generative, grammatical system from the emergence of the genus Homo over 2 million years ago. In early Homo we see the first clear signs of an increase in brain size and the first evidence for manufactured stone tools. But although stone tools became somewhat more sophisticated in succeeding species of Homo, technology appears to have been remarkably static for almost two million years, perhaps held back by the involvement of the hands in communication. By the reckoning of some, it was not until the Aurignacian period in Europe, beginning some 40,000 years ago, that tools began to display a greater variety and sophistication. However, the emergence of sophisticated technology in Europe appears to have coincided with the arrival in Europe of H. sapiens, which eventually replaced the incumbent Neanderthals. Recent discoveries, including 90,000-year-old artifacts made of bone in Zaire, suggest that the technological revolution began much earlier and may have been initiated by H. sapiens in Africa.

It is now considered likely that H. sapiens emerged in Africa and radiated from there sometime between 60,000 and 100,000 years ago, replacing other hominid species who had migrated earlier, including the Neanderthals in Europe and Homo erectus, who may have persisted in Java until as recently as 27,000 years ago. What was it that enabled our species to prevail over these other large-brained hominids, whose forebears had migrated much earlier? Perhaps the most plausible answer is that they prevailed because of superior technology. But that technology might have resulted, not from an increase in brain size or intelligence, but from a switch from manual to vocal language that allowed them to use their hands for the manufacture of tools and weapons and their voices for instruction.