What is Language?

A language is a system of symbols, generally known as lexemes, together with the rules by which they are manipulated. The word language is also used to refer to the whole phenomenon of language, i.e., the properties common to all languages. Language is commonly used for communication, though it has other uses.

Language is a natural phenomenon, and language learning is common in childhood. In their usual form, human languages use patterns of sound or gesture for the symbols in order to communicate with others through the senses. Though there are thousands of human languages, they all share a number of properties from which there are no known deviations. There is no defined line between a language and a dialect, but it is often said that a language is a dialect with an army and a navy.

Humans have also constructed other languages, including Esperanto and Klingon, programming languages, and various mathematical formalisms. These languages are not necessarily restricted to the properties shared by natural human languages.

Some of the areas of the brain involved in language processing: Broca's area, Wernicke's area, the supramarginal gyrus, the angular gyrus, and the primary auditory cortex.

The information in this article is adapted from Wikipedia and is freely distributable thanks to the GNU Free Documentation License (GNU FDL).


Human Languages

Human languages are usually referred to as natural languages, and the science studying them is linguistics.

Making a principled distinction between one language and another is usually impossible. The boundaries between named language groups are in effect arbitrary because neighboring populations blend into one another (the dialect continuum). For instance, a few dialects of German are similar to some dialects of Dutch.

Some like to make parallels with biology, where it is not always possible to make a well-defined distinction between one species and the next. In either case, the ultimate difficulty may stem from the interactions between languages and populations. (See Dialect or August Schleicher for a longer discussion.)

The concepts of Ausbausprache, Abstandsprache, and Dachsprache are used to make finer distinctions about the degrees of difference between languages or dialects.

Scientists do not yet agree on when language was first used by humans (or their ancestors). Estimates range from about two million years ago, during the time of Homo habilis, to as recently as forty thousand years ago, during the time of Cro-Magnon man. The ephemeral nature of speech means that there is almost no data on which to base conclusions on the subject.

Properties of Language

Languages are not just sets of symbols. They also contain a grammar, or system of rules, used to manipulate the symbols. While a set of symbols may be used for expression or communication, it is primitive and relatively unexpressive, because there are no clear or regular relationships between the symbols. Because a language also has a grammar, it can manipulate its symbols to express clear and regular relationships between them.

For example, imagine going on a walk with a person who only knew individual symbols, or words. If you saw a dog, he might say, "Dog scare" or "Scare Dog". Although any English speaker would have some notion of what he was talking about, the relationship between the words is unclear. Is he scared of dogs? Or just that dog? Or does he want to scare the dog off? Does he think the dog is scared? But if you respond, "I’m not scared of dogs," the relationship between dog and scare is quite apparent and hence the meaning of the utterance.
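The contrast between bare symbols and symbols governed by a grammar can be sketched in a few lines of Python. This is a toy illustration rather than a linguistic model: the function names `interpret_bag` and `interpret_svo`, and the fixed subject-verb-object rule, are assumptions made for the example.

```python
def interpret_bag(words):
    """No grammar: we only know which symbols occurred, not how they relate."""
    return frozenset(words)


def interpret_svo(words):
    """Toy grammar (an assumption for this sketch): word order is
    subject, verb, object, so position determines each word's role."""
    subject, verb, obj = words
    return {"who": subject, "does": verb, "to_whom": obj}


# Without rules, "Dog scare" and "Scare dog" cannot be told apart.
same = interpret_bag(["dog", "scare"]) == interpret_bag(["scare", "dog"])

# With a word-order rule, the position of "dog" fixes its role.
a = interpret_svo(["dog", "scares", "me"])
b = interpret_svo(["I", "scare", "dog"])

print(same)
print(a)
print(b)
```

In the first case the two utterances collapse into the same unordered set; in the second, the rule alone tells the listener whether the dog is doing the scaring or being scared.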

Another important property of language is the arbitrariness of the symbols. Any symbol can be mapped onto any concept (or even onto one of the rules of the grammar). For instance, there is nothing about the Spanish word nada itself that forces Spanish speakers to use it to mean nothing. That is the meaning all Spanish speakers have memorized for that sound pattern. But for Croatian speakers nada means hope.

However, it must be understood that just because in principle the symbols are arbitrary does not mean that a language cannot have symbols that are iconic of what they stand for. Words such as meow sound similar to what they represent, but they could be replaced with words such as jarn, and as long as everyone memorized the new word, the same concepts could be expressed with it.
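The arbitrariness of symbols can be pictured as a lookup table that each speech community memorizes. The following Python sketch is only an illustration: the `lexicons` structure and the helper names `meaning` and `rename` are hypothetical, while the nada and meow/jarn examples come from the text above.

```python
# Each language pairs sound patterns with concepts by convention only.
lexicons = {
    "Spanish": {"nada": "nothing"},
    "Croatian": {"nada": "hope"},
}


def meaning(language, word):
    """Look up the concept a community has memorized for a sound pattern."""
    return lexicons[language].get(word)


def rename(lexicon, old_word, new_word):
    """Return a copy of the lexicon with one symbol swapped for another.
    The concept it stands for is unchanged."""
    updated = dict(lexicon)
    updated[new_word] = updated.pop(old_word)
    return updated


english = {"meow": "sound a cat makes"}
revised = rename(english, "meow", "jarn")

print(meaning("Spanish", "nada"))
print(meaning("Croatian", "nada"))
print(revised)
```

The same key maps to different values in different tables, and replacing a key leaves the value untouched, which is all that arbitrariness requires.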

Origins of Language

The origin of language (glottogony, glossogeny) is a topic that has been written about for centuries, but the ephemeral nature of speech means that there is almost no data on which to base conclusions on the subject. We know that, at least once during human evolution, a system of verbal communication emerged from proto-linguistic or non-linguistic means of communication, but beyond that little can be said. No current human group, anywhere, speaks a "primitive" or rudimentary language. While existing languages differ in the size and subjects covered in their several lexicons, all human languages possess the grammar and syntax needed, and can invent, translate, or borrow the vocabulary needed to express the full range of their speakers' concepts.

Homo sapiens clearly have an inherent capability for language that is not present in any other species known today. Whether other extinct hominid species, such as Neandertals, possessed such a capacity is not known. The use of language is one of the most conspicuous and diagnostic traits that distinguish H. sapiens from other animals.

According to one Biblical account, the observed variety of human languages originated at the Tower of Babel with the confusion of tongues. (Image from Gustave Doré's Illustrated Bible).


One of the earliest accounts of the origin of languages is in the Hebrew Bible, in the book of Genesis (dated to the early 1st millennium BC). Genesis 2:19-20 has God giving Adam the task of assigning names to all the animals and plants he had in Eden (see nomothete).

The key biblical narrative of the observed linguistic variety is the story that God punished human presumption in building the Tower of Babel (see confusion of tongues) (Genesis 11:1-9). Additionally, Genesis 10:5 tells how, before Babel, the languages of the descendants of Japhet were divided naturally. This is most likely due to the narrative style of Genesis, in which an event was explained following its introduction into the narrative.

Most mythologies do not credit humans with the invention of language, but know of a language of the gods (or language of God) predating human language. Mystical languages used to communicate with animals or spirits, such as the language of the birds, are also common, and were of particular interest during the Renaissance.

History contains a number of anecdotes about people who attempted to discover the origin of language by experiment. The first such tale was told by Herodotus, who relates that Pharaoh Psamtik (probably Psammetichus I) had two children raised by deaf-mutes to see what language they would end up speaking. When the children were brought before him, one of them said something that sounded to the pharaoh like bekos, the Phrygian word for bread. From this, Psamtik concluded that Phrygian was the first language. King James IV of Scotland is said to have tried a similar experiment; the children in his care were supposed to have ended up speaking Hebrew. Both the Holy Roman Emperor Frederick II and Akbar, a 16th-century Mughal emperor of India, are said to have tried similar experiments; the children involved did not speak.

Anthropological Hypotheses on the Origins of Language

Steven Pinker, following Noam Chomsky and ultimately Immanuel Kant, believes that humans are born with a "language instinct": a neural processing network that contains a universal grammar and that has developed specifically for encoding and decoding human languages.

Derek Bickerton has suggested that the language faculty may have evolved in two major steps. The first is a protolanguage of symbolic representation and verbal and/or gestural signs, and the second is formal syntax. Symbolic representation would allow modeling of reality and constructional learning, and, together with some communicative ability, would permit shared learning. Syntax would permit significantly improved precision and clarity in thought and communication.

The evolution of such an inherited trait in the genus Homo may be one thing that explains why anatomically modern humans expanded at the expense of other hominid species in the history of human evolution. Many mainstream theories of human evolution affirm that all current human beings are the descendants of a relatively small population of anatomically modern humans that appeared in Africa less than one million years ago. The development of an inherited gift for language, or its superior attainment over other species of Homo such as Neandertal man, is one possible explanation for the ascendency of anatomically modern humans over other primitive human groups at the time. At least one gene, FOXP2, is claimed to be involved with the development of language.

Linguistic Hypotheses on the Origins of Language

A fundamental problem of language origin is the Continuity Paradox: language acquisition apparently occurs only in situations involving pre-existing languages, or at the very least pidgin communication. In the 19th century, philosophers and linguists proposed a number of hypotheses to explain the origin of language, which are noteworthy for their names even if none of them has vanquished its competitors in the battle for scientific credibility. The first such names were coined by Otto Jespersen as a way of deriding the hypotheses as simplistic speculation. Once the names caught on, newer hypotheses were often given names in a similar style. It seems unlikely that one hypothesis describes the whole process; more likely, multiple mechanisms described by multiple hypotheses, working together or one after another, contributed to the development of language.

* Ding-Dong

This hypothesis places the origin of human language in onomatopoeia: the various imitative sounds that humans make to mimic the sounds of the world around them. For example, in English, boom is the sound of thunder, oink is the sound made by a pig, and tweet is the sound made by a small bird. Of course, many languages contain their own onomatopoeic words (e.g., in Basque, ai-ai, which means "ouch-ouch", refers to a knife).

This hypothesis has not met with universal acceptance, for several reasons. It does not adequately explain the creation of words for inanimate objects, such as rocks, much less prepositions and other grammatical particles or abstract concepts. Words marked by onomatopoeia are also conspicuous and somewhat unusual in most languages. The "ding-dong" hypothesis is therefore not considered to be a complete explanation for the origin of language.

* Bow-wow

Similar to the "ding-dong" hypothesis, this one has humans forming their first words by imitating animal sounds.

All of the objections to other sorts of onomatopoeic explanation apply here. It is also worth noting that the names of animal sounds are strongly culturally determined and differ remarkably from one culture to the next, as the article on oink sets forth. It seems difficult to accept that humans learned to speak to one another by talking to the animals.

* Pooh-pooh

According to this hypothesis, the first words developed from sighs of pleasure, moans of pain, and other semi-involuntary cries or exclamations. These vocalisms then became the names of the phenomena that made people say them.

Most of the objections to the "ding-dong" hypothesis apply here also. Such words are found in most languages; they are conspicuous by their preverbal nature and incomplete assimilation into the lexicon. Moreover, they are culturally determined, and themselves show a great deal of arbitrariness.

* Ta-ta

Charles Darwin lent his authority to this hypothesis. According to this, human language represents the use of oral gestures that began in imitation of hand gestures that were already in use for communication. Vilayanur S. Ramachandran's research into synesthesia and sound symbolism would seem to support this hypothesis.

The difficulty with this hypothesis is that it begs the question: it requires that a fairly sophisticated repertoire of gestures be in place already for humans to imitate with their mouth gestures. It assumes the existence of a language of gestures without explaining how it arose (however, see Nicaraguan Sign Language). At any rate, though sign languages do have somewhat imitative (or iconic) gestures, they also contain quite arbitrary symbols, and gestures have vastly different meanings in different human cultures.

One other difficulty with this hypothesis is that hand gestures and facial expressions are useless unless they are seen. That means there must be either daylight or firelight, with nothing blocking one's view. For facial expressions, the communicators must also be facing each other. In addition, hand gestures are difficult if the hands are doing something else.

* Uh-oh

According to this hypothesis, human language begins with the use of arbitrary symbols that represent warnings to other members of the human band. It is agreed that one sort of vocal cry means that lions have been spotted in the area, and another one indicates a snake. You holler one thing at your neighbour to warn them, "Don't eat that! It'll make you sick!" and something distinguishable to warn them "Don't eat that! It's mine!"

This hypothesis seems to have the potential to explain the perceived diversity of human speech; obviously the warning cries uttered here are to some measure arbitrary. It is less certain that this hypothesis could explain how more abstract features of human language developed.

* Yo-he-ho

According to this hypothesis, language arose in rhythmic chants and vocalisms uttered by people engaged in communal labour.

This may have more to do with the origins of poetry than with language itself. Sea chanteys, jody calls, and similar work songs all show humans engaged in communal work improvising with their language around the rhythms of their work. It is uncertain from this hypothesis how meanings came to be associated with the vocalisms uttered by the workers.

* Watch the Birdie

This one is associated with ethologist and linguist E. H. Sturtevant. According to this hypothesis, human language became elaborated because humans found selective advantage in being able to deceive other humans. Since exclamations and vocalisms can involuntarily reveal your true mental state, humans learned to feign them in order to deceive others for selfish advantage.

* The Psychedelic Glossolalia Hypothesis

This hypothesis holds that speech was inspired by psychoactive fungi. The line of reasoning runs as follows. A common symptom of tryptamine intoxication is glossolalia, more commonly known as “speaking in tongues”. As the continent of Africa began to dry, grassland savannas opened, forcing humans out of the forests and into the plains where the dung of large herbivores was ubiquitous. Species of tryptamine-bearing fungi like Psilocybe, which live on animal dung, would have been very attractive to human populations seeking a new food source. Regular ingestion of the fungi could, over a long time, have stimulated complex vocalizations that eventually led to communicative speech.

* Non-naturalistic hypotheses of the origin of language

Some people resort to traditional narratives, myths, or legendary history in order to explain the origin of human language.


A related question concerns the possibility of linguistic monogenesis, a hypothesis that holds that there was one single protolanguage (the "Proto-World language") from which all other languages spoken by humans descend. The linguists Joseph Greenberg and Merritt Ruhlen have advocated such a position. The reconstruction of such a protolanguage, if it exists, would be the Holy Grail of historical linguistics.

Some have gone as far as to claim that there are etymological root words found in all languages; one such claimed universal root is *âkwa, meaning "water". Nicholas Marr contended that the protolanguage had been composed of merely four roots, *sal, *ber, *yon and *rosh, to which all other words may be traced.

These suggestions remain extremely controversial; many linguists insist that phonetic laws must first be proposed that explain how these roots took their forms in the "daughter" languages, and in the absence of such explanation they reject the entire hypothesis. For these linguists, there may or may not have been such an original protolanguage; the intervening centuries of linguistic change have obscured any trails needed to recover it.

Biologists do not yet agree on when or how language use first emerged among humans or their ancestors. Estimates of the time frame of its origin range from forty thousand years ago, during the time of Cro-Magnon man, to about two million years ago, during the time of Homo habilis.

Some authorities believe that language arose suddenly, about 40,000 years ago. This is the time period from which we first see cultural artifacts, such as cave paintings and carved figurines. The relatively sudden appearance of these artifacts leads some to speculate that the cultural leap may have been prompted by the development of language, which in turn allowed greater creativity to flourish.

Studies of the skulls of Neandertals (approximately 60,000 years ago) indicate that they would not have been capable of the full range of vowels used by modern humans. However, as pointed out by linguist Steven Pinker, a full range of vowels is not necessary for rudimentary speech. Even relatively complicated speech would be possible so long as a sufficient number of distinguishable consonants were in use.

Fossil evidence indicates that the main areas of the brain associated with language (Broca's area and Wernicke's area) may have begun to enlarge as long as 1 to 1.5 million years ago, in Homo erectus. However, the most complete H. erectus fossil (nicknamed Turkana Boy; about 1.5 million years old) appears to have lacked a ribcage capable of the fine control that speech requires.

The ancestors of the recently discovered Homo floresiensis are assumed to have used some kind of seafaring device, such as a raft, to reach the island where H. floresiensis dwelt. Furthermore, it seems probable that this colonization was intentional. Given the complexity of such a task, it is suggested that H. floresiensis and its ancestor, mid-to-late H. erectus, must have possessed some form of language which, albeit primitive, would have been able to convey complex concepts. Analysis of the brain of H. floresiensis suggests intellectual capabilities comparable to those of other humans of that time, and thus not widely divergent from those of primitive H. sapiens.

Spontaneous Emergence of Grammar

From Romulus and Remus forward, there have been a number of accounts of wolf children or feral children raised by wild animals or out of human contact. These accounts exist mostly in anecdote and hearsay, but most of them affirm that these children never learned to speak a language, or learned it imperfectly. There have also been accounts of twins who spoke an unintelligible language only their sibling understood. These cases are better documented. In the 1970s, the Kennedy twins, whose given names were Grace and Virginia, called each other Poto and Cabengo; it was determined that their idiosyncratic speech was a deeply altered form of English, with some influence from their grandmother's German. It appeared to be a well-formed language, with rules governing grammar and syntax. Similarly idiosyncratic speech patterns were reported from the twin writers June and Jennifer Gibbons.

Even in the absence of the unusual social lives of twins, many people have found it relatively easy and natural to construct new languages, with lexicons either derived from pre-existing languages or wholly imagined. The author J. R. R. Tolkien, with his several languages of Middle-earth, is one well-known creator; there are many others. Contact languages spontaneously arise when people speaking dissimilar languages must mingle with each other for sufficient periods. These contact languages, or "pidgins," give rise to "creoles" if they become mother tongues in their own right. All of these creations also bear witness to the fact that the use and acquisition of language is a human trait that can manifest itself spontaneously, without formal instruction, and under adverse circumstances.

The recent development of Nicaraguan Sign Language, starting in 1979, seems to be an independent invention of language from scratch: "It's the first and only time that [linguists have] actually seen a language being created out of thin air." However, it may be connected to paralinguistic gestures used by Spanish-speaking Nicaraguans as part of their language.

Language Taxonomy

The classification of natural languages can be based on different underlying principles (different notions of closeness, reflecting different properties of and relations between languages). The main directions of present-day classification are:

* paying attention to the historical evolution of languages results in a genetic classification of languages—which is based on genetic relatedness of languages,
* paying attention to the internal structure of languages (grammar) results in a typological classification of languages—which is based on similarity of one or more components of the language’s grammar across languages,
* and respecting geographical closeness and contacts between language-speaking communities results in areal groupings of languages.

The different classifications do not match each other and are not expected to, but the correlation between them is an important topic in much linguistic research. (There is a parallel to the classification of species in biological phylogenetics here: consider monophyletic vs. polyphyletic groups of species.)

The task of genetic classification belongs to the field of historical-comparative linguistics; that of typological classification belongs to linguistic typology.
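The point that different classifications partition the same languages differently can be made concrete with a small Python sketch. The four sample languages, their commonly cited family and basic word-order labels, and the helper name `classify` are all assumptions chosen for illustration; they are not taken from the text, and areal groupings are not modeled.

```python
from collections import defaultdict

# Illustrative sample only: family and basic word-order labels as commonly cited.
LANGUAGES = {
    "English": {"family": "Indo-European", "word_order": "SVO"},
    "Hindi": {"family": "Indo-European", "word_order": "SOV"},
    "Japanese": {"family": "Japonic", "word_order": "SOV"},
    "Mandarin": {"family": "Sino-Tibetan", "word_order": "SVO"},
}


def classify(languages, criterion):
    """Group language names by the value of one property."""
    groups = defaultdict(set)
    for name, properties in languages.items():
        groups[properties[criterion]].add(name)
    return dict(groups)


genetic = classify(LANGUAGES, "family")          # grouping by descent
typological = classify(LANGUAGES, "word_order")  # grouping by structure

print(genetic)
print(typological)
```

In this sample, English and Hindi fall together genetically but apart typologically, while Hindi and Japanese share a word order without sharing a family.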
