Language sounds may not be arbitrary

28 Dec

When I teach about language development, I have always told my students that the sounds of language are arbitrary. The fact that the same animal can be “dog” in English and “inu” in Japanese shows that the sound systems of different languages are just historical accidents.

Well, maybe not. A study published in The Proceedings of the National Academy of Sciences suggests that there may be certain fundamental similarities between certain words across languages:

It is widely assumed that one of the fundamental properties of spoken language is the arbitrary relation between sound and meaning. Some exceptions in the form of nonarbitrary associations have been documented in linguistics, cognitive science, and anthropology, but these studies only involved small subsets of the 6,000+ languages spoken in the world today. By analyzing word lists covering nearly two-thirds of the world’s languages, we demonstrate that a considerable proportion of 100 basic vocabulary items carry strong associations with specific kinds of human speech sounds, occurring persistently across continents and linguistic lineages (linguistic families or isolates). Prominently among these relations, we find property words (“small” and i, “full” and p or b) and body part terms (“tongue” and l, “nose” and n). The areal and historical distribution of these associations suggests that they often emerge independently rather than being inherited or borrowed. Our results therefore have important implications for the language sciences, given that nonarbitrary associations have been proposed to play a critical role in the emergence of cross-modal mappings, the acquisition of language, and the evolution of our species’ unique communication system.

One possible explanation for these similarities is that they are survivals from Proto-World, the hypothetical first human language. But note that the authors argue against this:

The areal and historical distribution of these associations suggests that they often emerge independently rather than being inherited or borrowed

Sensitive periods in teenagers and young adults?

25 Nov

In developmental psychology, “sensitive period” refers to an age range in which the brain is especially sensitive to specific environmental stimuli. The most famous example of this is the sensitive period for language development during early childhood.

Now, a paper in Psychological Science reports on evidence for a sensitive period during adolescence and early adulthood:

In the current study, we investigated windows for enhanced learning of cognitive skills during adolescence. Six hundred thirty-three participants (11–33 years old) were divided into four age groups, and each participant was randomly allocated to one of three training groups. Each training group completed up to 20 days of online training in numerosity discrimination (i.e., discriminating small from large numbers of objects), relational reasoning (i.e., detecting abstract relationships between groups of items), or face perception (i.e., identifying differences in faces). Training yielded some improvement in performance on the numerosity-discrimination task, but only in older adolescents or adults. In contrast, training in relational reasoning improved performance on that task in all age groups, but training benefits were greater for people in late adolescence and adulthood than for people earlier in adolescence. Training did not increase performance on the face-perception task for any age group. Our findings suggest that for certain cognitive skills, training during late adolescence and adulthood yields greater improvement than training earlier in adolescence, which highlights the relevance of this late developmental stage for education.

A (counter-)revolution in linguistics?

14 Sep

This week I will have to introduce my students to language development. This usually involves describing Chomsky’s theory, the standard in all textbooks. However, a serious challenge to Chomsky’s views has begun to emerge. You can read about it here in this Scientific American piece:

At the time the Chomskyan paradigm was proposed, it was a radical break from the more informal approaches prevalent at the time, and it drew attention to all the cognitive complexities involved in becoming competent at speaking and understanding language. But at the same time that theories such as Chomsky’s allowed us to see new things, they also blinded us to other aspects of language. In linguistics and allied fields, many researchers are becoming ever more dissatisfied with a totally formal language approach such as universal grammar—not to mention the empirical inadequacies of the theory.

I wonder if there will be any renewed interest in Skinner's ideas on this topic?


Language Facts

12 Sep

I was searching for some information for a lecture on language development and stumbled upon this fascinating webpage. A few of the things I learned:

The language with the fewest sounds (phonemes): Rotokas (11 phonemes)

The language with the most sounds (phonemes): !Xóõ (112 phonemes). Approx. 4,200 people speak !Xóõ, the vast majority of whom live in the African country of Botswana.

Here is a video about saving Rotokas:



You can hear !Xóõ spoken here:


The website does make one mistake. It claims:

Language with the fewest words: Taki Taki (also called Sranan), 340 words. Taki Taki is an English-based Creole spoken by 120,000 in the South American country of Suriname.

Actually, Toki Pona has only 120 words.



Ann Patty on learning Latin

15 Jul

A fascinating Lexicon Valley podcast in which linguist John McWhorter interviews Ann Patty about her efforts to learn Latin. Patty documents her learning project in her book Living with a Dead Language: My Romance with Latin.

As I have said many times, learning a language is an ideal exercise for your brain. Don’t waste your time with expensive and probably ineffective brain training software. Learn a language instead.

The Actual Fluency Podcast

6 Jun

It has been suggested to me that my recent posts on Bayesian Analysis might be of limited interest (I am shocked). So today I give you a language learning tip: listen to The Actual Fluency Podcast. Hosted by Kris Broholm, the show is entertaining, helpful, and inspirational. If you are trying to learn a language, I recommend this podcast without reservation.

Memory and lexical apartheid

6 Apr

While the case for memorization may be clear for learning a second language, what is its role in learning English vocabulary? While it is true that we learn much of our vocabulary from context rather than explicit instruction, it may be that many English speakers would benefit from direct instruction in English vocabulary.
This is because English is a diglossic language, in the sense that it contains two vocabularies. In a diglossic language, at least two versions of the language exist, each associated with different positions in the social hierarchy. In some cases, such as English, the language contains two vocabularies that reflect social stratification, with one acting as the language of ordinary people and common interaction and the other vocabulary being the words of prestige and power.
A number of languages are diglossic. For example, Hindi-Urdu, sometimes called Hindustani, is a diglossic language spoken in the Indian subcontinent. The name Hindi-Urdu identifies the two dialects of the same language. Hindi and Urdu share many words and essentially the same grammar. While they have different writing systems, for everyday conversations they are effectively the same, and Urdu and Hindi speakers can communicate without difficulty. However, when one wants to discuss topics outside of ordinary interactions, say education, economics, or science, the languages diverge substantially. That is because their higher vocabularies draw on different sources. The higher vocabulary for Hindi comes from Sanskrit, the ancient liturgical language of Hinduism, while Urdu’s higher vocabulary comes from Persian and Arabic.
Arabic is also a diglossic language, with an everyday dialect and a literary dialect. Research has found that for many Arabic speakers, learning the literary dialect is, in some ways, like learning a foreign language. The Arabic of the schools and books is different from the Arabic of home, and this may contribute to lower levels of academic achievement.
English can also be said to have two vocabularies, both rooted in its historical development. Anglo-Saxon English was established in England by the early Germanic invaders. Latin words were introduced more slowly, beginning with the Roman invasion and continuing as a consequence of the spread of Christianity. A major shift occurred with the Norman conquest of England in 1066. The Normans spoke a dialect of French that became the language of the ruling class. This meant that the British aristocracy spoke a Latinate language while the common people spoke Anglo-Saxon English, a Germanic language.
This division still persists in our vocabulary. There is an English that everyone learns to speak: the English of everyday interactions, whose origins lie in Anglo-Saxon English. There is also an academic English, the English of science, literature, and education. This English is largely Latin and Greek in origin and includes words that were imported into English following the Norman Conquest and, later, during the Renaissance. This difference is illustrated by two great works of English, both written around the same time: the King James Bible and the works of Shakespeare.
The King James Bible was written in Anglo-Saxon English, and while it was originally published in 1611, it is still largely comprehensible to most native English speakers. Indeed, it remains the preferred Bible for many Protestant churches.
Shakespeare, on the other hand, is a Renaissance author, and students often find his writing difficult. Many English words borrowed from Latin and Greek are first recorded in his plays.
Some linguists believe the Renaissance was the biggest period of vocabulary growth in the English language, primarily because of the importation of Graeco-Latinate words.
Educational arrangements in Elizabethan England served to perpetuate class distinctions in language. Schools for the poor and lower classes, when they existed at all, taught only the rudiments of reading and writing in Anglo-Saxon English, while schools for the children of the elite taught Latin and, sometimes, Greek. Some elite schools required students to speak exclusively in Latin. In 19th-century literature, we find a distinction in the use of Latinate words between high- and low-status characters in the novels of Jane Austen.
David Corson, professor at the University of Toronto, claimed that English continues to contain two incompatible vocabularies, one Anglo-Saxon, the other Graeco-Latinate. The Anglo-Saxon words are used for the concrete, while Greek and Latin words are reserved for more abstract discourse. Graeco-Latinate words are used in higher education and specialist vocabularies.
Some English speakers, generally those with better educated parents, learn the Graeco-Latinate lexicon from exposure at home. Those who come from homes where only concrete Anglo-Saxon words are used enter school with a real disadvantage. Corson describes this disadvantage as the “lexical bar” and, even, “lexical apartheid”.
In order to function at the levels required by higher education, one must be able to penetrate the Latinate vocabulary of the academy. Our failure to teach this vocabulary disempowers students and locks them out of the central discourse of our culture. Corson argues that “children’s differences in language ability, more than any other observable factor, affect their potential for success in schooling”. For example, we know that reading comprehension is closely correlated with vocabulary ability. Indeed, the correlation between vocabulary and comprehension is so high that vocabulary tests are good substitutes for comprehension tests. Psychologist Edgar Dale argued that “all education is vocabulary development”.

