How Language Works/Languages cited

From Wikibooks, open books for an open world
Jump to navigation Jump to search

In order to understand what is possible in human language, it is important to look at examples from languages that are quite different from your first language. In this book most of the examples will come from nine languages, selected to some extent because they are important in the modern world (English, Spanish, Chinese, Japanese) but mainly because they illustrate a wide range of possibilities. The word (or sign) for 'language' in each of these nine languages appears on the left side of the banner at the top of the Table of Contents page of this book. In the banners at the top of other pages, such as this one, you will see these examples as well as others from nine other languages (Hindi, Hungarian, Navajo, Persian, Russian, Samoan, Tamil, Thai, and Yoruba). As you see examples from these nine languages later on in the book, some of them may seem quite exotic to you, but keep in mind that your language probably also seems equally exotic to speakers of those languages. Each language is introduced briefly in this section. Some of the terms used to describe the languages will probably not be familiar to you if you have not read the relevant sections in the book.

Mandarin Chinese[edit]

In terms of both number of native speakers (about 875,000,000) and number of first and second language speakers (about 1,050,000,000), Mandarin Chinese is easily the most widely spoken language in the world. The great majority of these speakers live in the People's Republic of China (mainland China), but many others also live in the Republic of China (Taiwan) as well as other countries where Chinese speakers have settled. Mandarin Chinese is the official language of both mainland China and Taiwan. The word for 'Chinese' in Mandarin, transcribed into Roman using the pinyin system, is zhong¯guo’ hua

(the marks after the syllables, usually placed directly over the vowel symbols. indicate tones).

Mandarin Chinese is one of about 10 languages or dialects that are called "Chinese". Calling them "dialects" emphasizes that they are all spoken in China and that they all have a single written standard (closest to Mandarin); Calling them "languages" emphasizes the large differences in pronunciation, vocabulary, and grammar among them; I will follow this usage here. The Chinese languages belong to the Sino-Tibetan language family, which also includes Burmese, Tibetan, and about several hundred other languages spoken in China, Southeast Asia, and South Asia.

Like other Sino-Tibetan languages, Mandarin Chinese is a tone language; it uses pitch to distinguish words. In Mandarin, each syllable has one of four tones: high level, mid rising, low falling-rising, and high falling. Because Mandarin and other Chinese languages have so many one-syllable words and because the number of possible syllables is relatively small, many words are distinguished only by their tone. Mandarin syllable structure is relatively simple. The only consonants syllable can end in are /n/ and /N/, and there are no consonant clusters at the beginnings of syllables. But Chinese pronunciation is still complicated for second-language learners because of the tones and because there are two separate sets of consonants similar to the English postalveolar consonants, that is, the final consonants in the words reach, ridge, and rash.

Mandarin and other Chinese languages have little or no bound morphology. Each word has a fixed form: verbs do not take prefixes or suffixes showing the tense or the person, number, or gender of the subject. Nouns do not take prefixes or suffixes showing their number or their case. This does not mean that Chinese has no grammar of course; what it means is that Chinese grammar is mainly concerned with how words are arranged to form meaningful sentences. For the second language learner, it also means that ther are no grammatical paradigms to memorize as there are for languages such as Spanish, Hindi, and Amharic.

Chinese and its ancestor languages have been written for over 4000 years, giving them the longest continuous writing tradition of any group of languages in the world. The Chinese writing system uses a complex system of characters that is quite unlike any other writing system. Originally all of the characters were simplifications of pictures designed to represent the meanings of words directly. Later, characters were created that have parts representing elements of meaning or sound. Each character in the modern system represents a single meaningful syllable. To read Chinese, you need to know several thousand of these. If you have Chinese characters on your computer, you can see the Universal Declaration of Human Rights in Chinese here.


In terms of number of native speakers (about 340,000,000), English is among the four most widely spoken languages in the world (English, Spanish, and Hindi are very close in number of native speakers). About 510,000,000 people speak it as a first or second language (second in the world). In terms of its role as a language of international communication, science, and business, English is probably the most important language in the world. Most first-language speakers live in the United States, the United Kingdom, Canada, Australia, Ireland, New Zealand, and South Africa, but English is an official language of many other countries, especially in Africa and the Caribbean.

English is a member of the Germanic group (like German, Dutch, Swedish, Norwegian, Danish, and Icelandic) of the Indo-European language family. A good deal is known about the history of the language. The earliest period of the language, up to about 1100, is called Old English]. Old English is so different from Modern English that we find it almost completely uninterpretable; it is best to treat it as a different language from Modern English. A number of dramatic changes in vocabulary and grammar were initiated by the conquest of England by French-speaking Normans in 1066. The period known as Middle English] begins soon after that and continues roughly till the year 1500, as English long vowels were undergoing the changes known as the Great Vowel Shift. Modern English] is what we speak now.

English has less bound morphology, that is, suffixes and prefixes on words, and more rigid word order than most other Indo-European languages. English has a large number of vowels, about twelve, depending on the dialect, and the vowel system is relatively unstable, changing frequently in the history of the language. The complex phonology makes English a relatively difficult language for adult second-language learners to learn to pronounce.

Each of the major countries in which English is spoken has its own (unofficial) standard dialect, though the standard dialects of the United States and Canada are very similar, and the standard dialects of Australia and New Zealand are also very similar. England and the United States also have a number of regional or social dialects which differ considerably from the standards. Examples in the United States include Southern (spoken in most of the Southeast), Northern Cities (spoken in cities such as Chicago, Detroit, and Milwaukee), New York, and African-American Vernacular English (AAVE). American English dialects differ from each other mainly in pronunciation, but the grammar of some varieties of AAVE is quite distinct.


In terms of number of native speakers (about 350,000,000), Spanish is among the four most widely spoken languages in the world. About 420,000,000 people speak it as a first or second language (fourth in the world). Spanish is the official language of 21 countries, including Spain and 20 countries in the Western Hemisphere. Mexico is the country with the most Spanish speakers. Many first-language speakers of American Indian languages speak Spanish as a second language. In Spanish the language is called español.

Spanish is a member of the Romance group (like French, Italian, Portuguese, Catalan, and Romanian) of the Indo-European family. The Romance languages all descended from Latin. They all have more bound morphology, that is, prefixes and suffixes, than English, especially on verbs. Spanish verbs indicate tense as well as the subjects of the verbs with suffixes. Like most other Indo-European languages, Spanish has grammatical gender: nouns are either masculine or feminine, and words that modify them must agree with them in gender. The dominant word order of Spanish is subject-verb-object, but Spanish is much more flexible than English in this regard, and it is not unusual for the verb to appear before the subject.

Spanish has a simpler phonological system than the other Romance languages, including only five vowels. This makes Spanish a relatively easy language for adult second-language learners to learn to pronounce.

The standard dialects in Spain and in the Americas differ in a number of ways, in much the same way as the standard English of England and the United States differs. Within Spain and within the Americas, there are also many regional dialects; some of these, such as Mexican and Argentine Spanish, are quite distinct and are treated as (informal) national standards.

Spanish is written using the Roman alphabet, included accented vowel characters to mark stress and distinguish some words from one another. Here is the Universal Declaration of Human Rights in Spanish.


In terms of numbers of native speakers (about 125,000,000), Japanese is about the eighth most widely spoken language in the world. The great majority of the speakers live in Japan, where almost everyone speaks Japanese. In Japanese the language is called nihongo.

There is disagreement on which languages Japanese is related to. It is only clear that its closest relatives are the Ryukyuan languages spoken in the very south of Japan. Some believe Japanese is related to Korean; some believe it and Korean are related to the Altaic language family (Turkish, Mongolian, etc.).

Japanese has extensive bound morphology on its verbs, almost all of it suffixes, though, unlike in Indo-European languages, the subject is not marked on the verb. Japanese is a verb-final language and shares many syntactic properties with other verb-final languages. Japanese is also known for the many devices for expressing level of formality in the language (both vocabulary and grammar) and for the differences between men's and women's speech.

Japanese has a relatively small set of phonemes, so second-language learners usually find it relatively easy to learn to pronounce, but it makes use of pitch to distinguish words from one another, and this system is often not mastered, or even attempted, by second-language learners.

Alongside Standard Japanese, which all children learn in school, there are many regional dialects, differing very greatly from one another (much more than the dialects of American English, for example).

Japanese is written using a combination of Chinese characters (kanji) and two related syllabic systems (hiragana and katakana). Despite this complexity, the Japanese are among the most literate people in the world.

Here is the Universal Declaration of Human Rights in Japanese.


In terms of numbers of native speakers (about 18,000,000), Amharic is approximately the fiftieth most widely spoken language in the world. Almost all of the native speakers live in Ethiopia, where they make up about 1/3 of the population. Because Amharic is the official language of Ethiopia, taught in all schools, it is also spoken as a second language by many other Ethiopians. In Amharic the language is called amarinya.

Amharic has seven vowels, and its consonants include a set of ejectives, unlike anything in a language such as English. For example, the consonants /t/, /d/, and /t'/ need to be distinguished. This makes it a somewhat difficult language for second-language speakers to pronounce.

Amharic belongs to the Semitic group (like Arabic and Hebrew) within the Afro-Asiatic family of languages. About 50 other Semitic and non-Semitic Afro-Asiatic languages are spoken in Ethiopia alongside Amharic.

Like other Semitic languages, Amharic has a very elaborate verb morphology. An Amharic verb root consists of a set of (usually three) consonants. Depending on the tense, and other grammatical features, the consonants may be separated by particular vowels and possibly geminated (doubled). A verb form normally also has one or more suffixes and possibly one or more prefixes as well, agreeing with the subject and sometimes the direct or indirect object of the verb. Complicating things further (at least for the adult second-language learner), there are at least ten different classes of verbs, each modifying its stem in a different way for the different forms. Like Japanese, Hindi, and many other languages, Amharic is a verb-final language. Amharic nouns are relatively simple by comparison, though they may take suffixes indicating possession ('my', 'his', etc.), plural, and a few other grammatical functions.

Unlike most African languages, Amharic has been a written language for many years, at least 500. It is written using a syllabic writing system that is unique to Ethiopian Semitic languages. Compared to other African languages, Amharic has a fairly sizable written literature. Here is the Universal Declaration of Human Rights in Amharic.


Lingala is the first language of about 300,000 people in the two neighboring countries called Congo in central Africa, but is a second language for many more, at least 7,000,000 people, making it one of the most important languages in central Africa. In Lingala, the name of the language is lingála (the accent mark indicates a high tone on that syllable).

Lingala belongs to the Bantu group of languages, like most of the languages of central and southern Africa. Other important Bantu languages include Swahili, Zulu, and Shona. The Bantu languages belong in turn to the Niger-Kordofanian language family, which includes most of the more than 2000 languages spoken in Africa. Lingala is a very recent language, an example of a contact language, a language which emerges out of the contact between people speaking different languages. Lingala began along the Congo River, first as a trade language, later, when Congo was colonized by Belgium, as a way for the colonists and their intermediaries to communicate with the local people. As usually happens in such situations, the resulting language borrowed words and grammatical features from a number of existing languages, in this case Bantu languages of the region. At first it was not the first language of anybody, but with the growth of cities and marriage between speakers of different first languages, it soon became a language learned by children as a first language. In recent years, the language has spread rapidly, partly because of its role in government, partly because it is the usual language in which the enormously popular Congolese music is sung.

Lingala has relatively few phonemes and simple CV syllable structure. Like most other Bantu languages, it is a tone language, using pitch both to distinguish words from each other and to mark grammatical categories.

When a contact language emerges, the grammar (at least the morphology) of the source languages is normally simplified. This is true for Lingala too, though it preserves many of the features of Bantu morphology. Like other Bantu languages, Lingala divides nouns into about 14 classes, similar to the genders of European languages, each with its own characteristic prefix. Verbs have prefixes, suffixes, and particular tone patterns, to indicate the subject and the tense, and to derive a range of other forms from each verb. The basic word order is subject-verb-object.

Lingala is written in the Roman alphabet, but there is not yet an extensive literature. Most educated Congolese still use French for writing. Here

is the Universal Declaration of Human Rights in Lingala.


Tzeltal is spoken by about 215,000 people in Chiapas state in Mexico. Many of the people are bilingual in Spanish. Unlike many other American Indian languages, Tzeltal seems to have a chance of surviving, but this will depend on Mexican language policies, which in the past have not favored languages other than Spanish.

Tzeltal is a member of the Mayan family of languages, currently spoken by several million people in southern Mexico, Guatemala, and Belize. These people are the descendants of the great Maya civilization that flourished in this region 1000 years ago.

Like Amharic, Tzeltal phonology includes a set of ejective consonants. Tzeltal syllables are relatively simple, except that clusters like /hp/ can occur at the beginnings and ends of words.

Tzeltal is known for its elaborate system of classifiers, morphemes which must follow numerals when objects or actions are counted. The choice of classifier usually depends on the shape or structure of the counted objects. One researcher has counted more than 500 classifiers in one Tzeltal dialect. Tzeltal is also known for the way it expresses spatial relations ('on', 'under', 'left of', etc.). It makes use of body part terms ('head', 'back', 'rump', etc.) and a set of spatial orientation verbs for close relations and absolute frame of reference (roughly 'east', 'west', etc.) for more distant relations. The most common word order is verb-object-subject.

The dialects of Tzeltal differ considerably from one another, and there is no commonly accepted standard dialect.

Ancient Mayan was written using a hieroglyphic system, but Tzeltal had no writing system till very recently. It is written today using the Roman alphabet. There are few materials written in it, however; Spanish is the usual written language for literate Tzeltales. Here is the Universal Declaration of Human Rights in Tzeltal.


Inuktitut is spoken by about 14,000 people in northeastern Canada, mainly in the newly formed territory of Nunavut, for which it is one of the official languages. Inuktitut is one of about eight languages in the Eskimo group of languages within the Eskimo-Aleut family. Eskimo languages are spoken in a large, sparsely populated region extending from the far eastern end of Siberia in Russia, across Alaska and northern Canada to Greenland. Unlike most other indigenous languages of North America, Inuktitut seems to have a good chance of surviving.

Inuktitut has a relatively simple set of phonemes. For a given place of articulation, for example, velar, the language makes no distinction between voiceless (English /k/) and voiced (English /g/) stops. However, Inuktitut has a place of articulation not used at all in English but also used in Tzeltal and other Mayan languages: uvular consonants are produced with the tongue making contact or approaching the top of the mouth even further back than for velar consonants.

Inuktitut and other Eskimo languages have some of the most elaborate morphology of any languages in the world. A single Inuktitut word often corresponds to an entire sentence in English. Here is an example of a word consisting of eight morphemes: Pariliarumaniralauqsimanngittunga 'I never said I wanted to go to Paris'. While Inuktitut does not distinguish gender anywhere in its grammar, it distinguishes three numbers, singular, dual, and plural. Each noun can have scores of forms because it must be marked for number and for case. Most verbs can have hundreds of forms because they must be marked for their subject and their direct object, if there is one, as well as for their tense. In some cases, verbs have a special set of suffixes when a question is being asked. In addition to the three persons that other languages of the world distinguish, Inuktitut has a "fourth person" for cases when two third person clauses within the same sentence have a different subject.

Inuktitut is usually written using a modification of a script that was originally designed by missionaries for the Cree language. In this writing system, each simple consonant-vowel syllable is represented by a single symbol. Here is the Universal Declaration of Human Rights in Inuktitut written in the syllabic script.

American Sign Language[edit]

American Sign Language (ASL) is the primary language of between 100,000 and 500,000 people out of the approximately 2,000,000 deaf people in the United States and Canada, but many other deaf people, and some hearing people, have some competence in the language.

ASL is one of hundreds of deaf sign languages in the world, none of them related to spoken languages, but some recent research has shown that sign languages may share some properties with one another even when they are not genetically related. British Sign Language is a separate language, not mutually intelligible with ASL. ASL developed in the schools for the deaf in the US begun in the early part of the 19th century. Because the teaching in these schools was influenced by early French methods of educating the deaf, many ASL signs derive originally from French Sign Language. However, ASL and French Sign Language are quite different languages today.

ASL is to be distinguished from Signed English, which uses English word order, replacing the English words with signs. The grammar of ASL is very different from that of English. Like Tzeltal and Japanese, but not English or Spanish, ASL makes use of classifiers, morphemes that are added to signs to agree with the shape or other physical property of some object. For example, expressions showing location ('be at') take a handshape that conveys something about the thing that is in the location (that is, whether it is a vehicle, an animal, etc.). As in other sign languages, there is a strong tendency for signs in ASL to be iconic, that is, to be motivated by their meanings rather than completely arbitrary. This iconicity extends into the grammar. For example, in ASL participants in a discourse are usually assigned places in the signing space and then pointed to later on when they are referred to, that is, where pronouns would appear in a spoken language.

There is currently no agreed on way of writing ASL, but ASL does have its own literature, including poetry.