Linguistics/Orthography

From Wikibooks, open books for an open world

Introduction

Quick: take a look around yourself and see how many examples of writing are within view.

While languages have been in use for tens of millennia, writing has had a comparatively short lifespan.

An individual unit of writing is known as a glyph. By analogy with the phoneme and its allophones in phonology, we can speak of graphemes and allographs in orthography. A grapheme is a fundamental unit of written language, and an allograph is any acceptable instantiation of the grapheme in writing. It is standard linguistic practice to enclose graphemic representations in angle brackets <>.

Directionality

The most common directions for written language are horizontal rows of left-to-right (ltr) or right-to-left (rtl) text. Examples of languages written left-to-right include English, Hindi, and Thai. Examples of right-to-left writing include Arabic, Hebrew, and N'ko.
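As a rough illustration of how software can tell these two horizontal directions apart, the following sketch uses the bidirectional class that Unicode assigns to each character. The function name dominant_direction and the "first strong character" heuristic are choices made for this example, not part of the text above; real text layout follows the much more detailed Unicode Bidirectional Algorithm.

```python
import unicodedata

def dominant_direction(text):
    """Guess 'ltr' or 'rtl' from the first character with a strong direction.

    Assumption for this sketch: the first strongly directional character
    decides the direction of the whole string.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (AL = Arabic letter)
            return "rtl"
    return "neutral"                     # digits, punctuation, empty string

# One sample word per language mentioned in the paragraph above.
samples = {
    "English": "writing",
    "Hindi": "लेखन",
    "Thai": "การเขียน",
    "Arabic": "كتابة",
    "Hebrew": "כתיבה",
}

for language, word in samples.items():
    print(language, dominant_direction(word))
```

Strings with no strongly directional character (for example, digits alone) are reported as "neutral" in this sketch.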

Vertical writing systems are also found, usually running from top to bottom (though bottom-to-top is attested). The vertical columns may be ordered either left-to-right or right-to-left. The traditional Mongolian script (largely replaced by Cyrillic in Mongolia, though still used in Inner Mongolia) is a vertical script whose columns are ordered left-to-right, while Japanese may be written either horizontally from left to right or in vertical columns ordered from right to left. Bottom-to-top writing includes some instances of the Berber script Tifinagh.

In rare cases writing systems may use combinations of these directions. The term boustrophedon (meaning 'ox-turning' in Greek) generally refers to horizontal scripts in which successive lines alternate in direction; the characters themselves may also be mirrored on the reversed lines.
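The alternation of line direction can be sketched in a few lines of code. This is only a toy model: the function name boustrophedon is invented for the example, the first line is assumed to run left-to-right, and the mirroring of individual characters is not modeled.

```python
def boustrophedon(lines):
    """Return the lines with every second line reversed.

    Assumptions for this sketch: the first line runs left-to-right,
    and glyphs are not mirrored on the reversed lines.
    """
    result = []
    for index, line in enumerate(lines):
        if index % 2 == 1:
            line = line[::-1]   # odd-numbered lines run the other way
        result.append(line)
    return result

for row in boustrophedon(["ABC", "DEF", "GHI"]):
    print(row)
```

A reader following the output would read the first row rightwards, turn at the end of the line like an ox ploughing a field, and read the second row leftwards.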

Types of spelling systems

Phonemic, morpho-phonemic, defective, complex

Alphabets

chart of consonants in the Abkhaz Uslar alphabet
Alphabets treat consonants and vowels equally.

Given that you are reading this book, you are probably already familiar with at least one alphabet. The concept underlying alphabets is essentially that they represent the sounds of a language rather than the meaning, and that consonants and vowels are treated in the same way, i.e. the letters <a,e,i,o,u> are not different in any fundamental way from the letters <b,c,d,f,g...>.

One might say that idealized alphabets assign one letter per segment, whether consonant or vowel. In reality many alphabets deviate from this somewhat – for instance, in English we use the digraph (two-letter combination) <sh> to represent the single sound /ʃ/.
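To make the idea of a digraph concrete, here is a minimal sketch that splits a spelled word into graphemic units, treating a small set of two-letter combinations as single units. The DIGRAPHS set is a hypothetical, deliberately incomplete inventory chosen for this example, and the greedy left-to-right matching is an assumption of the sketch rather than a claim about how English spelling is actually analysed.

```python
# Hypothetical, incomplete inventory of English digraphs for illustration.
DIGRAPHS = {"sh", "ch", "th", "ng"}

def segment(word):
    """Split a word into graphemic units, preferring two-letter digraphs."""
    units = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            units.append(pair)      # two letters, one unit
            i += 2
        else:
            units.append(word[i])   # single letter
            i += 1
    return units

print(segment("ship"))     # ['sh', 'i', 'p']
print(segment("fishing"))  # ['f', 'i', 'sh', 'i', 'ng']
print(segment("mishap"))   # ['m', 'i', 'sh', 'a', 'p']
```

The last call shows the limit of a purely mechanical approach: in "mishap" the letters s and h belong to different morphemes and represent two separate sounds, yet the greedy matcher still groups them as a digraph.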

Abjads

The first five letters of the Phoenician abjad
Abjads mostly only represent consonants.

An abjad (also known as a consonantary or a consonantal alphabet) is a writing system where words are generally written only with characters for consonants. The Modern Hebrew writing system employs an abjad; for example, the Modern Hebrew word /zanav/ 'tail' is written <זנב> (right-to-left), which would correspond to <znv> in Latin letters. Characters for certain vowels may also be used, and a system of vowel marking (vocalization) may be available but is not usually employed.
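The /zanav/ example can be mimicked on a romanized transcription by simply dropping the vowel letters. This is a simplified sketch: the function name consonant_skeleton and the five-letter VOWELS set are assumptions of the example, and it ignores the vowel characters that real abjads sometimes do write.

```python
# Assumption for this sketch: the transcription uses only these vowel letters.
VOWELS = set("aeiou")

def consonant_skeleton(transcription):
    """Keep only the consonant letters of a romanized word."""
    return "".join(ch for ch in transcription if ch not in VOWELS)

print(consonant_skeleton("zanav"))  # znv
```

Because the vowels are discarded, any other transcription with the same consonants in the same order would produce the same skeleton, which is why readers of an abjad rely on context and knowledge of the language to recover the vowels.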

Abjads are a less common form of writing system; however, most modern alphabets, as well as many other writing systems, have descended from the Proto-Canaanite script, itself a consonantal script, used in the Late Bronze Age. The most prominent abjads still in use today are the Arabic and Hebrew writing systems.

Abugidas

Brāhmī "ka" and its alternate forms for other vowels
Abugidas use glyphs for consonants which are modified depending on the following vowel.

An abugida (or alphasyllabary) is similar to an abjad in which vocalization is obligatory – consonants are the basic graphemic units, but they are modified (usually with diacritics) depending on what vowel comes after them. Vowels without preceding consonants may be written either with separate independent graphemes, or by using the usual vowel diacritics on a null consonant glyph. Examples of languages which use abugidas include Hindi, Thai, and Amharic.

In many abugidas each consonant carries a "default" (inherent) vowel, which is assumed to follow it unless another vowel is marked. To suppress the inherent vowel, the consonant may be marked with a special symbol (known as a virama, from Sanskrit) or otherwise modified. For example, in Devanagari, a script used to write a number of Indic languages including Hindi and Marathi, any consonantal grapheme in isolation is presumed to be followed by the vowel /a/. If a consonant is followed by another consonant without an intervening vowel, the cluster is typically written as a ligature (a glyph composed of multiple graphemes combined together).
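The interaction of inherent vowel, vowel diacritics, and virama can be sketched with a handful of Devanagari characters as encoded in Unicode. The tables below cover only four consonants and four vowel signs, and the romanization scheme ("aa", "ii" for long vowels) is an arbitrary choice for this example. The sketch also ignores independent vowel letters and the fact that in modern Hindi a word-final inherent vowel is often not pronounced.

```python
VIRAMA = "\u094d"  # the sign that suppresses the inherent vowel

# Small illustrative subsets, not the full script.
CONSONANTS = {"\u0915": "k", "\u0917": "g", "\u0924": "t", "\u092e": "m"}
VOWEL_SIGNS = {"\u093e": "aa", "\u093f": "i", "\u0940": "ii", "\u0941": "u"}

def romanize(text):
    """Romanize a Devanagari string built from the tables above."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            following = text[i + 1] if i + 1 < len(text) else ""
            if following == VIRAMA:
                i += 2                          # inherent vowel suppressed
            elif following in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[following])
                i += 2                          # explicit vowel diacritic
            else:
                out.append("a")                 # inherent vowel
                i += 1
        else:
            out.append(ch)                      # pass anything else through
            i += 1
    return "".join(out)

print(romanize("\u0915"))              # ka
print(romanize("\u0915\u093f"))        # ki
print(romanize("\u0915\u094d"))        # k
print(romanize("\u0915\u094d\u0924"))  # kta
```

In the last example the encoded sequence is consonant, virama, consonant; fonts usually display such a cluster as a single ligature, as described above.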

Syllabaries

No Parking traffic sign in Cherokee syllabary and English.
Syllabaries assign glyphs to syllables.

A syllabary assigns one grapheme to each syllable. Languages that use syllabic writing include Japanese (in one script), Cherokee, and Yi. Languages with syllabaries tend to have simpler phonotactics, since more possible syllables translate to a larger inventory of graphemes.

Syllabaries are distinct from abugidas in that similar syllables (e.g. ga and gi) are not necessarily related to each other systematically.

Logographies

Pendant bearing the Sumerian logogram EN, meaning "lord" or "master"
Logographies use glyphs to represent morphemes.

A logography (or logographic script) is a writing system in which graphemes generally represent words or morphemes rather than sounds. Individual characters in a logographic script are known as logograms.

Logographic scripts are not necessarily ideographic or pictographic. For instance, Chinese characters are not always ideographic, since they may sometimes be used purely for their phonetic content, and they usually have opaque derivations rather than being transparently pictographic.

The term hieroglyphs or hieroglyphics may be used to refer to logograms, but it is more often used to refer specifically to Ancient Egyptian.

Isolated logograms may be used in non-logographic scripts. For instance, the numerals and mathematical symbols used in English writing are logograms: 1 one, 2 two, + plus, = equals, and so on. In English, the ampersand & is used for and and et (such as &c for et cetera), % for percent, $ for dollar, # for number, € for euro, £ for pound, etc. Note that logograms such as 1 are rarely used for phonetic value – one would not write something like "The team 1 the soccer game."[1] Also note that 1 may have a different phonetic value depending on context: cf. 1 "one" and 1st "first".

Mixed scripts

Some languages use multiple types of script in writing. Such a form of writing may be known as a mixed script. Examples of mixed scripts include Egyptian hieroglyphs and Japanese writing.

Unwritten language and new orthographies

In the past many non-native linguists created defective orthographies for previously unwritten languages, failing to mark important features such as tone and vowel length which they could not distinguish themselves, and sometimes marking unimportant allophonic detail.

Many contemporary new orthographies are modeled after the IPA. For instance, many languages in parts of Africa have the 7-vowel system /a ɛ e i ɔ o u/, and their recently introduced orthographies commonly make use of the graphemes <ɛ, ɔ>.

Workbook section

Exercise 1: Nēhinawēwin

Swampy Cree (Nēhinawēwin) is a dialect of Cree, an Algonquian language spoken in Manitoba and Ontario. It is one of a number of Canadian languages written in one of the related orthographies known collectively as Canadian Aboriginal syllabics (or just syllabics). The following is a Swampy Cree inscription from Winnipeg, written in syllabics. Using the transcription of the text in the Latin alphabet, given below, determine what rules govern characters in Cree syllabics, and what type of script you think it should be considered.

inscription in Swampy Cree

Êwako oma asiniwi mênikan kiminawak
ininiwak manitopa kaayacik. Êwakwanik oki
kanocihtacik asiniwiatoskiininiw kakiminihcik
omêniw. Akwani mitahtomitanaw askiy asay
êatoskêcik ota manitopa.

Notes

  1. Exceptions to this can be seen in forms of SMS and IM shorthand, but practical limitations on character counts or manual exertion create extra-linguistic motivating factors.