All of the sounds of Na’vi occur in human languages. However, there are some peculiarities in their combination. Na’vi lacks voiced stops like [b d ɡ] even though it has the voiced fricatives [v z]; more prominent than such intentional gaps though are its ejective stops [pʼ tʼ kʼ], spelled px tx kx, which are novel to most English speakers. Na’vi also has the syllabic consonants ll and rr in addition to its seven simple vowels. Although the sounds were designed to be pronounceable by the human actors of the film, there are unusual consonant clusters which can be difficult, as in fngap [fŋap] "metal" and tskxe [tskʼɛ] "rock".
The fictional Na’vi language of Pandora is unwritten. However, the actual (constructed) language is written in the Latin alphabet. The movie scripts were written in a slightly anglicized orthography for the actors of Avatar, with ng, ts for Frommer's preferred g, c.
Typical Na’vi words include zìsìt "year", fpeio "ceremonial challenge", nì’awve "first", muiä "be fair", tireaioang "spirit animal", kllpxìltu "territory", uniltìrantokx "avatar".
Altogether, Na’vi has thirteen vowel-like sounds. These include seven simple vowels:
|high||i [i]||u [u]
as well as four diphthongs: aw [au̯], ew [ɛu̯], ay [ai̯], ey [ɛi̯], and two syllabic consonants: ll [l̩] and rr [r̩], which mostly behave as vowels.[note 1] The u varies between [u] and [ʊ]; it's the former in open syllables such as tute 'person' and unil 'dream'; it may be either in closed syllables such as tsun 'be able to' and tsmuk 'sibling'.
Na’vi vowels may occur in sequences, as in the Polynesian languages, Bantu, and Japanese.[note 2] Each vowel counts as a syllable, so that ’eoioa "ceremonious" has five syllables, /ˈʔɛ.o.i.o.a/. The syllabic consonants may also occur in sequence with a simple vowel or diphthong, as in hrrap /ˈhr̩.ap/ "dangerous".
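Since every simple vowel, diphthong, and syllabic consonant forms a syllable of its own, a rough syllable count can be sketched in Python by counting nuclei. The function name and the regular expression below are illustrative assumptions, not part of any Na’vi reference. The sketch assumes the actors' orthography and treats every written ll or rr as syllabic.

```python
import re
import unicodedata

# One alternative per possible syllable nucleus: the four diphthongs,
# the two syllabic consonants, then the seven simple vowels.
# Assumption: every written "ll" or "rr" is a syllabic consonant.
NUCLEUS = re.compile(r"aw|ew|ay|ey|ll|rr|[aäeiìou]")


def count_syllables(word):
    """Count syllables by counting nuclei (a sketch, not a full syllabifier)."""
    # Normalize so that ä and ì are single precomposed characters.
    normalized = unicodedata.normalize("NFC", word.lower())
    return len(NUCLEUS.findall(normalized))


print(count_syllables("’eoioa"))  # 5
print(count_syllables("hrrap"))   # 2
```

The count stays correct even where y or w is really the onset of the next syllable (as in a-yo-e), because the number of nuclei is the same either way. Only the placement of the syllable boundary differs.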
Comparison with the vowels of English 
Most of the vowels occur in English. The ä e ì i ey ay aw are pronounced as General American and RP bat, bet, bit, marine, obey, kayak, and cow. The u varies between put and flute. The a, o, and ew sounds do not occur in these dialects. A is the central vowel of Australian, Scottish, and Welsh father, or of New York lock, and like a French or Spanish a. For RP and GA speakers, it's closest to the a of father; speakers in southern England and eastern New England who do not rhyme father with bother have the Na’vi a in father. O is the pure vowel of Scottish and Irish English no or Australian and South African English bought, like a Spanish o or, even closer, French eau and Italian come.[note 3] The ew is equivalent to the eu in Spanish Europa and the el in Brazilian mel "honey". An English approximation is "oh!" in exaggerations of the Queen's English by American comedians such as Carol Burnett. The syllabic consonants behave as vowels, as in plltxe [pl̩.tʼɛ] "to speak" and prrte’ [pr̩.tɛʔ] "pleasure". The rr is strongly trilled, like Spanish rr, but forming a syllable of its own, like an imitation of a cat's purr. The ll is similar to the syllabic le of bottle, but is "light", as in leap or as in Irish English, not "dark" as GA and RP syllabic l is.[note 4]
Which English word you associate with which vowel will depend on your dialect. For example, if you're Canadian, Na’vi e will be like the vowel in bet. However, if you're a New Zealander, it will be closer to your pronunciation of bat. If you're from London, the u varies between the vowels of flute and put. However, if you're Australian, flute will not be a good approximation, and it may be best to stick with put.
|i||[i]||marine (in all major English dialects)|
|u||[u] or [ʊ]||flute or put||put||put||put||flute or put||—||flute or put|
|ew||[ɛw]||— (like eew!, but starting with an [ɛ] sound)|
|ll||[l̩]||— (syllabic as in bottle, but ‘light’ as in leap or as in Irish English)|
|rr||[r̩]||— (syllabic as in US church, but trilled as in Welsh English)|
A tilde (~) indicates that the word is only an approximation of the Na’vi pronunciation. A dash (—) indicates that there is no good approximation in this dialect. A question mark (?) indicates that available sources did not supply a good approximation, but one might exist.
Na’vi does not have vowel length or tone, but it does have contrastive stress: tute [ˈtutɛ] "person", tute [tuˈtɛ] "female person", or täftxuyu [tæ.ˈftʼu.ju] "weaver", täftxuyu [tæ.ftʼu.ˈju] "weaves (formal)", like the difference between English billow and below. Although stress may move with derivation, as here, it is not affected by inflection (case on nouns, tense on verbs, etc.). So, for example, the verb lu "to be" has stress on its only vowel, the u, and no matter what else happens to it, the stress stays on that vowel: lamu [laˈmu] "was", lamängu [lamæˈŋu] "was (negative speaker attitude)", etc. Although case affects the pronouns that are based on oe "I", most affixes do not affect the stress of other nouns or pronouns. For example, from nga "you", there is nìaynga [nɪ.ai̯.ˈŋa] "like you all"; from lì’u [ˈlɪ.ʔu] "word" there is aylì’ufa [ai̯.ˈlɪ.ʔu.fa] "with the words".
There are twenty consonants. There are two Latin transcriptions: one that more closely approaches the ideal of one letter per phoneme, with the letters c and g for [ts] and [ŋ] (the values they have in much of Eastern Europe and Polynesia, respectively), and a modified transcription used for the actors, with the digraphs ts and ng used for those sounds. In both transcriptions, the ejective consonants are written with digraphs in x, a convention that may be unique to Na’vi, though Nambikwara uses tx, kx for similar if not identical sounds.
|Ejective||px [pʼ]||tx [tʼ]||kx [kʼ]|
|Plosive||p [p]||t [t]||k [k]||’ [ʔ]|
|Affricate||ts (c) [ts]|
|Nasal||m [m]||n [n]||ng (g) [ŋ]|
|y [j]||w [w]|
The combination of ejective plosives and voiced fricatives, but no voiced or aspirated plosives, is unusual in human language, but does occur in the Kamchatkan language Itelmen.
In syllable-final and word-final position, p, t, k have no audible release, [p̚ t̚ k̚], as in Malay, Cantonese, and other languages of Southeast Asia. Thus a t followed by an s in the next syllable is not equivalent to ts, and so remains ts rather than c in Frommer's preferred orthography: fìzìsìtsre [fɪ.ˈzɪ.sɪt̚.sɾɛ] (not *fìzìsìcre [fɪ.ˈzɪ.sɪ.t͡sɾɛ]) "before this year".[note 6]
Comparison with the consonants of English 
The plosives p t k and the affricate ts are tenuis, as in Spanish or French. Most English dialects have aspirated consonants in words like pie, tie, kite, which if imitated would result in a strong foreign accent. Na’vi p, t, k are instead like the sounds in English spy, sty, sky.[note 7]
Stops without audible release, such as Na’vi final p, t, k, occur in English in words such as aptly, at least, actor. However, some English dialects also have such sounds in word-final position, as Na’vi does, especially in casual speech.[note 8]
The glottal stop, written with an apostrophe, is the catch in the middle of the word uh-oh!; some people also use it for the apostrophe in Hawai’i. Cockney English is well known for using a glottal stop for t in words like bottle. This is the effect that the name Na’vi should have: two syllables, not three. What makes the glottal stop difficult is that it may begin words: ’eveng is "a child", eveng "children". In languages which have this distinction, such as Arabic, a glottal stop in initial position is much more forceful than it is in uh-oh, and may sound like a tiny cough.
The r is flapped, as in much of Irish and Scottish English, as well as in Malay and in Spanish pero "but". It sounds a bit like the tt or dd in the American pronunciation of the words latter, ladder.
Na’vi ng and ts (g and c) are common in English in words such as cats and sing (not finger!). However, in Na’vi they may occur at the beginning of a word, as in tsa "that" and nga "you".[note 9]
Syllable structure 
Na’vi syllables may be as simple as a single vowel, or as complex as skxawng "moron" or fngap "metal", both double-consonant–vowel–consonant (CCVC).
The fricatives and the affricate, f v ts s z h, are restricted to the onset of a syllable; the other consonants may occur at either the beginning or at the end.[note 11] However, in addition to appearing before vowels, f ts s may form consonant clusters with any of the unrestricted consonants (the stops, nasals, and liquids/glides) apart from ’, making for 39 possible clusters at the beginning of a syllable, as in ayskxawng /ai̯.ˈskʼau̯ŋ/ "morons" or lefngap /lɛ.ˈfŋap/ "metallic". Other sequences occur across syllable boundaries, such as na’vi /ˈnaʔ.vi/ "person", ikran /ˈik.ɾan/ "banshee", and atxkxe /atʼ.ˈkʼɛ/ "land".[note 12]
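The figure of 39 clusters can be checked with a short Python sketch. It assumes that the consonants able to follow f, ts, s are the thirteen unrestricted consonants other than the glottal stop (the ejectives, plain plosives, nasals, liquids, and glides), written in the actors' orthography. The list and function names are illustrative only.

```python
# The three consonants that may open a cluster.
CLUSTER_INITIALS = ["f", "ts", "s"]

# Assumed second members: every consonant that is not restricted to the
# onset, minus the glottal stop.
CLUSTER_SECONDS = [
    "px", "tx", "kx",   # ejectives
    "p", "t", "k",      # plain plosives
    "m", "n", "ng",     # nasals
    "r", "l",           # liquids
    "w", "y",           # glides
]


def onset_clusters():
    """Return every two-consonant onset allowed under the assumptions above."""
    return [first + second
            for first in CLUSTER_INITIALS
            for second in CLUSTER_SECONDS]


clusters = onset_clusters()
print(len(clusters))       # 39
print("fng" in clusters)   # True, as in fngap
print("skx" in clusters)   # True, as in skxawng
```

Three initials times thirteen second consonants gives the 39 clusters mentioned in the text.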
When a consonant that could form either an onset or a coda appears between vowels, it is normally the onset of the following syllable. Atokirina’, for example, is a-to-ki-ri-na’. However, there are exceptions: mimetic kxangangang "boom!" (crack of thunder) is kxang-ang-ang, as the second and third syllables are echoes of the first. In careful enunciation, syllable divisions sometimes follow the morphology of a word. For example, ayoe "we" is formed from the plural prefix ay- and the pronoun oe "I", and in careful speech it may be syllabified ay-o-e [ai̯ˈoɛ]. However, in rapid speech the default consonant-vowel pattern takes over and it is pronounced a-yo-e [aˈjoɛ], and in most words the default CV.CV pattern takes over even in careful speech. Verbal VC infixes are apparently always divided between syllables, as V.C, for example in so-li and sä-pi, from si "do". There are a few roots with a distinction between a diphthong followed by a vowel (VC.V) and a simple vowel followed by y or w plus the vowel (V.CV); for instance, tswayon "fly" contains the diphthong ay, tsway-on, whereas layon "black" and irayo "thank you" do not: la-yon, i-ra-yo. The distinction is perhaps not very robust, but it is noted in the dictionary.
Not all vowels are created equal. Whereas the seven simple vowels and four diphthongs occur in any type of syllable, the syllabic consonants only occur in consonant-vowel syllables, as in vrrtep (vrr-tep) "demon". Nouns ending in a diphthong or a syllabic consonant also take the case endings used after consonants, not those used after the simple vowels. In addition, two identical simple vowels may not occur in a row. That is, *me-e-vi and *a-a-pxa are not found; they reduce to mevi and apxa.
Sound change 
The most notable form of sound change in Na’vi is a kind called lenition. This is a weakening that the plosive consonants undergo after certain prefixes and prepositions, as in Irish. In this environment, the ejective plosives px tx kx become the corresponding plain plosives p t k; the plain plosives and affricate p t ts k become the corresponding fricatives f s h; and the glottal stop ’ disappears entirely. This is basically equivalent to dropping down a row in the consonant chart above.
Because of lenition, the singular and plural forms of nouns can appear rather different. For example, the plural form of po "s/he" is ayfo "they", with the p weakening into an f after the plural prefix ay-, and after the preposition ro "at", tsa "that" takes the form sa. Lenition is also salient in interrogative words, as they each come in two forms based on the interrogative element pe: tupe, pesu "who?", kempe, pehem "do what?", krrpe, pehrr "when?", tsengpe, peseng "where?".[note 13]
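The lenition rule can be expressed as a small lookup table. The Python sketch below is an illustration under stated assumptions: it works on lowercase words in the actors' orthography, lenites only the first consonant of the word, and keeps a glottal stop that stands before ll or rr (the exception described in note 13). The names LENITION and lenite are hypothetical.

```python
# Initial consonant -> lenited form, following the rule in the text.
# Both the typographic and the ASCII apostrophe stand for the glottal stop.
LENITION = {
    "px": "p", "tx": "t", "kx": "k",            # ejectives -> plain plosives
    "ts": "s", "p": "f", "t": "s", "k": "h",    # affricate and plosives -> fricatives
    "’": "", "'": "",                           # glottal stop disappears
}


def lenite(word):
    """Lenite the first consonant of a word, if it is one that lenites."""
    # Try digraphs before single letters so that "ts" is not read as "t".
    for initial in sorted(LENITION, key=len, reverse=True):
        if word.startswith(initial):
            rest = word[len(initial):]
            # Exception: the glottal stop stays before ll or rr, because a
            # syllabic consonant cannot begin a syllable.
            if initial in ("’", "'") and rest.lower().startswith(("ll", "rr")):
                return word
            return LENITION[initial] + rest
    return word


print("ay" + lenite("po"))   # ayfo
print(lenite("tsa"))         # sa
print(lenite("tskxe"))       # skxe
```

Because only the first consonant is replaced, a cluster such as tskx lenites once (to skx) and is not reduced any further.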
The nasal consonants m, n, ng tend to assimilate to a following stop, so that tìng mikyun "to listen" (lit. "give an ear") is generally pronounced as if it were tìm mikyun, tìng nari "to look" (lit. "give an eye") as if it were tìn nari, zenke "mustn't" as zengke, and lunpe "why?" as lumpe.
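The assimilation pattern can be sketched as a place-of-articulation lookup in Python. This is a simplified illustration, not a description of actual speech: the text says only that nasals tend to assimilate, whereas the sketch applies the change every time. It also assumes, based on the examples tìm mikyun and tìn nari, that a following nasal triggers the change just as a following plosive does. The names PLACE and assimilate_nasals are hypothetical.

```python
import re

# Following consonant -> nasal with the same place of articulation.
PLACE = {
    "p": "m", "px": "m", "m": "m",              # labial
    "t": "n", "tx": "n", "ts": "n", "n": "n",   # alveolar
    "k": "ng", "kx": "ng", "ng": "ng",          # velar
}

# A nasal, followed (optionally across a space) by a plosive, affricate,
# or nasal. The lookahead leaves the following consonant untouched.
_NASAL_BEFORE_STOP = re.compile(
    r"(ng|m|n)(?=\s?(px|tx|kx|ts|ng|p|t|k|m|n))"
)


def assimilate_nasals(text):
    """Replace each nasal with the one matching the next consonant's place."""
    return _NASAL_BEFORE_STOP.sub(lambda m: PLACE[m.group(2)], text)


print(assimilate_nasals("tìng mikyun"))  # tìm mikyun
print(assimilate_nasals("zenke"))        # zengke
print(assimilate_nasals("lunpe"))        # lumpe
```

A nasal that stands before a vowel, as in nga or fngap, is left unchanged because the lookahead finds no following consonant.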
Vowel sequences consist of dissimilar vowels only. Na’vi does not have vowel length, and this means that derived sequences of similar vowels contract into one. For example, when feminine -e is added to túte "person", the result contracts to tuté "female person", with the only difference being stress placement. Similarly, the dual number me- of eveng "children" contracts to meveng "two children". On the other hand, when two i's come together in the approbative inflection of si "to do" in ngaru irayo s‹ei›i oe "I thank you :)", a y is inserted to separate them: Ngáru iráyo seiyí oe. Double consonants may occur at syllable boundaries; however, while the plural (ay-) of yerik "hexapede" is transcribed ayyerik for ease of reading, in pronunciation it is little different from *ayerik.
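The contraction of identical vowels at an affix boundary can be sketched as follows in Python. The sketch covers only the simple case where the two identical vowels merge into one. It does not model stress, and it does not model the y that is inserted in seiyi. The function name attach is hypothetical.

```python
# The seven simple vowels in the actors' orthography.
SIMPLE_VOWELS = set("aäeiìou")


def attach(first, second):
    """Join two word parts, merging identical simple vowels at the boundary."""
    if first and second and first[-1] == second[0] and second[0] in SIMPLE_VOWELS:
        # Two identical vowels would meet: keep only one of them.
        return first + second[1:]
    return first + second


print(attach("me", "eveng"))   # meveng
print(attach("tute", "e"))     # tute (the difference is stress only)
print(attach("ay", "yerik"))   # ayyerik (double consonants are kept in writing)
```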
With the informal pronoun oe "I" and its derivatives, the o reduces to a /w/ sound whenever the stress shifts to the e: Óel /ˈo.ɛl/ "I",[note 14] but oéru /ˈwɛɾu/ "to me" and ayoéng /ai̯ˈwɛŋ/ "all of us".[note 15]
There are other instances of sound change to avoid sequences that don't occur in Na’vi, though the details are not known. For example, the syllabic consonants cannot follow their non-syllabic homologs: though /lr̩/ occurs in lrrtok "a smile", *lll and *rrr are not found. Thus the perfective infix ‹ol› affects the root of plltxe "to say, to speak": p‹ol›lltxe becomes poltxe "spoke".
The vowels of short grammatical words are sometimes elided before a lexical word or phrase that begins with a vowel, at least in song, for instance sì "and" in 's-ayzìsìtä kato' "and the rhythm of the years" and lu "to be" in 'a l-ayngakip' "who is among you"; the same may happen to unstressed vowels of grammatical prefixes, as the ì of nì-’aw "only" in 'han’aw txo' "so (ha) only (nì’aw) if (txo)". These examples fit the meter of a song, but similar things occur in fluent speech, for example 'rä’si!' for rä’ä si! "don't do it!" and 'nayweng' for nìayoeng "like us".
Spoken samples 
There are three online recordings of Frommer speaking extended amounts of Na’vi, which give a good indication of its pronunciation. They can be found in the texts. After reading this Wikibook, you should be able to understand all three.
- It seems that no Terran language has quite these vowels. However, Czech has six of the simple vowel qualities (apart from /æ/), the diphthongs /au̯/ and /eu̯/ (plus /ou̯/), and the syllabic consonants /l̩/ and /r̩/, though the latter two allow for following consonant codas, as in vlk "wolf" and krk "neck", which are not possible in Na’vi.
- For example, Swahili eua "to purify", Japanese aoi "blue/green", Hawaiian aeāea (sp. small green fish) or—with a glottal stop—uauo‘oa "distant voices".
- Note that the e is open-mid while the o is close-mid, and that there is no *oy.
- In the film, syllabic ll is generally pronounced darkly by the actors. That makes it difficult to distinguish ll from u or ul.
- For other countries, such as Jamaica, India, and Malaysia, either the details of English pronunciation were not available to the author, or there was too much variability to make normative statements.
- There do not appear to be any words that are distinguished by this rather subtle contrast, so it will make no effective difference if you do not master it.
- Hold a lit candle or lighter below your lips when you pronounce these words. The flame should flicker or even blow out when you say pie, tie, or kite, but not when you say spy, sty, sky. When speaking Na’vi, the flame should not flicker for pay, tay, kay any more than it does for spay, stay, skay, or for that matter for vay, may, nay.
- These sounds are easy to pronounce. When you say ap, at, or ak, you will cut off the air flowing through your mouth with your lips or tongue. In Na’vi, you simply keep your lips or tongue in that position and turn it into a glottal stop before letting the air flow again.
- This was one of the most difficult aspects of the pronunciation for the actors of Avatar. For tsa, try repeating "cats are" over and over, then drop the "ca" to extract the "tsar." For nga, try repeating "sung all", then drop the "su" to extract the "ngall".
- The gist of the sounds is this: They are pronounced with air pressure from the throat rather than from the lungs. While the tongue or lips seal the mouth so that no air can escape, the Adam's apple is pushed upward, so that when the tongue or lips are released, the air escapes with a pop. Ejective px is more difficult for most people to pronounce than tx or kx.
- Though w y in syllable-final position are considered parts of a diphthong, as they only occur as ay ey aw ew and may be followed by another final consonant, as in skxawng "moron".
- This differs from most European languages, which would syllabify ikran as "i-kran", with a released [ k ], whereas in Na’vi it is ik-ran and the k is unreleased [ k̚ ].
- An exception is glottal stop when it is required before rr or ll, as in ro ’Rrta "on Earth", where glottal stop would normally drop after ro, but can't here because rr cannot begin a syllable. In the case of consonant clusters, it is only the first consonant that undergoes lenition. For instance, the plural of tskxe "stone" is skxe, not *ske, and in the case of tsko "bow", double lenition (*sho) would not be possible, as */sh/ is not a permitted consonant cluster.
- Though in the common greeting oel ngati kameie, the shift occurs in the oel form (now /ˈwɛl/) as well.
- This shift from /o/ to /w/ is blocked in the case of trial inclusive and dual and trial exclusive, because the resulting consonant clusters *mw *pxw would violate Na’vi phonotactics. So "for the two of us[INCL]" is oengaru /wɛ.ˈŋa.ɾu/ with three syllables, but "for the three of us" is pxoengaru /pʼo.ɛ.ˈŋa.ɾu/ with four.
- Boucher, Geoff (November 20, 2009). "USC professor creates an entire alien language for 'Avatar'". Los Angeles Times. http://latimesblogs.latimes.com/herocomplex/2009/11/usc-professor-creates-alien-language-for-avatar.html. Retrieved January 9, 2010.