Unicode/Versions

From Wikibooks, open books for an open world

This page describes each version of the Unicode Standard and the differences between successive versions.
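To find out which of these versions a given environment implements, a runtime can usually report the version of the character database it ships with. The following is a minimal sketch in Python, assuming the standard `unicodedata` module; the helper name `supported_unicode_version` is hypothetical and introduced only for illustration.

```python
import unicodedata

def supported_unicode_version():
    """Return the Unicode version of this interpreter's character database
    as a tuple of integers, for example (major, minor, update)."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))

version = supported_unicode_version()
print("Unicode version:", ".".join(map(str, version)))

# The euro sign was added in Unicode 2.1, so a current runtime knows its name.
print(unicodedata.name("\u20ac"))  # EURO SIGN
```

Comparing the returned tuple against a version listed on this page gives a rough indication of whether characters introduced in that version are known to the runtime.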

Unicode 1.0

Unicode 1.0 was the first version of Unicode, released October 1991. It encoded 7,161 characters.

“Blocks”

This version of Unicode did not formally group characters into blocks. Compared against the block layout of version 2.0, however, the following “blocks” were present:

  • Basic Latin, containing 128 characters.
  • Latin-1 Supplement, containing 128 characters.
  • Latin Extended-A, containing 127 characters.
  • Latin Extended-B, containing 113 characters.
  • IPA Extensions, containing 89 characters.
  • Spacing Modifier Letters, containing 57 characters.
  • Combining Diacritical Marks, containing 66 characters.
  • Greek, containing 112 characters.
  • Cyrillic, containing 192 characters.
  • Armenian, containing 84 characters.
  • Hebrew, containing 52 characters.
  • Arabic, containing 169 characters.
  • Devanagari, containing 104 characters.
  • Bengali, containing 89 characters.
  • Gurmukhi, containing 74 characters.
  • Gujarati, containing 75 characters.
  • Oriya, containing 78 characters.
  • Tamil, containing 61 characters.
  • Telugu, containing 80 characters.
  • Kannada, containing 80 characters.
  • Malayalam, containing 78 characters.
  • Thai, containing 92 characters.
  • Lao, containing 70 characters.
  • Tibetan, containing 71 characters.
  • Georgian, containing 78 characters.
  • General Punctuation, containing 67 characters.
  • Superscripts and Subscripts, containing 28 characters.
  • Currency Symbols, containing 11 characters.
  • Combining Marks for Symbols, containing 18 characters.
  • Letterlike Symbols, containing 57 characters.
  • Number Forms, containing 48 characters.
  • Arrows, containing 91 characters.
  • Mathematical Operators, containing 242 characters.
  • Miscellaneous Technical, containing 45 characters.
  • Control Pictures, containing 37 characters.
  • Optical Character Recognition, containing 11 characters.
  • Enclosed Alphanumerics, containing 139 characters.
  • Box Drawing, containing 128 characters.
  • Block Elements, containing 22 characters.
  • Geometric Shapes, containing 79 characters.
  • Miscellaneous Symbols, containing 106 characters.
  • Dingbats, containing 160 characters.
  • CJK Symbols and Punctuation, containing 56 characters.
  • Hiragana, containing 90 characters.
  • Katakana, containing 90 characters.
  • Bopomofo, containing 40 characters.
  • Hangul Compatibility Jamo, containing 94 characters.
  • Kanbun, containing 16 characters.
  • Enclosed CJK Letters and Months, containing 191 characters.
  • CJK Compatibility, containing 187 characters.
  • Hangul Syllables, containing 2,350 characters.
  • Private Use, reserved for 5,632 characters.
  • CJK Compatibility Forms, containing 28 characters.
  • Small Form Variants, containing 26 characters.
  • Arabic Presentation Forms-B, containing 140 characters.
  • Halfwidth and Fullwidth Forms, containing 216 characters.
  • Specials, containing 1 character.
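The per-block counts above can be checked against the stated total mechanically. The sketch below parses bullet lines of the form used in this list and sums the encoded characters, skipping ranges that are only reserved (Private Use). The names `ENTRY`, `parse_blocks` and `encoded_total` are hypothetical, and only a few sample entries are included to keep the example short.

```python
import re

# A few entries copied from the list above; the full list can be pasted the same way.
SAMPLE = """\
• Basic Latin, containing 128 characters.
• Latin-1 Supplement, containing 128 characters.
• Hangul Syllables, containing 2,350 characters.
• Private Use, reserved for 5,632 characters.
• Specials, containing 1 character.
"""

ENTRY = re.compile(
    r"•\s*(?P<name>.+?), (?P<kind>containing|reserved for) "
    r"(?P<count>[\d,]+) characters?\."
)

def parse_blocks(text):
    """Yield (block name, character count, is_reserved) for each bullet line."""
    for match in ENTRY.finditer(text):
        count = int(match["count"].replace(",", ""))
        yield match["name"], count, match["kind"] == "reserved for"

def encoded_total(text):
    """Sum the counts of encoded characters, skipping reserved ranges."""
    return sum(count for _, count, reserved in parse_blocks(text) if not reserved)

print(encoded_total(SAMPLE))  # 128 + 128 + 2,350 + 1 = 2607
```

Applied to the complete list, with the Private Use range excluded, the counts should add up to the 7,161 characters quoted for Unicode 1.0.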

Unicode 1.1

Unicode 1.1 was released June 1993. It encoded 34,233 characters and finalized the long-anticipated Han unification.

New blocks

  • Hangul Jamo, containing 240 jamos for the Hangul script, was added.
  • Latin Extended Additional, containing 245 precomposed characters for transliteration and Vietnamese, was added.
  • Greek Extended, containing 233 precomposed characters for polytonic Greek, was added.
  • CJK Unified Ideographs, containing 20,902 Han ideographs for Chinese, Japanese and Korean, was added.
  • CJK Compatibility Ideographs, containing 302 Han ideographs for compatibility with existing character sets, was added.
  • Alphabetic Presentation Forms, containing 57 precomposed characters and ligatures, was added.
  • Arabic Presentation Forms-A, containing 593 combinations of Arabic letters, was added.
  • Combining Half Marks, containing 4 halves of diacritical marks, was added.

Removed blocks

  • Tibetan, containing 71 letters for the Tibetan script, was removed from the Unicode standard.

Extended blocks

  • The long S (ſ) (total 1 character) was added to Latin Extended-A.
  • The Hungarian Dz, characters for transliteration purposes and precomposed characters with double grave and inverted breve (total 35 characters) were added to Latin Extended-B.
  • Diacritics for polytonic Greek and double-width diacritics (total 6 characters) were added to Combining Diacritical Marks.
  • Compatibility characters now deprecated (total 5 characters) were added to Greek and Coptic.
  • Additional characters for non-Slavic languages (total 38 characters) were added to Cyrillic.
  • A ligature of Ech and Yiwn (և) (total 1 character) was added to Armenian.
  • One deprecated compatibility character and several characters for biblical texts (total 25 characters) were added to Arabic.
  • The virama (o੍) (total 1 character) was added to Gurmukhi.
  • The candra O and candra E vowels (total 3 characters) were added to Gujarati.
  • The Ai length mark (oୖ) (total 1 character) was added to Oriya.
  • An undertie, a pair of brackets and six formatting characters (total 9 characters) were added to General Punctuation.
  • Some additional symbols and the complete set of APL functional symbols (total 78 characters) were added to Miscellaneous Technical.
  • A large circle (◯) (total 1 character) was added to Geometric Shapes.
  • The ideographic telegraph line feed separator symbol (〷) (total 1 character) was added to CJK Symbols and Punctuation.
  • Four Katakana letters not in use since 1945 (total 4 characters) were added to Katakana.
  • Ideographic telegraph symbols for the twelve months (total 12 characters) were added to Enclosed CJK Letters and Months.
  • Ideographic telegraph symbols for hours and days and six additional measure units (total 62 characters) were added to CJK Compatibility.
  • Additional code points (total 2,304 characters) were added to the Private Use Area.
  • Seven halfwidth geometric shapes (total 7 characters) were added to Halfwidth and Fullwidth Forms.

Unicode 2.0

Unicode 2.0 was released July 1996. It encoded 38,950 characters, and was the first Unicode version to reserve blocks outside of the Basic Multilingual Plane.

New blocks

  • Hangul Syllables, containing 11,172 precomposed syllables for the Hangul script, was added.
  • Supplementary Private Use Area-A and Supplementary Private Use Area-B, reserving a total of 131,068 code points for private use, were added.
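Both figures in this list can be derived arithmetically. The sketch below assumes the usual composition of a precomposed Hangul syllable from 19 leading consonants, 21 vowels and 28 trailing options (27 consonants plus "none"), a block range of U+AC00 to U+D7A3, and that each supplementary private use plane excludes its last two code points, which are noncharacters. None of these constants is stated on this page; they are assumptions made for the illustration.

```python
import unicodedata

# Assumed composition of precomposed Hangul syllables (not stated in the text above).
LEADING, VOWELS, TRAILING = 19, 21, 28
hangul_syllables = LEADING * VOWELS * TRAILING

# The block is assumed to span U+AC00..U+D7A3 inclusive.
FIRST, LAST = 0xAC00, 0xD7A3
assert LAST - FIRST + 1 == hangul_syllables

# Two supplementary private use planes of 65,536 code points each,
# minus the two noncharacters at the end of each plane.
private_use = 2 * (0x10000 - 2)

print(hangul_syllables)               # 11172
print(private_use)                    # 131068
print(unicodedata.name(chr(FIRST)))   # HANGUL SYLLABLE GA
```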

Re-added blocks

  • Tibetan, containing 168 characters for the Tibetan script including religious signs, was re-added.

Extended blocks

  • The long S with dot above (ẛ) (total 1 character) was added to Latin Extended Additional.
  • The Vietnamese dong (₫) (total 1 character) was added to Currency Symbols.
  • Cantillation marks for use in religious texts (total 31 characters) were added to Hebrew.

Unicode 2.1

Unicode 2.1 was released May 1998. It encoded 38,952 characters, only 2 more than the previous version.

Extended blocks

  • The euro sign (€) (total 1 character) was added to Currency Symbols.
  • The object replacement character (U+FFFC) (total 1 character) was added to Specials.
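The growth from one version to the next can be computed directly from the totals quoted on this page. The following sketch uses those totals as given here (they are not independently verified), and the function name `growth` is hypothetical.

```python
# Character totals as quoted on this page, in release order.
TOTALS = [
    ("1.0", 7_161), ("1.1", 34_233), ("2.0", 38_950), ("2.1", 38_952),
    ("3.0", 49_259), ("3.1", 94_205), ("3.2", 95_221), ("4.0", 96_447),
    ("4.1", 97_720), ("5.0", 99_089), ("5.1", 100_713), ("5.2", 107_361),
    ("6.0", 109_449),
]

def growth(totals):
    """Return {version: characters gained over the previous version}."""
    return {
        version: count - previous
        for (_, previous), (version, count) in zip(totals, totals[1:])
    }

deltas = growth(TOTALS)
print(deltas["2.1"])  # 2
```

For Unicode 2.1 the difference of 2 matches the two single-character additions listed above.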

Unicode 3.0

Unicode 3.0 was released September 1999. It was a major update, encoding 49,259 characters.

New blocks

  • Syriac, containing 71 characters used for writing in Syriac script, was added.
  • Thaana, containing 49 characters used for writing in Thaana script, was added.
  • Sinhala, containing 80 characters for the Sinhala script, was added.
  • Myanmar, containing 78 characters for the Burmese script, was added.
  • Ethiopic, containing 345 syllables and punctuation marks for the Ethiopic script, was added.
  • Cherokee, containing 85 syllables for the Cherokee script, was added.
  • Unified Canadian Aboriginal Syllabics, containing 630 syllables and punctuation marks for writing in aboriginal languages of Canada, was added.
  • Ogham, containing 29 characters for the ancient Ogham script, was added.
  • Runic, containing 81 characters for the Germanic runes, was added.
  • Khmer, containing 103 characters for the Khmer script, was added.
  • Mongolian, containing 155 characters for the classical Mongolian script, was added.
  • Braille Patterns, containing 256 Braille letters, was added.
  • CJK Radicals Supplement, containing 115 non-Kangxi radicals, was added.
  • Kangxi Radicals, containing 214 radicals from the Kangxi dictionary, was added.
  • Ideographic Description Characters, containing 12 characters used to describe Han ideographs that are not available in a font, was added.
  • Bopomofo Extended, containing 24 characters used for phonetic transcription of minority languages of Taiwan, was added.
  • CJK Unified Ideographs Extension A, containing 6,582 additional Han Ideographs, was added.
  • Yi Syllables, containing 1,165 syllables of the modern Yi script, was added.
  • Yi Radicals, containing 50 radicals of Yi Syllables, was added.
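The figure of 256 for Braille Patterns in the list above corresponds to every combination of eight dots (2^8). The sketch below assumes that the block starts at U+2800 and that dot n maps to bit n-1 of the offset; both are assumptions not stated on this page, and the helper `braille_cell` is hypothetical.

```python
import unicodedata

BRAILLE_BASE = 0x2800  # assumed start of the Braille Patterns block

def braille_cell(*dots):
    """Return the Braille pattern character with the given raised dots (1-8)."""
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 8:
            raise ValueError(f"dot number out of range: {dot}")
        mask |= 1 << (dot - 1)
    return chr(BRAILLE_BASE + mask)

print(2 ** 8)                                   # 256 possible patterns
print(unicodedata.name(braille_cell(1, 2, 5)))  # BRAILLE PATTERN DOTS-125
```

Calling `braille_cell()` with no arguments yields the blank pattern at the start of the block.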

Extended blocks

  • Additional precomposed characters, letters, and capital forms of previously lowercase-only letters (total 30 characters) were added to Latin Extended-B.
  • Extensions for disordered speech (total 5 characters) were added to IPA Extensions.
  • Some additional modifier letters (total 6 characters) were added to Spacing Modifier Letters.
  • Additional diacritics for IPA notation (total 10 characters) were added to Combining Diacritical Marks.
  • Lowercase versions of archaic letters and the Kai symbol (total 5 characters) were added to Greek and Coptic.
  • Nonstandard letters for Macedonian, combining numeral signs and three letters for Kildin Sami (total 12 characters) were added to Cyrillic.
  • The hyphen (֊) (total 1 character) was added to Armenian.
  • Combining hamza and maddah and nine additional Arabic characters (total 12 characters) were added to Arabic.
  • Additional letters and religious symbols (total 25 characters) were added to Tibetan.
  • A narrow no-break space and 6 additional punctuation marks (total 7 characters) were added to General Punctuation.
  • The Kip, Tugrik and Drachma signs (total 3 characters) were added to Currency Symbols.
  • An enclosing screen and an enclosing key (total 2 characters) were added to Combining Diacritical Marks for Symbols.
  • The information symbol and a rotated Q (total 2 characters) were added to Letterlike Symbols.
  • A reversed Roman numeral one hundred (Ↄ) (total 1 character) was added to Number Forms.
  • Some additional arrows (total 9 characters) were added to Arrows.
  • Some additional technical symbols, including common keys on a 101-key keyboard (total 32 characters), were added to Miscellaneous Technical.
  • Two additional control pictures (total 2 characters) were added to Control Pictures.
  • Squares and circles with quadrants (total 8 characters) were added to Geometric Shapes.
  • Two Syriac crosses and a signature mark (total 3 characters) were added to Miscellaneous Symbols.
  • Three Hangzhou numerals and a variation indicator (total 4 characters) were added to CJK Symbols and Punctuation.
  • An additional Hebrew ligature (יִ) (total 1 character) was added to Alphabetic Presentation Forms.
  • Three additional control characters for ruby markup (total 3 characters) were added to Specials.

Unicode 3.1

Unicode 3.1 was released March 2001. It encoded 94,205 characters and mainly focused on blocks outside of the Basic Multilingual Plane.

New blocks

  • Old Italic, containing 35 letters for the Etruscan script, was added.
  • Gothic, containing 27 letters for the Gothic script, was added.
  • Deseret, containing 76 letters for the artificial Deseret script, was added.
  • Byzantine Musical Symbols, containing 246 symbols for Byzantine musical notation, was added.
  • Musical Symbols, containing 219 characters for current musical notation, was added.
  • Mathematical Alphanumeric Symbols, containing 991 Latin and Greek letters in serif, sans-serif, bold, italic, double-struck, script and Fraktur, was added.
  • CJK Unified Ideographs Extension B, containing 42,711 additional Han Ideographs, was added.
  • CJK Compatibility Ideographs Supplement, containing 542 additional ideographs for compatibility purposes, was added.
  • Tags, containing 97 language tags, was added.
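All of the new blocks in this list lie outside the Basic Multilingual Plane. A code point's plane is its value divided by 65,536, which the sketch below computes with a bit shift. The starting code points are assumptions recalled from the Unicode block listing, not values given on this page, and the names `BLOCK_STARTS` and `plane` are hypothetical.

```python
# Starting code points of the blocks listed above. These values are assumptions
# recalled from the Unicode block listing, not taken from this page.
BLOCK_STARTS = {
    "Old Italic": 0x10300,
    "Gothic": 0x10330,
    "Deseret": 0x10400,
    "Byzantine Musical Symbols": 0x1D000,
    "Musical Symbols": 0x1D100,
    "Mathematical Alphanumeric Symbols": 0x1D400,
    "CJK Unified Ideographs Extension B": 0x20000,
    "CJK Compatibility Ideographs Supplement": 0x2F800,
    "Tags": 0xE0000,
}

def plane(code_point):
    """Return the plane number (0 = Basic Multilingual Plane) of a code point."""
    return code_point >> 16

for name, start in BLOCK_STARTS.items():
    print(f"U+{start:05X}  plane {plane(start):2d}  {name}")

# Every block in the table should fall outside plane 0.
assert all(plane(start) > 0 for start in BLOCK_STARTS.values())
```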

Extended blocks

  • The capital Theta symbol and the Lunate Epsilon symbol (total 2 characters) were added to Greek and Coptic.

Unicode 3.2

Unicode 3.2 was released March 2002. It encoded 95,221 characters.

New blocks

  • Cyrillic Supplement, containing 16 characters used for the Komi language, was added.
  • Tagalog, containing 20 characters for the Baybayin script, was added.
  • Hanunoo, containing 23 characters and punctuation for the Hanunoo script, was added.
  • Buhid, containing 20 characters for the Buhid script, was added.
  • Tagbanwa, containing 18 characters for the Tagbanwa script, was added.
  • Miscellaneous Mathematical Symbols-A, containing 28 symbols used in math notation, was added.
  • Supplemental Arrows-A, containing 16 additional arrows, was added.
  • Supplemental Arrows-B, containing 128 special arrows, was added.
  • Miscellaneous Mathematical Symbols-B, containing 128 additional mathematical symbols, was added.
  • Supplemental Mathematical Operators, containing 256 additional mathematical operators, was added.
  • Katakana Phonetic Extensions, containing 16 Katakana letters used for Ainu, was added.
  • Variation Selectors, containing 16 symbols used for indicating variations, was added.

Extended blocks

  • The capital letter N with long right leg (Ƞ) (total 1 character) was added to Latin Extended-B.
  • The combining grapheme joiner and combining Latin letters used in medieval texts (total 14 characters) were added to Combining Diacritical Marks.
  • The Qoppa and a reversed lunate epsilon symbol (total 3 characters) were added to Greek and Coptic.
  • Four additional letters used for the Kildin Sami language (total 8 characters) were added to Cyrillic.
  • A dotless Beh and a dotless Qaf (total 2 characters) were added to Arabic.
  • The letter Naa (ޱ) (total 1 character) was added to Thaana.
  • The letters Yn and Elifi (total 2 characters) were added to Georgian.
  • Some additional punctuation marks and control characters (total 12 characters) were added to General Punctuation.
  • A superscript i (ⁱ) (total 1 character) was added to Superscripts and Subscripts.
  • The old penny sign and the peso sign (total 2 characters) were added to Currency Symbols.
  • Some additional combining characters (total 7 characters) were added to Combining Diacritical Marks for Symbols.
  • Some double-struck and reversed/turned letters (total 15 characters) were added to Letterlike Symbols.
  • Some additional arrows (total 12 characters) were added to Arrows.
  • Some additional mathematical operators (total 14 characters) were added to Mathematical Operators.
  • Variable-width and additional symbols (total 53 characters) were added to Miscellaneous Technical.
  • Black and double circled numerals (total 20 characters) were added to Enclosed Alphanumerics.
  • Quadrant elements (total 10 characters) were added to Block Elements.
  • Some additional triangles and squares (total 8 characters) were added to Geometric Shapes.
  • Shogi pieces, recycling symbols, dice and dotted circles (total 24 characters) were added to Miscellaneous Symbols.
  • Additional parentheses (total 14 characters) were added to Dingbats.
  • Three additional marks (total 3 characters) were added to CJK Symbols and Punctuation.
  • A digraph and two additional characters (total 3 characters) were added to Hiragana.
  • A digraph and a double hyphen (total 2 characters) were added to Katakana.
  • Additional circled numerals (total 30 characters) were added to Enclosed CJK Letters and Months.
  • Five missing radicals (total 5 characters) were added to Yi Radicals.
  • Additional compatibility characters (total 59 characters) were added to CJK Compatibility Ideographs.
  • The rial sign (﷼) (total 1 character) was added to Arabic Presentation Forms-A.
  • Two sesame dots (total 2 characters) were added to CJK Compatibility Forms.
  • A tail fragment (ﹳ) (total 1 character) was added to Arabic Presentation Forms-B.
  • A pair of double parentheses (total 2 characters) was added to Halfwidth and Fullwidth Forms.

Unicode 4.0

Unicode 4.0 was released April 2003. It encoded 96,447 characters.

New blocks

  • Limbu, containing 66 characters for the Limbu abugida, was added.
  • Tai Le, containing 35 letters for the Tai Le script, was added.
  • Khmer Symbols, containing 32 symbols for the lunar calendar, was added.
  • Phonetic Extensions, containing 108 letters used in phonetic transcription, was added.
  • Miscellaneous Symbols and Arrows, containing 14 additional arrows, was added.
  • Yijing Hexagram Symbols, containing 64 hexagrams, was added.
  • Linear B Syllabary, containing 88 syllables of the ancient Linear B script, was added.
  • Linear B Ideograms, containing 123 ideograms of the ancient Linear B script, was added.
  • Aegean Numbers, containing 57 numerals used in the Aegean area, was added.
  • Ugaritic, containing 31 characters used in Ugaritic cuneiform, was added.
  • Shavian, containing 48 letters used for the artificial Shavian script, was added.
  • Osmanya, containing 40 characters used in the artificial Osmanya script, was added.
  • Cypriot Syllabary, containing 55 characters formerly used on Cyprus, was added.
  • Tai Xuan Jing Symbols, containing 87 symbols of Tai Xuan Jing, was added.
  • Variation Selectors Supplement, containing 240 additional variation selectors, was added.
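Together with the 16 selectors in the Variation Selectors block from Unicode 3.2, the 240 selectors of the supplement give 256 variation selectors in total. The sketch below maps a selector number to its character, assuming the ranges U+FE00 to U+FE0F and U+E0100 to U+E01EF, which are not stated on this page; the helper `variation_selector` is hypothetical.

```python
import unicodedata

def variation_selector(n):
    """Return the character for variation selector number n (1-256)."""
    if 1 <= n <= 16:
        # Variation Selectors block (assumed range U+FE00..U+FE0F)
        return chr(0xFE00 + n - 1)
    if 17 <= n <= 256:
        # Variation Selectors Supplement (assumed range U+E0100..U+E01EF)
        return chr(0xE0100 + n - 17)
    raise ValueError(f"selector number out of range: {n}")

print(16 + 240)                                   # 256
print(unicodedata.name(variation_selector(17)))   # VARIATION SELECTOR-17
```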

Extended blocks

  • Letters with curl used in Sinology (total 4 characters) were added to Latin Extended-B.
  • Former IPA letters (total 2 characters) were added to IPA Extensions.
  • Some additional characters (total 17 characters) were added to Spacing Modifier Letters.
  • Additional combining double-width diacritics and diacritics corresponding to their spacing equivalent (total 11 characters) were added to Combining Diacritical Marks.
  • The archaic letters Sho and San and the capital Lunate Sigma (total 5 characters) were added to Greek and Coptic.
  • Some additional markers, biblical signs, and letters with inverted V (total 19 characters) were added to Arabic.
  • Letters used for foreign words from Persian and Sogdian (total 6 characters) were added to Syriac.
  • The short A (ऄ) (total 1 character) was added to Devanagari.
  • The Avagraha sign (ঽ) (total 1 character) was added to Bengali.
  • The Adak Bindi and Visarga signs (total 2 characters) were added to Gurmukhi.
  • The vocalic l and ll and the Rupee sign (total 5 characters) were added to Gujarati.
  • The letters Va and Wa (total 2 characters) were added to Oriya.
  • Additional signs for dates and finance (total 8 characters) were added to Tamil.
  • The Nukta and Avagraha signs (total 2 characters) were added to Kannada.
  • Some symbols and signs (total 11 characters) were added to Khmer.
  • An inverted undertie and a swung dash (total 2 characters) were added to General Punctuation.
  • The facsimile sign (℻) (total 1 character) was added to Letterlike Symbols.
  • The eject symbol and a vertical line (total 2 characters) were added to Miscellaneous Technical.
  • A black circled digit zero (⓿) (total 1 character) was added to Enclosed Alphanumerics.
  • Monograms and diagrams, flags, warning and weather symbols and a cup of tea (total 12 characters) were added to Miscellaneous Symbols.
  • Additional parenthesized and circled Korean characters and supplemental signs (total 9 characters) were added to Enclosed CJK Letters and Months.
  • Additional measure units (total 7 characters) were added to CJK Compatibility.
  • An additional Arabic sign (﷽) (total 1 character) was added to Arabic Presentation Forms-A.
  • A pair of vertical parentheses (total 2 characters) was added to CJK Compatibility Forms.
  • The letters Oi and Ew (total 4 characters) were added to Deseret.
  • A mathematical small script l (𝓁) (total 1 character) was added to Mathematical Alphanumeric Symbols.

Unicode 4.1

Unicode 4.1 was released March 2005. It encoded 97,720 characters.

New blocks

  • Arabic Supplement, containing 30 characters for various languages written with the Arabic script, was added.
  • Ethiopic Supplement, containing 26 characters and signs for Sebatbeit, was added.
  • New Tai Lue, containing 80 characters for the New Tai Lue script, was added.
  • Buginese, containing 30 characters for the Lontara script, was added.
  • Phonetic Extensions Supplement, containing 64 additional letters for phonetic transcription, was added.
  • Combining Diacritical Marks Supplement, containing 4 additional diacritics, was added.
  • Glagolitic, containing 94 characters for the Glagolitic script, was added.
  • Coptic, containing 114 characters for the Coptic script, was added.
  • Georgian Supplement, containing 38 Nuskhuri letters, was added.
  • Tifinagh, containing 55 characters for the Tifinagh script, was added.
  • Ethiopic Extended, containing 79 additional Ethiopic syllables, was added.
  • Supplemental Punctuation, containing 26 additional punctuation marks, was added.
  • CJK Strokes, containing 16 strokes for Han Ideographs, was added.
  • Modifier Tone Letters, containing 23 letters for Chinese tones, was added.
  • Syloti Nagri, containing 44 characters for the Syloti Nagri abugida, was added.
  • Vertical Forms, containing 10 punctuation marks suited for vertical text, was added.
  • Ancient Greek Numbers, containing 75 numerals and signs used in Ancient Greek, was added.
  • Old Persian, containing 50 characters for Old Persian cuneiform, was added.
  • Kharoshthi, containing 65 characters for the Kharoshthi abugida, was added.
  • Ancient Greek Musical Notation, containing 70 musical signs used in Ancient Greek, was added.

Extended blocks

  • Letters for Sencoten, digraphs, letters with swash tail and other additions (total 11 characters) were added to Latin Extended-B.
  • Additional diacritics for transliteration (total 5 characters) were added to Combining Diacritical Marks.
  • Rho with stroke, reversed and dotted Lunate Sigma (total 4 characters) were added to Greek and Coptic.
  • Capital and small Ghe with descender (Ӷ, ӷ) (total 2 characters) were added to Cyrillic.
  • An additional biblical mark and some punctuation marks (total 4 characters) were added to Hebrew.
  • Additional biblical marks, punctuation marks and the Afghani sign (total 8 characters) were added to Arabic.
  • A glottal stop (ॽ) (total 1 character) was added to Devanagari.
  • The Khanda Ta letter (ৎ) (total 1 character) was added to Bengali.
  • The letter Sha and the digit zero (total 2 characters) were added to Tamil.
  • Two marks used in Bhutan (total 2 characters) were added to Tibetan.
  • Two letters and a modifier letter (total 3 characters) were added to Georgian.
  • Some additional syllables (total 11 characters) were added to Ethiopic.
  • Additional phonetic symbols (total 20 characters) were added to Phonetic Extensions.
  • A flower and dot punctuation marks (total 9 characters) were added to General Punctuation.
  • Additional subscript letters (total 5 characters) were added to Superscripts and Subscripts.
  • The Guarani, Austral, Hryvnia and Cedi signs (total 4 characters) were added to Currency Symbols.
  • A combining long double solidus (⃫) (total 1 character) was added to Combining Diacritical Marks for Symbols.
  • The per sign and a double-struck letter Pi (total 2 characters) were added to Letterlike Symbols.
  • Metrical and electrical signs (total 11 characters) were added to Miscellaneous Technical.
  • Additional gender and map symbols (total 30 characters) were added to Miscellaneous Symbols.
  • Some additional mathematical symbols (total 7 characters) were added to Miscellaneous Mathematical Symbols-A.
  • Additional arrows and squares (total 6 characters) were added to Miscellaneous Symbols and Arrows.
  • A circled Hangul character (㉾) (total 1 character) was added to Enclosed CJK Letters and Months.
  • Additional Han Ideographs (total 22 characters) were added to CJK Unified Ideographs.
  • Additional Compatibility Ideographs (total 106 characters) were added to CJK Compatibility Ideographs.
  • Italic dotless small i and j (total 2 characters) were added to Mathematical Alphanumeric Symbols.

Unicode 5.0

Unicode 5.0 was released July 2006. It encoded 99,089 characters.

New blocks

  • N'Ko, containing 59 characters for the N'Ko script, was added.
  • Balinese, containing 121 characters and musical signs for the Balinese abugida, was added.
  • Latin Extended-C, containing 17 letters for various languages, was added.
  • Latin Extended-D, containing 2 characters for the Uralic Phonetic Alphabet (UPA), was added.
  • Phags-pa, containing 56 characters for the Phags-pa script, was added.
  • Phoenician, containing 27 letters and numerals for the Phoenician script, was added.
  • Cuneiform, containing 879 signs for Sumero-Akkadian Cuneiform, was added.
  • Cuneiform Numbers and Punctuation, containing 103 numerals and punctuation signs for Sumero-Akkadian Cuneiform, was added.
  • Counting Rod Numerals, containing 18 numerals used with counting rods, was added.

Extended blocks

  • Various letters used mainly for aboriginal languages (total 14 characters) were added to Latin Extended-B.
  • Lowercase lunate sigma symbols (total 3 characters) were added to Greek and Coptic.
  • Lowercase palochka and 3 letters used in Nivkh (total 7 characters) were added to Cyrillic.
  • Two letters used in Khanty and other languages (total 4 characters) were added to Cyrillic Supplement.
  • A specific point meant for Vav (ֺ) (total 1 character) was added to Hebrew.
  • Four letters used in Sindhi (total 4 characters) were added to Devanagari.
  • Four letters used in Sanskrit (total 4 characters) were added to Kannada.
  • Additional IPA diacritics (total 9 characters) were added to Combining Diacritical Marks Supplement.
  • Four combining arrows (total 4 characters) were added to Combining Diacritical Marks for Symbols.
  • A Danish symbol and a lowercase turned F (total 2 characters) were added to Letterlike Symbols.
  • A lowercase reversed C (ↄ) (total 1 character) was added to Number Forms.
  • Vertical parentheses, geometric forms and electrical symbols (total 12 characters) were added to Miscellaneous Technical.
  • A neuter symbol (⚲) (total 1 character) was added to Miscellaneous Symbols.
  • Four additional mathematical symbols (total 4 characters) were added to Miscellaneous Mathematical Symbols-A.
  • Additional squares, pentagons and hexagons (total 11 characters) were added to Miscellaneous Symbols and Arrows.
  • Four additional tone letters used in Chinantec (total 4 characters) were added to Modifier Tone Letters.
  • Bold capital and small Digamma (𝟊, 𝟋) (total 2 characters) were added to Mathematical Alphanumeric Symbols.

Unicode 5.1

Unicode 5.1 was released April 2008. It encoded 100,713 characters.

New blocks

  • Sundanese, containing 55 characters, was added.
  • Lepcha, containing 74 characters, was added.
  • Ol Chiki, containing 48 characters, was added.
  • Cyrillic Extended-A, containing 32 characters, was added.
  • Vai, containing 300 characters, was added.
  • Cyrillic Extended-B, containing 78 characters, was added.
  • Saurashtra, containing 81 characters, was added.
  • Kayah Li, containing 48 characters, was added.
  • Rejang, containing 37 characters, was added.
  • Cham, containing 83 characters, was added.
  • Ancient Symbols, containing 12 characters, was added.
  • Phaistos Disc, containing 46 characters, was added.
  • Lycian, containing 29 characters, was added.
  • Carian, containing 49 characters, was added.
  • Lydian, containing 27 characters, was added.
  • Mahjong Tiles, containing 44 characters, was added.
  • Domino Tiles, containing 100 characters, was added.

Extended blocks

  • 7 characters were added to Greek and Coptic.
  • 1 character was added to Cyrillic.
  • 16 characters were added to Cyrillic Supplement.
  • 15 characters were added to Arabic.
  • 18 characters were added to Arabic Supplement.
  • 2 characters were added to Devanagari.
  • 2 characters were added to Gurmukhi.
  • 3 characters were added to Oriya.
  • 1 character was added to Tamil.
  • 13 characters were added to Telugu.
  • 17 characters were added to Malayalam.
  • 6 characters were added to Tibetan.
  • 78 characters were added to Myanmar.
  • 1 character was added to Mongolian.
  • 28 characters were added to Combining Diacritical Marks Supplement.
  • 10 characters were added to Latin Extended Additional.
  • 1 character was added to General Punctuation.
  • 1 character was added to Combining Diacritical Marks for Symbols.
  • 1 character was added to Letterlike Symbols.
  • 4 characters were added to Number Forms.
  • 15 characters were added to Miscellaneous Symbols.
  • 5 characters were added to Miscellaneous Mathematical Symbols-A.
  • 51 characters were added to Miscellaneous Symbols and Arrows.
  • 12 characters were added to Latin Extended-C.
  • 23 characters were added to Supplemental Punctuation.
  • 1 character was added to Bopomofo.
  • 20 characters were added to CJK Strokes.
  • 8 characters were added to CJK Unified Ideographs.
  • 5 characters were added to Modifier Tone Letters.
  • 112 characters were added to Latin Extended-D.
  • 3 characters were added to Combining Half Marks.
  • 1 character was added to Musical Symbols.

Unicode 5.2

Unicode 5.2 was released in October 2009. It encoded 107,361 characters.

New blocks

  • Samaritan, containing 61 characters, was added.
  • Unified Canadian Aboriginal Syllabics Extended, containing 70 characters, was added.
  • Tai Tham, containing 127 characters, was added.
  • Vedic Extensions, containing 35 characters, was added.
  • Lisu, containing 48 characters, was added.
  • Bamum, containing 88 characters, was added.
  • Common Indic Number Forms, containing 10 characters, was added.
  • Devanagari Extended, containing 28 characters, was added.
  • Hangul Jamo Extended-A, containing 29 characters, was added.
  • Javanese, containing 91 characters, was added.
  • Myanmar Extended-A, containing 28 characters, was added.
  • Tai Viet, containing 72 characters, was added.
  • Meetei Mayek, containing 56 characters, was added.
  • Hangul Jamo Extended-B, containing 72 characters, was added.
  • Imperial Aramaic, containing 31 characters, was added.
  • Old South Arabian, containing 32 characters, was added.
  • Avestan, containing 61 characters, was added.
  • Inscriptional Parthian, containing 30 characters, was added.
  • Inscriptional Pahlavi, containing 27 characters, was added.
  • Old Turkic, containing 73 characters, was added.
  • Rumi Numeral Symbols, containing 31 characters, was added.
  • Kaithi, containing 66 characters, was added.
  • Egyptian Hieroglyphs, containing 1,071 characters, was added.
  • Enclosed Alphanumeric Supplement, containing 63 characters, was added.
  • Enclosed Ideographic Supplement, containing 44 characters, was added.
  • CJK Unified Ideographs Extension C, containing 4,149 characters, was added.

Extended blocks[edit]

  • (total 2 characters) were added to Cyrillic Supplement.
  • (total 5 characters) were added to Devanagari.
  • (total 1 character) was added to Bengali.
  • (total 4 characters) were added to Tibetan.
  • (total 4 characters) were added to Myanmar.
  • (total 16 characters) were added to Hangul Jamo.
  • (total 10 characters) were added to Unified Canadian Aboriginal Syllabics.
  • (total 3 characters) were added to New Tai Lue.
  • (total 1 character) was added to Combining Diacritical Marks Supplement.
  • (total 3 characters) were added to Currency Symbols.
  • (total 4 characters) were added to Number Forms.
  • (total 1 character) was added to Miscellaneous Technical.
  • (total 59 characters) were added to Miscellaneous Symbols.
  • (total 1 character) was added to Dingbats.
  • (total 5 characters) were added to Miscellaneous Symbols and Arrows.
  • (total 3 characters) were added to Latin Extended-C.
  • (total 7 characters) were added to Coptic.
  • (total 1 character) was added to Supplemental Punctuation.
  • (total 12 characters) were added to Enclosed CJK Letters and Months.
  • (total 8 characters) were added to CJK Unified Ideographs.
  • (total 3 characters) were added to CJK Compatibility Ideographs.
  • (total 2 characters) were added to Phoenician.
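
The per-block counts listed above can be cross-checked against the stated total with a little arithmetic. The following is only a sketch: the numbers are copied from the two Unicode 5.2 lists on this page, and the variable names are made up for illustration.

```python
# Counts copied from the Unicode 5.2 lists above, in the order they appear.
new_block_counts = [
    61, 70, 127, 35, 48, 88, 10, 28, 29, 91, 28, 72, 56,      # Samaritan .. Meetei Mayek
    72, 31, 32, 61, 30, 27, 73, 31, 66, 1071, 63, 44, 4149,   # Hangul Jamo Extended-B .. CJK Extension C
]
extended_block_counts = [
    2, 5, 1, 4, 4, 16, 10, 3, 1, 3, 4,    # Cyrillic Supplement .. Number Forms
    1, 59, 1, 5, 3, 7, 1, 12, 8, 3, 2,    # Miscellaneous Technical .. Phoenician
]

TOTAL_5_2 = 107_361  # total stated on this page for Unicode 5.2

# Characters added in 5.2 = characters in new blocks + additions to existing blocks.
added_in_5_2 = sum(new_block_counts) + sum(extended_block_counts)

# Subtracting the additions gives the total implied for the preceding version.
implied_previous_total = TOTAL_5_2 - added_in_5_2

print(f"new blocks: {len(new_block_counts)}, characters: {sum(new_block_counts)}")
print(f"extended blocks: {len(extended_block_counts)}, characters: {sum(extended_block_counts)}")
print(f"added in 5.2: {added_in_5_2}")
print(f"implied total before 5.2: {implied_previous_total}")
```

With the counts as listed, the two lists sum to 6,648 added characters, which implies 100,713 characters before Unicode 5.2.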

Unicode 6.0

Unicode 6.0 was released in October 2010. It encoded 109,449 characters.

New blocks

  • Mandaic, containing 29 characters, was added.
  • Batak, containing 56 characters, was added.
  • Ethiopic Extended-A, containing 32 characters, was added.
  • Brahmi, containing 108 characters, was added.
  • Bamum Supplement, containing 569 characters, was added.
  • Kana Supplement, containing 2 characters, was added.
  • Playing Cards, containing 59 characters, was added.
  • Miscellaneous Symbols and Pictographs, containing 529 characters, was added.
  • Emoticons, containing 63 characters, was added.
  • Transport and Map Symbols, containing 70 characters, was added.
  • Alchemical Symbols, containing 116 characters, was added.
  • CJK Unified Ideographs Extension D, containing 222 characters, was added.

Extended blocks

  • (total 2 characters) were added to Cyrillic Supplement.
  • (total 2 characters) were added to Arabic.
  • (total 10 characters) were added to Devanagari.
  • (total 6 characters) were added to Oriya.
  • (total 3 characters) were added to Malayalam.
  • (total 6 characters) were added to Tibetan.
  • (total 2 characters) were added to Ethiopic.
  • (total 1 character) was added to Combining Diacritical Marks Supplement.
  • (total 8 characters) were added to Superscripts and Subscripts.
  • (total 1 character) was added to Currency Symbols.
  • (total 11 characters) were added to Miscellaneous Technical.
  • (total 6 characters) were added to Miscellaneous Symbols.
  • (total 16 characters) were added to Dingbats.
  • (total 2 characters) were added to Miscellaneous Mathematical Symbols-A.
  • (total 2 characters) were added to Tifinagh.
  • (total 3 characters) were added to Bopomofo Extended.
  • (total 2 characters) were added to Cyrillic Extended-B.
  • (total 15 characters) were added to Latin Extended-D.
  • (total 16 characters) were added to Arabic Presentation Forms-A.
  • (total 106 characters) were added to Enclosed Alphanumeric Supplement.
  • (total 13 characters) were added to Enclosed Ideographic Supplement.

Unicode 6.1

Unicode 6.1 was released in January 2012. It encoded 110,181 characters.

A total of 732 new characters were added in this version.

New blocks

  • Arabic Extended-A, containing 39 characters, was added.
  • Sundanese Supplement, containing 8 characters, was added.
  • Meetei Mayek Extensions, containing 23 characters, was added.
  • Meroitic Hieroglyphs, containing 32 characters, was added.
  • Meroitic Cursive, containing 26 characters, was added.
  • Sora Sompeng, containing 35 characters, was added.
  • Chakma, containing 67 characters, was added.
  • Sharada, containing 83 characters, was added.
  • Takri, containing 66 characters, was added.
  • Miao, containing 133 characters, was added.
  • Arabic Mathematical Alphabetic Symbols, containing 143 characters, was added.

Extended blocks

  • (total 1 character) was added to Armenian.
  • (total 1 character) was added to Arabic.
  • (total 1 character) was added to Gujarati.
  • (total 2 characters) were added to Lao.
  • (total 5 characters) were added to Georgian.
  • (total 9 characters) were added to Sundanese.
  • (total 4 characters) were added to Vedic Extensions.
  • (total 2 characters) were added to Miscellaneous Mathematical Symbols-A.
  • (total 2 characters) were added to Coptic.
  • (total 2 characters) were added to Georgian Supplement.
  • (total 2 characters) were added to Tifinagh.
  • (total 10 characters) were added to Supplemental Punctuation.
  • (total 1 character) was added to CJK Unified Ideographs.
  • (total 9 characters) were added to Cyrillic Extended-B.
  • (total 5 characters) were added to Latin Extended-D.
  • (total 2 characters) were added to CJK Compatibility Ideographs.
  • (total 2 characters) were added to Enclosed Alphanumeric Supplement.
  • (total 4 characters) were added to Miscellaneous Symbols and Pictographs.
  • (total 13 characters) were added to Emoticons.
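
The figure of 732 new characters can be checked in two independent ways: by summing the per-block counts in the two Unicode 6.1 lists above, and by subtracting the Unicode 6.0 total from the Unicode 6.1 total. The sketch below does both. All numbers are copied from this page, and the variable names are illustrative only.

```python
# Counts copied from the Unicode 6.1 lists above, in the order they appear.
new_block_counts = [39, 8, 23, 32, 26, 35, 67, 83, 66, 133, 143]
extended_block_counts = [1, 1, 1, 2, 5, 9, 4, 2, 2, 2, 2, 10, 1, 9, 5, 2, 2, 4, 13]

TOTAL_6_0 = 109_449   # total stated for Unicode 6.0
TOTAL_6_1 = 110_181   # total stated for Unicode 6.1
STATED_ADDED = 732    # "A total of 732 new characters"

# Check 1: sum of the per-block lists.
listed_additions = sum(new_block_counts) + sum(extended_block_counts)

# Check 2: difference between the two version totals.
difference_of_totals = TOTAL_6_1 - TOTAL_6_0

print(f"sum of listed additions: {listed_additions}")
print(f"difference of totals:    {difference_of_totals}")
print(f"both match stated 732:   {listed_additions == difference_of_totals == STATED_ADDED}")
```

Both checks agree with the stated figure: 655 characters in new blocks plus 77 in extended blocks give 732.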

Unicode 6.2

Unicode 6.2, a minor release, was published in September 2012.

It encoded 110,182 characters, one more than Unicode 6.1.

Extended blocks

  • (total 1 character) was added to Currency Symbols: U+20BA TURKISH LIRA SIGN, whose publication was accelerated.
An earlier accelerated single-character addition, U+20B9 INDIAN RUPEE SIGN, had already been encoded in Unicode 6.0.
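
The two currency signs named above can be looked up by code point with Python's standard `unicodedata` module. This is a minimal sketch: the helper `describe` is a hypothetical name, and the result depends on the Unicode database bundled with the interpreter, which it reports as `unicodedata.unidata_version`. It assumes an interpreter whose database is at Unicode 6.2 or later.

```python
import unicodedata


def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME (category)' for a code point, or mark it as unnamed."""
    ch = chr(code_point)
    # name() returns the fallback string if the bundled database has no name
    # for this code point (for example, because the database is too old).
    name = unicodedata.name(ch, "<no name in this database>")
    return f"U+{code_point:04X} {name} ({unicodedata.category(ch)})"


print("Bundled Unicode database:", unicodedata.unidata_version)
for cp in (0x20B9, 0x20BA):
    print(describe(cp))
```

On such an interpreter both code points are reported with the general category `Sc` (currency symbol).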

Unicode 6.3

Unicode 6.3, a minor release published in September 2013, was the current version at the time of writing.

It encoded 110,187 characters.

Extended blocks

  • (total 4 characters) were added to General Punctuation.
  • (total 1 character) was added to Arabic.