Unicode Code Point Blocks - Code Charts

This chapter provides notes and tutorial examples on Unicode code point blocks, also called code charts. Topics include block names, code point ranges, and sample code points.

U0000: C0 Controls and Basic Latin

U0080: C1 Controls and Latin-1 Supplement

U0100: Latin Extended-A

U0180: Latin Extended-B

U0250: IPA Extensions

U02B0: Spacing Modifier Letters

U0300: Combining Diacritical Marks

U0370: Greek and Coptic

U0400: Cyrillic

U0500: Cyrillic Supplement

U0530: Armenian

U0590: Hebrew

U0600: Arabic

U0700: Syriac

U0750: Arabic Supplement

U0780: Thaana

U07C0: N'Ko

U0800: Samaritan

U0840: Mandaic

U0860: Syriac Supplement

U08A0: Arabic Extended-A

U0900: Devanagari

U0980: Bengali

U0A00: Gurmukhi

U0A80: Gujarati

U0B00: Oriya

U0B80: Tamil

U0C00: Telugu

U0C80: Kannada

U0D00: Malayalam

U0D80: Sinhala

U0E00: Thai

U0E80: Lao

U0F00: Tibetan

U1000: Myanmar

U10A0: Georgian

U1100: Hangul Jamo

U1200: Ethiopic

U1380: Ethiopic Supplement

U13A0: Cherokee

U1400: Unified Canadian Aboriginal Syllabics

U1680: Ogham

U16A0: Runic

U1700: Tagalog

U1720: Hanunoo

U1740: Buhid

U1760: Tagbanwa

U1780: Khmer

U1800: Mongolian

U18B0: Unified Canadian Aboriginal Syllabics Extended

U1900: Limbu

U1950: Tai Le

U1980: New Tai Lue

U19E0: Khmer Symbols

U1A00: Buginese

U1A20: Tai Tham

U1B00: Balinese

U1B80: Sundanese

U1BC0: Batak

U1C00: Lepcha

U1C50: Ol Chiki

U1CC0: Sundanese Supplement

U1CD0: Vedic Extensions

U1D00: Phonetic Extensions

U1D80: Phonetic Extensions Supplement

U1DC0: Combining Diacritical Marks Supplement

U1E00: Latin Extended Additional

U1F00: Greek Extended

U2000: General Punctuation

U2070: Superscripts and Subscripts

U20A0: Currency Symbols

U20D0: Combining Diacritical Marks for Symbols

U2100: Letterlike Symbols

U2150: Number Forms

U2190: Arrows

U2200: Mathematical Operators

U2300: Miscellaneous Technical

U2400: Control Pictures

U2440: Optical Character Recognition

U2460: Enclosed Alphanumerics

U2500: Box Drawing

U2580: Block Elements

U25A0: Geometric Shapes

U2600: Miscellaneous Symbols

U2700: Dingbats

U27C0: Miscellaneous Mathematical Symbols-A

U27F0: Supplemental Arrows-A

U2800: Braille Patterns

U2900: Supplemental Arrows-B

U2980: Miscellaneous Mathematical Symbols-B

U2A00: Supplemental Mathematical Operators

U2B00: Miscellaneous Symbols and Arrows

U2C00: Glagolitic

U2C60: Latin Extended-C

U2C80: Coptic

U2D00: Georgian Supplement

U2D30: Tifinagh

U2D80: Ethiopic Extended

U2DE0: Cyrillic Extended-A

U2E00: Supplemental Punctuation

U2E80: CJK Radicals Supplement

U2F00: Kangxi Radicals

U2FF0: Ideographic Description Characters

U3000: CJK Symbols and Punctuation

U3040: Hiragana

U30A0: Katakana

U3100: Bopomofo

U3130: Hangul Compatibility Jamo

U3190: Kanbun

U31A0: Bopomofo Extended

U31C0: CJK Strokes

U31F0: Katakana Phonetic Extensions

U3200: Enclosed CJK Letters and Months

U3300: CJK Compatibility

U3400: CJK Unified Ideographs Extension A

U4DC0: Yijing Hexagram Symbols

U4E00: CJK Unified Ideographs

UA000: Yi Syllables

UA490: Yi Radicals

UA4D0: Lisu

UA500: Vai

UA640: Cyrillic Extended-B

UA6A0: Bamum

UA700: Modifier Tone Letters

UA720: Latin Extended-D

UA800: Syloti Nagri

UA830: Common Indic Number Forms

UA840: Phags-pa

UA880: Saurashtra

UA8E0: Devanagari Extended

UA900: Kayah Li

UA930: Rejang

UA960: Hangul Jamo Extended-A

UA980: Javanese

UAA00: Cham

UAA60: Myanmar Extended-A

UAA80: Tai Viet

UAAE0: Meetei Mayek Extensions

UAB00: Ethiopic Extended-A

UAB30: Latin Extended-E

UAB70: Cherokee Supplement

UABC0: Meetei Mayek

UAC00: Hangul Syllables

UD7B0: Hangul Jamo Extended-B

UD800: High Surrogates

UDB80: High Private Use Surrogates

UDC00: Low Surrogates

UE000: Private Use Area

UF900: CJK Compatibility Ideographs

UFB00: Alphabetic Presentation Forms

UFB50: Arabic Presentation Forms-A

UFE00: Variation Selectors

UFE10: Vertical Forms

UFE20: Combining Half Marks

UFE30: CJK Compatibility Forms

UFE50: Small Form Variants

UFE70: Arabic Presentation Forms-B

UFF00: Halfwidth and Fullwidth Forms

UFFF0: Specials

U10000: Linear B Syllabary

U10080: Linear B Ideograms

U10100: Aegean Numbers

U10140: Ancient Greek Numbers

U10190: Ancient Symbols

U101D0: Phaistos Disc

U10280: Lycian

U102A0: Carian

U10300: Old Italic

U10330: Gothic

U10380: Ugaritic

U103A0: Old Persian

U10400: Deseret

U10450: Shavian

U10480: Osmanya

U10800: Cypriot Syllabary

U10840: Imperial Aramaic

U10900: Phoenician

U10920: Lydian

U10980: Meroitic Hieroglyphs

U109A0: Meroitic Cursive

U10A00: Kharoshthi

U10A60: Old South Arabian

U10A80: Old North Arabian

U10AC0: Manichaean

U10B00: Avestan

U10B40: Inscriptional Parthian

U10B60: Inscriptional Pahlavi

U10B80: Psalter Pahlavi

U10C00: Old Turkic

U10C80: Old Hungarian

U10D00: Hanifi Rohingya

U10E60: Rumi Numeral Symbols

U10F00: Old Sogdian

U10F30: Sogdian

U11000: Brahmi

U11080: Kaithi

U110D0: Sora Sompeng

U11100: Chakma

U11150: Mahajani

U11180: Sharada

U111E0: Sinhala Archaic Numbers

U11200: Khojki

U11280: Multani

U112B0: Khudawadi

U11300: Grantha

U11400: Newa

U11480: Tirhuta

U11580: Siddham

U11600: Modi

U11660: Mongolian Supplement

U11680: Takri

U11700: Ahom

U11800: Dogra

U118A0: Warang Citi

U11A00: Zanabazar Square

U11A50: Soyombo

U11AC0: Pau Cin Hau

U11C00: Bhaiksuki

U11C70: Marchen

U11D00: Masaram Gondi

U11D60: Gunjala Gondi

U11EE0: Makasar

U12000: Cuneiform

U12400: Cuneiform Numbers and Punctuation

U12480: Early Dynastic Cuneiform

U13000: Egyptian Hieroglyphs

U14400: Anatolian Hieroglyphs

U16800: Bamum Supplement

U16A40: Mro

U16AD0: Bassa Vah

U16B00: Pahawh Hmong

U16E40: Medefaidrin

U16F00: Miao

U16FE0: Ideographic Symbols and Punctuation

U17000: Tangut

U18800: Tangut Components

U1B000: Kana Supplement

U1B100: Kana Extended-A

U1B170: Nushu

U1BC00: Duployan

U1BCA0: Shorthand Format Controls

U1D000: Byzantine Musical Symbols

U1D100: Musical Symbols

U1D200: Ancient Greek Musical Notation

U1D2E0: Mayan Numerals

U1D300: Tai Xuan Jing Symbols

U1D360: Counting Rod Numerals

U1D400: Mathematical Alphanumeric Symbols

U1D800: Sutton SignWriting

U1E000: Glagolitic Supplement

U1E800: Mende Kikakui

U1E900: Adlam

U1EC70: Indic Siyaq Numbers

U1EE00: Arabic Mathematical Alphabetic Symbols

U1F000: Mahjong Tiles

U1F030: Domino Tiles

U1F0A0: Playing Cards

U1F100: Enclosed Alphanumeric Supplement

U1F200: Enclosed Ideographic Supplement

U1F300: Miscellaneous Symbols and Pictographs

U1F600: Emoticons

U1F650: Ornamental Dingbats

U1F680: Transport and Map Symbols

U1F700: Alchemical Symbols

U1F780: Geometric Shapes Extended

U1F800: Supplemental Arrows-C

U1F900: Supplemental Symbols and Pictographs

U1FA00: Chess Symbols

U20000: CJK Unified Ideographs Extension B

U2A700: CJK Unified Ideographs Extension C

U2B740: CJK Unified Ideographs Extension D

U2B820: CJK Unified Ideographs Extension E

U2CEB0: CJK Unified Ideographs Extension F

U2F800: CJK Compatibility Ideographs Supplement

UE0000: Tags

UE0100: Variation Selectors Supplement

UF0000: Supplementary Private Use Area-A

U100000: Supplementary Private Use Area-B
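Each entry above pairs the first code point of a block, written in hexadecimal after the letter U, with the block name. As a rough illustration of how such entries can be used, the Python sketch below parses a few of them and looks up the block that contains a given character. The helper names parse_entry and block_of are hypothetical. The list gives only starting code points, so the sketch assumes each block runs up to the start of the next listed block; that holds for the contiguous run of entries used here but not for every pair of neighbors in the full list.

```python
import bisect

# Entries copied from the start of the list above. These blocks are
# adjacent, so each one ends where the next one begins.
ENTRIES = [
    "U0000: C0 Controls and Basic Latin",
    "U0080: C1 Controls and Latin-1 Supplement",
    "U0100: Latin Extended-A",
    "U0180: Latin Extended-B",
    "U0250: IPA Extensions",
    "U02B0: Spacing Modifier Letters",
    "U0300: Combining Diacritical Marks",
    "U0370: Greek and Coptic",
    "U0400: Cyrillic",
    "U0500: Cyrillic Supplement",
    "U0530: Armenian",
]


def parse_entry(line):
    """Split 'U0370: Greek and Coptic' into (0x0370, 'Greek and Coptic')."""
    label, name = line.split(": ", 1)
    return int(label[1:], 16), name  # drop the leading 'U', parse hex


TABLE = sorted(parse_entry(line) for line in ENTRIES)
STARTS = [start for start, _ in TABLE]


def block_of(code_point):
    """Return the block name for code_point, or None if it is out of range.

    Assumption: a block extends to the start of the next listed block.
    The end of the last listed block is unknown, so it is never returned.
    """
    if code_point < STARTS[0] or code_point >= STARTS[-1]:
        return None
    index = bisect.bisect_right(STARTS, code_point) - 1
    return TABLE[index][1]


if __name__ == "__main__":
    for ch in "A\u00e9\u03a9\u0416":
        print(f"U+{ord(ch):04X} {block_of(ord(ch))}")
```

Run as a script, this prints one line per sample character, giving its code point in U+XXXX notation followed by the name of the block it falls in. Extending ENTRIES to the full list would need explicit end points for blocks that are followed by an unallocated gap.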



 
