Unicode Tutorials - Herong's Tutorial Examples - v5.32, by Herong Yang
GB2312 Character Set for Chinese Characters
This section provides a quick introduction to the GB2312 character set for simplified Chinese characters, numbers and symbols. GB2312 contains 7445 characters.
GB: An abbreviation of Guojia Biaozhun, or Guo Biao, meaning "national standard" in Chinese.
GB2312, also called GB2312-1980: A coded character set established by the government of the People's Republic of China (PRC) in 1980.
Main features of GB2312-1980:
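GB2312-1980 arranges characters into a matrix of 94 rows and 94 columns. A row is called a qu and a column position within a row is called a wei, so each character is identified by its quwei (row and column) numbers. The rows are organized as follows: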
Rows (Qu)   # of Chars   Characters
01                  94   Special symbols
02                  72   Paragraph numbers
03                  94   GB 1988-80 (ISO 646-CN)
04                  83   Hiragana
05                  86   Katakana
06                  48   Greek
07                  66   Cyrillic
08                  63   Pinyin accented vowels and zhuyin symbols
09                  76   Box and table drawing pieces
16-55             3755   Hanzi level 1, ordered by pinyin
56-87             3008   Hanzi level 2, ordered by radical, then stroke
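The row counts in the table above can be tallied to check the 7445 total quoted at the start of this section. The following is a minimal Python sketch; the names LAYOUT, CELLS_PER_ROW and tally are made up for this illustration and are not part of any standard or library.

```python
# Row layout of GB2312-1980, copied from the table above.
# Each entry: (first row, last row, number of characters, description).
# LAYOUT, CELLS_PER_ROW and tally are illustrative names only.
LAYOUT = [
    (1, 1, 94, "Special symbols"),
    (2, 2, 72, "Paragraph numbers"),
    (3, 3, 94, "GB 1988-80 (ISO 646-CN)"),
    (4, 4, 83, "Hiragana"),
    (5, 5, 86, "Katakana"),
    (6, 6, 48, "Greek"),
    (7, 7, 66, "Cyrillic"),
    (8, 8, 63, "Pinyin accented vowels and zhuyin symbols"),
    (9, 9, 76, "Box and table drawing pieces"),
    (16, 55, 3755, "Hanzi level 1"),
    (56, 87, 3008, "Hanzi level 2"),
]

CELLS_PER_ROW = 94  # the matrix has 94 columns in every row


def tally(layout):
    """Return (non-Hanzi count, Hanzi count, total) for the layout."""
    non_hanzi = sum(n for first, _, n, _ in layout if first < 16)
    hanzi = sum(n for first, _, n, _ in layout if first >= 16)
    return non_hanzi, hanzi, non_hanzi + hanzi


non_hanzi, hanzi, total = tally(LAYOUT)
print("Non-Hanzi characters:", non_hanzi)
print("Hanzi characters:", hanzi)
print("Total characters:", total)

# Compare each block with the number of cells its rows could hold.
for first, last, n, desc in LAYOUT:
    capacity = (last - first + 1) * CELLS_PER_ROW
    print(f"{first:02d}-{last:02d}  {n:4d} of {capacity:4d} cells used  {desc}")
```

The last loop also shows that most row blocks leave some cells unassigned, which is why the total is well below the 94 x 94 = 8836 cells of the full matrix.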
GB2312-1980 is a Double-Byte Character Set (DBCS), in which each code point value requires a 2-byte integer to hold. This is very different from the ASCII and Latin 1 character sets, where every code point value can be held in a 1-byte integer.
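The difference between a double-byte and a single-byte character set can be seen with Python's built-in "gb2312" and "latin-1" codecs. This is only a sketch: it assumes the "gb2312" codec produces the common EUC-style byte form, in which each byte is the row or column number plus 0xA0, and the helper name quwei is made up for this example.

```python
def quwei(ch):
    """Return the (row, column) position of a GB2312 character.

    Assumes Python's "gb2312" codec emits two bytes per GB2312
    character, each equal to the row or column number plus 0xA0.
    """
    raw = ch.encode("gb2312")
    if len(raw) != 2:
        raise ValueError(f"{ch!r} is not encoded as a 2-byte GB2312 character")
    return raw[0] - 0xA0, raw[1] - 0xA0


# Two Hanzi characters from level 1, written as Unicode escapes.
for ch in ("\u554a", "\u4e2d"):
    raw = ch.encode("gb2312")
    row, col = quwei(ch)
    print(f"U+{ord(ch):04X}: {len(raw)} bytes {raw.hex().upper()}, row {row:02d}, column {col:02d}")

# A Latin 1 character fits in a single byte.
latin = "\u00e9".encode("latin-1")
print(f"U+00E9 in Latin 1: {len(latin)} byte {latin.hex().upper()}")
```

Both row numbers printed for the Hanzi characters should fall in the 16-55 range listed in the table for Hanzi level 1.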