Commonly Used Character Sets and Encodings

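The following table summarizes some commonly used character sets and their encodings. The "# of Bytes" column gives the number of bytes used to encode one character, and the "Byte Type" column tells whether the encoding uses only 7-bit byte values (0x00 to 0x7F) or full 8-bit byte values.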

Character      Encoding       # of    Byte    Language
Set                           Bytes   Type  

ASCII          ASCII          1       7-bit   English
Latin1         ISO-8859-1     1       8-bit   Western European languages
GB2312-1980    GB             1-2     8-bit   Chinese
GB2312-1980    EUC-CN         1-2     8-bit   Chinese
GB2312-1980    HZ             1-2     7-bit   Chinese
GBK            GBK            1-2     8-bit   Chinese
GB18030-2000   GB18030-2000   1-4     8-bit   Chinese
Big5           Big5           1-2     8-bit   Chinese
CNS 11643-1992 EUC-TW         1-4     8-bit   Chinese
JIS            EUC-JP         1-3     8-bit   Japanese
JIS            ISO-2022-JP    1-2     7-bit   Japanese
JIS            Shift-JIS      1-2     8-bit   Japanese
KS             EUC-KR         1-2     8-bit   Korean
KS             ISO-2022-KR    1-2     7-bit   Korean
Unicode 3.0    UTF-7          varies  7-bit   Multilingual
Unicode 3.0    UTF-8          1-3     8-bit   Multilingual
Unicode 3.0    UTF-16BE       2       8-bit   Multilingual
Unicode 3.0    UTF-16LE       2       8-bit   Multilingual
Unicode 3.1    UTF-8          1-4     8-bit   Multilingual
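
The byte counts and byte types in the table can be checked by experiment. The following is a minimal sketch in Python, assuming the codec names of the CPython standard library (for example "gb2312" for EUC-CN and "hz" for HZ); the helper names encoded_size and is_seven_bit are hypothetical and used only for illustration. EUC-TW is left out because the standard library does not ship a codec for it.

```python
# Sketch: measure how one character is encoded in several encodings.
# Codec names are CPython standard-library names (an assumption of this
# sketch); other platforms may spell them differently.

def encoded_size(text, encoding):
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))


def is_seven_bit(text, encoding):
    """Return True if every encoded byte has its high bit clear."""
    return all(byte < 0x80 for byte in text.encode(encoding))


# (sample character, codec name) pairs covering most rows of the table.
SAMPLES = [
    ("A", "ascii"),
    ("é", "latin-1"),
    ("中", "gb2312"),      # EUC-CN form of GB2312
    ("中", "hz"),
    ("中", "gbk"),
    ("中", "gb18030"),
    ("中", "big5"),
    ("日", "euc_jp"),
    ("日", "iso2022_jp"),
    ("日", "shift_jis"),
    ("한", "euc_kr"),
    ("한", "iso2022_kr"),
    ("中", "utf-7"),
    ("中", "utf-8"),
    ("中", "utf-16-be"),
    ("中", "utf-16-le"),
    ("\U0001F600", "utf-8"),      # a character outside the BMP
    ("\U0001F600", "utf-16-be"),
]


def report():
    """Print codec, code point, byte count, 7-bit flag and hex bytes."""
    for ch, enc in SAMPLES:
        data = ch.encode(enc)
        print(f"{enc:12} U+{ord(ch):04X}  {len(data)} byte(s)  "
              f"7-bit={is_seven_bit(ch, enc)}  {data.hex()}")


if __name__ == "__main__":
    report()
```

For the stateful encodings (HZ, ISO-2022-JP, ISO-2022-KR and UTF-7) the byte count reported for a single character includes the shift or escape sequences that switch character sets, so it is larger than the per-character figure given in the table. Note also that a "7-bit=True" result for UTF-16 only means the sample happened to produce byte values below 0x80; UTF-16 is still an 8-bit encoding.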
