This section provides a list of commonly used character sets and their encodings.
The following table summarizes some commonly used character sets and encodings:
Character Set    Encoding       # of Bytes   Byte Type   Language
ASCII            ASCII          1            7-bit       English
Latin1           ISO-8859-1     1            8-bit       Latin languages
GB2312-1980      GB             1-2          8-bit       Chinese
GB2312-1980      EUC-CN         1-2          8-bit       Chinese
GB2312-1980      HZ             1-2          7-bit       Chinese
GBK              GBK            1-2          8-bit       Chinese
GB18030-2000     GB18030-2000   1-4          8-bit       Chinese
Big5             Big5           1-2          8-bit       Chinese
CNS 11643-1992   EUC-TW         1-4          8-bit       Chinese
JIS              EUC-JP         1-2          8-bit       Japanese
JIS              ISO-2022-JP    1-2          7-bit       Japanese
JIS              Shift-JIS      1-2          8-bit       Japanese
KS               EUC-KR         1-2          8-bit       Korean
KS               ISO-2022-KR    1-2          7-bit       Korean
Unicode 3.0      UTF-7          1-3          7-bit       Multilingual
Unicode 3.0      UTF-8          1-3          8-bit       Multilingual
Unicode 3.0      UTF-16BE       2            8-bit       Multilingual
Unicode 3.0      UTF-16LE       2            8-bit       Multilingual
Unicode 3.1      UTF-8          1-4          8-bit       Multilingual
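
The byte counts and byte types listed above can be spot-checked with a short script. The following is a minimal sketch in Python, assuming the codec names shipped with the standard CPython distribution. The helper name measure() and the sample characters are illustrative choices and are not part of the table. EUC-TW is left out because the standard library has no codec for it.

```python
# Sketch: encode one sample character per encoding and report
# how many bytes were produced and whether every byte is below 0x80.
# The sample characters are arbitrary illustrations (assumption).

def measure(text, encoding):
    """Return (byte_count, all_bytes_are_7bit) for text in the given encoding."""
    data = text.encode(encoding)
    return len(data), all(b < 0x80 for b in data)

SAMPLES = [
    ("ascii", "A"),
    ("latin-1", "\u00e9"),        # e with acute accent
    ("gb2312", "\u4e2d"),         # a common Chinese character
    ("hz", "\u4e2d"),
    ("gbk", "\u4e2d"),
    ("gb18030", "\u0080"),        # not in GBK, so it needs the 4-byte form
    ("big5", "\u4e2d"),
    ("euc_jp", "\u3042"),         # Hiragana letter A
    ("iso2022_jp", "\u3042"),
    ("shift_jis", "\u3042"),
    ("euc_kr", "\ud55c"),         # a Hangul syllable
    ("iso2022_kr", "\ud55c"),
    ("utf-7", "\u4e2d"),
    ("utf-8", "\u4e2d"),
    ("utf-8", "\U00010400"),      # outside the BMP (Unicode 3.1 and later)
    ("utf-16-be", "\u4e2d"),
    ("utf-16-le", "\u4e2d"),
]

for encoding, ch in SAMPLES:
    count, seven_bit = measure(ch, encoding)
    kind = "7-bit only" if seven_bit else "uses 8-bit bytes"
    print(f"{encoding:12} U+{ord(ch):04X} -> {count} byte(s), {kind}")
```

Note that the script counts every byte emitted for the sample, so escape-based encodings such as HZ, ISO-2022-JP, ISO-2022-KR and UTF-7 report more bytes than the per-character figures in the table, which exclude shift and escape sequences. The 7-bit check also applies only to the sample encoded: a UTF-16 sample whose two bytes both happen to be below 0x80 will be reported as "7-bit only" even though UTF-16 is an 8-bit encoding in general.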