Unicode Tutorials - Herong's Tutorial Examples - v5.32, by Herong Yang
ISO-2022-JP Encoding
This section provides a quick introduction to the ISO-2022-JP encoding, which maps a JIS X0208 character to a 2-byte sequence by using both bytes of the character's code value directly.
ISO-2022-JP: An encoding for the JIS X0208 character set. It is a 7-bit encoding that uses 1 or 2 bytes per character:
   Number     Valid Range
   of Bytes   Byte 1        Byte 2
   1          0x21 - 0x7F
   2          0x21 - 0x7E   0x21 - 0x7E
Of course, 1-byte encoding sequences are used for ASCII characters.
2-byte encoding sequences are used for JIS X0208 characters. The mapping scheme is simple. The first byte of an encoding sequence is the high byte of the character's code value. The second byte of an encoding sequence is the low byte of the character's code value.
In other words, ISO-2022-JP encoding maps a JIS X0208 character to a 2-byte sequence with both byte values in the range from 0x21 to 0x7E.
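The high-byte and low-byte split described above can be sketched in a few lines of Python. This is only an illustration: the function name jis_x0208_to_bytes is hypothetical, and the sample code value 0x3441 (the kanji 漢) is chosen just as an example.

```python
def jis_x0208_to_bytes(code: int) -> bytes:
    """Split a JIS X0208 code value into its 2-byte encoding sequence."""
    high = code >> 8    # first byte: high byte of the code value
    low = code & 0xFF   # second byte: low byte of the code value
    for b in (high, low):
        # Both bytes must stay in the 7-bit range 0x21 - 0x7E.
        if not 0x21 <= b <= 0x7E:
            raise ValueError(f"byte value {b:#x} is outside 0x21 - 0x7E")
    return bytes([high, low])


# 0x3441 is the JIS X0208 code value of the kanji 漢.
print(jis_x0208_to_bytes(0x3441))  # b'4A'
```

Because both bytes fall in the printable 7-bit range, the resulting sequence looks like two ordinary ASCII characters when viewed without context.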
Escape sequences are used to switch between the 1-byte sequence mode for ASCII characters and the 2-byte sequence mode for JIS X0208 characters:
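   <Esc>(B - Escape sequence for ASCII characters
   <Esc>$B - Escape sequence for JIS X0208 characters
A third sequence, <Esc>(J, selects the JIS X0201 Roman set, which is nearly identical to ASCII but replaces the backslash with the yen sign and the tilde with an overline.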
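The advantage of ISO-2022-JP encoding is that it uses only 7-bit byte values, so encoded text can pass safely through communication channels that are not 8-bit clean, such as older e-mail systems.
The disadvantage of ISO-2022-JP encoding is that it is stateful. Escape sequences are required to mix ASCII characters with JIS X0208 characters, so a byte sequence cannot be interpreted correctly without knowing which mode is active.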
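The cost of relying on escape sequences can be illustrated with a short sketch, again assuming Python's standard "iso2022_jp" codec. The same 4 bytes decode to different text depending on whether an escape sequence has switched the decoder into 2-byte mode.

```python
ESC = b"\x1b"
payload = b"4A;z"

# Without any escape sequence, the decoder starts in ASCII mode
# and reads 4 one-byte characters.
as_ascii = payload.decode("iso2022_jp")
print(as_ascii)  # 4A;z

# Wrapped in <Esc>$B ... <Esc>(B, the same bytes are read as
# 2 two-byte JIS X0208 characters.
as_kanji = (ESC + b"$B" + payload + ESC + b"(B").decode("iso2022_jp")
print(as_kanji)  # 漢字
```

This is why a fragment cut out of the middle of an ISO-2022-JP stream, or a stream with a lost escape sequence, can turn into unreadable text.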