Unicode Tutorials - Herong's Tutorial Examples - Version 5.20, by Dr. Herong Yang
This section provides a quick introduction to the ISO-2022-JP encoding, which maps a JIS X0208 character to a 2-byte sequence by using both bytes of the character's code value directly.
ISO-2022-JP: An encoding for the JIS X0208 character set. It is a 7-bit encoding with 1 or 2 bytes per character:
Number Of   Valid Range
Bytes       Byte 1         Byte 2
1           0x21 - 0x7F
2           0x21 - 0x7E    0x21 - 0x7E
Of course, 1-byte encoding sequences are used for ASCII characters.
2-byte encoding sequences are used for JIS X0208 characters. The mapping scheme is simple. The first byte of an encoding sequence is the high byte of the character's code value. The second byte of an encoding sequence is the low byte of the character's code value.
In other words, ISO-2022-JP encoding maps a JIS X0208 character to a 2-byte sequence with both byte values in the range 0x21 to 0x7E.
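The mapping can be sketched in a few lines of Python. This is a minimal illustration, not a full encoder: the function name jis_code_to_bytes is hypothetical, and the example code value 0x2422 (HIRAGANA LETTER A in JIS X0208) is chosen only for demonstration.

```python
def jis_code_to_bytes(code: int) -> bytes:
    """Split a 16-bit JIS X0208 code value into its 2-byte sequence.

    Hypothetical helper for illustration; it does not emit escape sequences.
    """
    high = (code >> 8) & 0xFF  # first byte: high byte of the code value
    low = code & 0xFF          # second byte: low byte of the code value
    # Both bytes must fall in the 7-bit printable range 0x21 - 0x7E.
    if not (0x21 <= high <= 0x7E and 0x21 <= low <= 0x7E):
        raise ValueError(f"0x{code:04X} is outside the 2-byte range")
    return bytes([high, low])


if __name__ == "__main__":
    # 0x2422 splits into the bytes 0x24 and 0x22.
    print(jis_code_to_bytes(0x2422).hex())
```

Because both bytes stay below 0x80, the result looks like two ordinary ASCII characters unless the reader knows which mode is active.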
Escape sequences are used to switch between the 1-byte sequence mode for ASCII characters and the 2-byte sequence mode for JIS X0208 characters:
<Esc>(B - Escape sequence for ASCII characters
<Esc>$B - Escape sequence for JIS X0208 characters
A third sequence, <Esc>(J, selects the JIS X0201 Roman set, a 1-byte set that is nearly identical to ASCII.
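Mode switching can be observed with Python's built-in "iso2022_jp" codec. The sketch below assumes that codec is available (it ships with standard CPython); the constant names are illustrative. Note that this codec returns to ASCII with <Esc>(B.

```python
ESC = b"\x1b"
TO_ASCII = ESC + b"(B"     # switch to 1-byte ASCII mode
TO_JISX0208 = ESC + b"$B"  # switch to 2-byte JIS X0208 mode

# "A", then HIRAGANA LETTER A (JIS X0208 code 0x2422), then "b".
text = "A\u3042b"
encoded = text.encode("iso2022_jp")

# The encoder inserts an escape sequence at each mode change.
expected = b"A" + TO_JISX0208 + bytes([0x24, 0x22]) + TO_ASCII + b"b"

if __name__ == "__main__":
    print(encoded)
    print(encoded == expected)
    # Every byte of the output fits in 7 bits.
    print(all(byte < 0x80 for byte in encoded))
```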
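The advantage of ISO-2022-JP encoding is that it uses only 7-bit bytes, which are safe to transmit through communication interfaces that do not preserve the 8th bit.
The disadvantage of ISO-2022-JP encoding is that it relies on escape sequences to mix ASCII characters with JIS X0208 characters. The encoding is therefore stateful: the meaning of a byte depends on the most recent escape sequence before it.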
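The cost of relying on escape sequences can be seen by decoding the same two bytes in each mode. This is a small sketch using Python's built-in "iso2022_jp" codec; the variable names are illustrative.

```python
raw = bytes([0x24, 0x22])  # the same two 7-bit bytes in both cases

# With no escape sequence, the decoder starts in ASCII mode,
# so the bytes are read as two 1-byte characters.
as_ascii = raw.decode("iso2022_jp")

# After <Esc>$B, the same bytes are read as one JIS X0208 character.
as_jisx0208 = (b"\x1b$B" + raw + b"\x1b(B").decode("iso2022_jp")

if __name__ == "__main__":
    print(len(as_ascii), len(as_jisx0208))
```

A program that loses or skips an escape sequence, for example by starting to read in the middle of a stream, will misinterpret every byte until the next one.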
Last update: 2009.