EUC-JP Encoding

Unicode Tutorials - Herong's Tutorial Examples

This section provides a quick introduction to the EUC-JP encoding, which maps a JIS X0208 character to a 2-byte sequence by adding 128 (0x80) to each byte of the character's code value.

EUC-JP (Extended Unix Code for Japanese): An encoding for the JIS X0208 character set. It is an 8-bit encoding that uses 1 or 2 bytes per character:

Number Of   Valid Range
Bytes       Byte 1        Byte 2       

   1        0x00 - 0x7F
   2        0xA1 - 0xFE   0xA1 - 0xFE

Of course, 1-byte encoding sequences are used for ASCII characters.
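
The table can be read as a rule for splitting a byte stream: a lead byte no greater than 0x7F stands alone, while a lead byte in the range 0xA1 - 0xFE must be followed by a second byte in the same range. Here is a minimal Python sketch of that rule. The function name split_euc_jp is hypothetical, and the sketch handles only the two sequence types listed in the table.

```python
def split_euc_jp(data: bytes) -> list:
    """Split data into 1-byte and 2-byte sequences, following the table."""
    sequences = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead <= 0x7F:
            # 1-byte sequence: an ASCII character
            # (this sketch accepts the whole 0x00 - 0x7F range).
            sequences.append(data[i:i + 1])
            i += 1
        elif 0xA1 <= lead <= 0xFE:
            # 2-byte sequence: the second byte must also be in 0xA1 - 0xFE.
            if i + 1 >= len(data) or not 0xA1 <= data[i + 1] <= 0xFE:
                raise ValueError(f"invalid second byte at offset {i + 1}")
            sequences.append(data[i:i + 2])
            i += 2
        else:
            # Lead bytes outside both ranges are not covered by the table.
            raise ValueError(f"unsupported lead byte 0x{lead:02X} at offset {i}")
    return sequences
```

For example, split_euc_jp(b"A\xa4\xa2B") returns three sequences: the ASCII byte for "A", the 2-byte sequence 0xA4 0xA2, and the ASCII byte for "B".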

2-byte encoding sequences are used for JIS X0208 characters. The mapping scheme is simple. The first byte of an encoding sequence is the high byte of the character's code value plus 128 (0x80). The second byte of an encoding sequence is the low byte of the character's code value plus 128 (0x80).
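
As a worked example, the hiragana letter あ has the JIS X0208 code value 0x2422, so its EUC-JP sequence is 0x24 + 0x80 = 0xA4 followed by 0x22 + 0x80 = 0xA2. Below is a minimal Python sketch of the mapping in both directions. The function names jis_to_euc_jp and euc_jp_to_jis are hypothetical, and the last line cross-checks the result against Python's built-in euc_jp codec.

```python
def jis_to_euc_jp(code: int) -> bytes:
    """Map a 2-byte JIS X0208 code value to its EUC-JP byte sequence."""
    high, low = code >> 8, code & 0xFF
    # Each byte of a JIS X0208 code value lies in 0x21 - 0x7E.
    if not (0x21 <= high <= 0x7E and 0x21 <= low <= 0x7E):
        raise ValueError(f"not a JIS X0208 code value: 0x{code:04X}")
    # Add 128 (0x80) to the high byte and to the low byte.
    return bytes([high + 0x80, low + 0x80])


def euc_jp_to_jis(seq: bytes) -> int:
    """Reverse the mapping: subtract 0x80 from each byte of the sequence."""
    if len(seq) != 2 or not all(0xA1 <= b <= 0xFE for b in seq):
        raise ValueError("not a 2-byte EUC-JP sequence")
    return ((seq[0] - 0x80) << 8) | (seq[1] - 0x80)


encoded = jis_to_euc_jp(0x2422)
print(encoded.hex())                         # a4a2
print(hex(euc_jp_to_jis(encoded)))           # 0x2422
# "\u3042" is the hiragana letter あ (U+3042).
print(encoded == "\u3042".encode("euc_jp"))  # True
```

Setting the high bit of each byte is what keeps 2-byte sequences from colliding with ASCII: every byte of a 2-byte sequence is 0xA1 or greater, while ASCII bytes never exceed 0x7F.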

In other words, EUC-JP encoding maps a JIS X0208 character to a 2-byte sequence with both byte values in the range 0xA1 to 0xFE, as shown in the picture below: