Unicode Tutorials - Herong's Tutorial Notes
Dr. Herong Yang, Version 5.00

Shift-JIS Encoding

This section provides a quick introduction to the Shift-JIS encoding, also called MS Kanji, which maps a JIS X0208 character to a 2-byte sequence using a fairly complicated scheme. It was originally developed by ASCII Corporation together with Microsoft.

Shift-JIS: An encoding for the JIS X0208 character set. It is an 8-bit encoding with 1 to 2 bytes per character:

Number Of   Valid Range
Bytes       Byte 1        Byte 2

   1        0x21 - 0x7E                                (for printable ASCII)
   1        0xA1 - 0xDF                                (for half-width Katakana)
   2        0x81 - 0x9F   0x40 - 0x7E or 0x80 - 0xFC   (for JIS X0208)
   2        0xE0 - 0xEF   0x40 - 0x7E or 0x80 - 0xFC   (for JIS X0208)
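
The byte ranges make it possible to find character boundaries without a lookup table. Here is a minimal Python sketch, assuming the commonly documented Shift-JIS ranges: lead bytes 0x81 - 0x9F and 0xE0 - 0xEF, each followed by a trail byte in 0x40 - 0x7E or 0x80 - 0xFC. The function name split_shift_jis is hypothetical, and the sketch only checks byte ranges; it does not verify that a 2-byte sequence is actually assigned to a character.

```python
def split_shift_jis(data: bytes) -> list[bytes]:
    """Split a Shift-JIS byte string into 1-byte and 2-byte characters."""
    chars = []
    i = 0
    while i < len(data):
        b = data[i]
        if 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF:
            # Lead byte of a 2-byte character: a trail byte must follow.
            if i + 1 >= len(data):
                raise ValueError(f"truncated 2-byte character at offset {i}")
            t = data[i + 1]
            if not (0x40 <= t <= 0x7E or 0x80 <= t <= 0xFC):
                raise ValueError(f"invalid trail byte 0x{t:02X} at offset {i + 1}")
            chars.append(data[i:i + 2])
            i += 2
        elif b <= 0x7F or 0xA1 <= b <= 0xDF:
            # 1-byte character: ASCII range (control codes are accepted
            # here too) or half-width Katakana.
            chars.append(data[i:i + 1])
            i += 1
        else:
            raise ValueError(f"unexpected byte 0x{b:02X} at offset {i}")
    return chars


# "A", two Kanji characters (2 bytes each), one half-width Katakana.
print(split_shift_jis(b"A\x93\xfa\x96\x7b\xb1"))
```

Note that a trail byte in 0x40 - 0x7E overlaps the ASCII range, so a Shift-JIS string has to be scanned from the start to find character boundaries reliably.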

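Microsoft's implementation of Shift-JIS is code page 932, which adds vendor extensions to the base encoding. The encoding scheme is not straightforward: the two bytes are computed from the JIS X0208 code value rather than obtained by adding a fixed offset. Please read http://en.wikipedia.org/wiki/Shift_JIS for more details.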
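
To give a feel for why the mapping is not straightforward, here is a hedged Python sketch of the commonly described conversion from a JIS X0208 code value (two bytes, each in 0x21 - 0x7E) to Shift-JIS bytes. The function name jis_to_shift_jis is hypothetical, and the formula covers only the base JIS X0208 area, not the code page 932 extensions.

```python
def jis_to_shift_jis(j1: int, j2: int) -> bytes:
    """Convert a JIS X0208 code (bytes j1, j2 in 0x21-0x7E) to Shift-JIS."""
    if not (0x21 <= j1 <= 0x7E and 0x21 <= j2 <= 0x7E):
        raise ValueError("JIS X0208 bytes must be in 0x21 - 0x7E")
    # Two JIS rows share one Shift-JIS lead byte, so the row is halved.
    # Rows up to 0x5E map to 0x81 - 0x9F, the rest to 0xE0 - 0xEF.
    s1 = ((j1 + 1) >> 1) + (0x70 if j1 <= 0x5E else 0xB0)
    if j1 & 1:
        # Odd row: trail byte falls in 0x40 - 0x9E, skipping 0x7F.
        s2 = j2 + 0x1F + (1 if j2 >= 0x60 else 0)
    else:
        # Even row: trail byte falls in 0x9F - 0xFC.
        s2 = j2 + 0x7E
    return bytes([s1, s2])


# JIS X0208 code 0x467C (the Kanji "日") becomes Shift-JIS 0x93FA.
print(jis_to_shift_jis(0x46, 0x7C).hex())  # 93fa
```

The result can be cross-checked against Python's built-in codec, for example with jis_to_shift_jis(0x46, 0x7C).decode("shift_jis").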

Dr. Herong Yang, updated in 2009