What Is Big5 Encoding

Big5 Encoding maps Big5 characters to 2-byte integer codes. The first byte must be greater than or equal to 0xA1, and the second byte must be in the range 0x40 - 0x7E or 0xA1 - 0xFE.

Big5 Encoding is an encoding that maps characters from the Big5 character set into 2-byte integer codes according to the following rules:

1. All possible Big5 codes can be viewed as a 2-dimensional matrix, with rows representing first byte values and columns representing second byte values.

2. The first byte of the code must be greater than or equal to 0xA1. This gives (0xFF-0xA1+1) = 95 possible rows in the Big5 code point matrix.

3. The second byte of the code must be in one of two ranges: 0x40 - 0x7E or 0xA1 - 0xFE. This gives (0x7E-0x40+1) + (0xFE-0xA1+1) = 63 + 94 = 157 possible columns in the Big5 code point matrix.

4. Characters in the "Special Symbols" block are mapped sequentially into 2-byte integer codes in the range from 0xA140 to 0xA3BF. This gives 2 full rows and a partial row of Big5 codes: (0xA2-0xA1+1)*157 + ((0x7E-0x40+1)+(0xBF-0xA1+1)) = 314 + 94 = 408 codes. For example, § is mapped to the 2-byte code 0xA1B1.

5. Characters in the "Level 1" block are mapped sequentially into 2-byte integer codes in the range from 0xA440 to 0xC67E. This gives 34 full rows and a partial row of Big5 codes: (0xC5-0xA4+1)*157 + (0x7E-0x40+1) = 5338 + 63 = 5401 codes. For example, the 2-byte code 0xA4B4 falls in this block.

6. Characters in the "Level 2" block are mapped sequentially into 2-byte integer codes in the range from 0xC940 to 0xF9D5. This gives 48 full rows and a partial row of Big5 codes: (0xF8-0xC9+1)*157 + ((0x7E-0x40+1)+(0xD5-0xA1+1)) = 7536 + 116 = 7652 codes. For example, the 2-byte code 0xC9C9 falls in this block.
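
The byte rules and block sizes above can be checked with a short Java sketch. The class name Big5CodeMatrix and its method names are made up for illustration and are not part of this book's programs. The sketch only tests whether a 16-bit value satisfies the two byte rules; it does not know which codes actually have characters assigned.

```java
class Big5CodeMatrix {
    // First byte rule from the text: 0xA1 or greater (0xFF is the largest byte value).
    static boolean isValidFirstByte(int b) {
        return b >= 0xA1 && b <= 0xFF;
    }

    // Second byte rule from the text: 0x40 - 0x7E or 0xA1 - 0xFE.
    static boolean isValidSecondByte(int b) {
        return (b >= 0x40 && b <= 0x7E) || (b >= 0xA1 && b <= 0xFE);
    }

    // True if a 16-bit value satisfies both byte rules.
    static boolean isValidCode(int code) {
        return code >= 0 && code <= 0xFFFF
            && isValidFirstByte(code >> 8)
            && isValidSecondByte(code & 0xFF);
    }

    // Counts the valid codes between two codes, both ends included.
    static int countCodes(int first, int last) {
        int count = 0;
        for (int code = first; code <= last; code++) {
            if (isValidCode(code)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Columns in one row of the matrix: valid second bytes under first byte 0xA1.
        System.out.println("Columns:         " + countCodes(0xA100, 0xA1FF)); // 157
        System.out.println("Special Symbols: " + countCodes(0xA140, 0xA3BF)); // 408
        System.out.println("Level 1:         " + countCodes(0xA440, 0xC67E)); // 5401
        System.out.println("Level 2:         " + countCodes(0xC940, 0xF9D5)); // 7652
    }
}
```

The three block counts add up to 408 + 5401 + 7652 = 13,461 codes.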

A list of all Big5 characters and their encoding codes is provided later in this book.
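
Individual mappings, such as § to 0xA1B1 in rule 4, can also be spot-checked against the "Big5" charset that ships with many JDKs. The sketch below assumes that charset is present, which is not guaranteed on every Java runtime, so it returns null when the charset is missing. The class name Big5EncodeDemo and the method toBig5Hex are hypothetical names used only here, and the JDK's mapping table may differ from this book's table for some characters.

```java
import java.nio.charset.Charset;

class Big5EncodeDemo {
    // Encodes a string with the JDK's "Big5" charset and returns the bytes
    // as uppercase hex, or null if this JDK has no "Big5" charset.
    static String toBig5Hex(String text) {
        if (!Charset.isSupported("Big5")) {
            return null;
        }
        byte[] bytes = text.getBytes(Charset.forName("Big5"));
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            // Mask with 0xFF so bytes of 0x80 and above print as two hex digits.
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // \u00A7 is the section sign, the example character from rule 4.
        System.out.println(toBig5Hex("\u00A7"));
    }
}
```

If the JDK's table agrees with rule 4, the program prints the two bytes of the section sign as A1B1.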
