This section provides a quick introduction to the ASCII (American Standard Code for Information Interchange) character set and encoding.

Before we jump into the Unicode character set and its encodings, we should first look at a much older and simpler character set: ASCII.

What Is ASCII? ASCII (American Standard Code for Information Interchange) is a character set and an encoding scheme for English letters, digits, punctuation marks and some control characters.

The ASCII specification was published as "American Standard Code for Information Interchange, ASA X3.4-1963" by the American Standards Association on June 17, 1963.

The ASCII character set contains 95 printable characters and 33 control characters, giving a total of 128 characters. Their code points are integers ranging from 0 to 127, so each code point fits in 7 bits in binary format.
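The counts above can be checked with a short Python sketch. It assumes the usual split of the ASCII range: control characters at code points 0 to 31 plus 127 (DEL), and printable characters from 32 (space) to 126 (~). The variable names are only illustrative.

```python
# Code points 0..127 cover the whole ASCII character set.
code_points = range(128)

# Control characters: 0..31 plus 127 (DEL).
control = [cp for cp in code_points if cp < 32 or cp == 127]

# Printable characters: 32 (space) through 126 (~).
printable = [cp for cp in code_points if 32 <= cp <= 126]

print(len(control), len(printable), len(control) + len(printable))  # 33 95 128

# The largest code point, 127, needs exactly 7 bits.
print(max(code_points).bit_length())  # 7
```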

The ASCII encoding is simple: each character is mapped to 1 byte, with the leading bit set to 0 and the other 7 bits representing the character's code point as an integer.
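As a small illustration of this mapping, the following Python sketch encodes a sample string with the built-in "ascii" codec and prints each byte in binary. The sample text "Hi!" is an arbitrary choice made for this example.

```python
text = "Hi!"

# str.encode("ascii") maps each character to one byte holding its code point.
encoded = text.encode("ascii")

print(list(encoded))                        # [72, 105, 33]
print([format(b, "08b") for b in encoded])  # ['01001000', '01101001', '00100001']

# The leading (most significant) bit of every encoded byte is 0.
print(all(b & 0x80 == 0 for b in encoded))  # True
```

Characters outside the 0 to 127 range cannot be represented this way; in Python, calling encode("ascii") on such text raises UnicodeEncodeError.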

Here is a picture of an ASCII code chart:

[Figure: ASCII Code Chart]
