Unicode Tutorials - Herong's Tutorial Notes
Dr. Herong Yang, Version 5.00

http://www.herongyang.com/Unicode

Copyright © 2009 by Dr. Herong Yang. All rights reserved.

This free Unicode tutorial book is a collection of notes and sample code written by the author while he was learning Unicode himself. It is intended as a tutorial guide for beginners. Topics include ASCII, character set, encoding, GB, GB18030, GB2312, GBK, ISO-8859, JDK, JIS, UTF-8, Unicode...

Table of Contents

About This Book

Character Sets and Encodings

What Is Character Set?

Commonly Used Character Sets and Encodings

ASCII Character Set and Encoding

What Is ASCII?

Listing of ASCII Characters and Encoded Bytes

GB2312 Character Set and Encoding

GB2312 Character Set for Chinese Characters

GB2312 Encoding for GB2312 Character Set

Relation of GB2312 and Unicode

GB18030 Character Set and Encoding

History of GB Character Sets

GB18030 Encoding for GB18030 Character Set

JIS X0208 Character Set and Encodings

JIS X0208 Character Set for Japanese Characters

JIS X0208 Character Code Values

EUC-JP Encoding

ISO-2022-JP Encoding

Shift-JIS Encoding

Unicode Character Set

What Is Unicode?

Examples of Unicode Characters

Unique Features of Unicode

Unicode Standard Releases

Code Point Blocks

UTF-8 (Unicode Transformation Format - 8-Bit)

UTF-8 Encoding

UTF-8 Encoding Algorithm

Features of UTF-8 Encoding

UTF-16, UTF-16BE and UTF-16LE Encodings

What Are Paired Surrogates?

UTF-16 Encoding

UTF-16BE Encoding

UTF-16LE Encoding

UTF-32, UTF-32BE and UTF-32LE Encodings

UTF-32 Encoding

UTF-32BE Encoding

UTF-32LE Encoding

Character Encoding in Java

What Is Character Encoding?

Supported Character Encodings in JDK 1.4.1

EncodingSampler.java - Testing encode() Methods

Examples of CP1252 and ISO-8859-1 Encodings

Examples of US-ASCII, UTF-8, UTF-16 and UTF-16BE Encodings

Examples of GB18030 Encoding

Testing decode() Methods

Character Set Encoding Maps

Character Set Encoding Map Analyzer

Character Set Encoding Maps - US-ASCII and ISO-8859-1/Latin 1

Character Set Encoding Maps - CP1252/Windows-1252

Character Set Encoding Maps - Unicode UTF-8

Character Set Encoding Maps - Unicode UTF-16, UTF-16LE, UTF-16BE

Character Counter Program for Any Given Encoding

Character Set Encoding Comparison

Encoding Conversion Programs for Encoded Text Files

\uxxxx - Entering Unicode Data in Java Programs

HexWriter.java - Converting Encoded Byte Sequences to Hex Values

EncodingConverter.java - Encoding Conversion Sample Program

Viewing Encoded Text Files in Web Browsers

Unicode Signs in Different Encodings

Using Notepad as a Unicode Text Editor

What Is Notepad?

Opening UTF-8 Text Files

Opening UTF-16BE Text Files

Opening UTF-16LE Text Files

Saving Files in UTF-8 Option

Byte Order Mark (BOM) - FEFF - EFBBBF

Saving Files in "Unicode Big Endian" Option

Saving Files in "Unicode" Option

Supported Save and Open File Formats

Using Microsoft Word as a Unicode Text Editor

What Is Microsoft Word?

Opening UTF-8 Text Files

Opening UTF-16BE Text Files

Opening UTF-16LE Text Files

Saving Files in "Unicode (UTF-8)" Option

Saving Files in "Unicode (Big-Endian)" Option

Saving Files in Unicode Option

Supported Save and Open File Formats

Using Microsoft Excel as a Unicode Text Editor

What Is Microsoft Excel?

Opening UTF-8 Text Files

Opening UTF-16BE Text Files

Opening UTF-16LE Text Files

Saving UTF-8 Text Files

Saving Files in "Unicode Text (*.txt)" Option

Opening UTF-16 Text Files

Supported Save and Open File Formats

References

Printable Copy - PDF Version

Keywords: Unicode, Universal, Character, Encoding, Tutorial, Book

Previous Version: http://www.herongyang.com/Unicode/index2.html

Dr. Herong Yang, updated in 2009