Unicode Tutorials - Herong's Tutorial Examples
∟UTF-32, UTF-32BE and UTF-32LE Encodings
This chapter provides notes and tutorial examples on UTF-32, UTF-32BE and UTF-32LE encodings. Topics include the encoding and decoding logic of UTF-32, UTF-32BE and UTF-32LE encodings, and an explanation of the use of the BOM (Byte Order Mark).
UTF-32 Encoding
UTF-32BE Encoding
UTF-32LE Encoding
Conclusions:
- UTF-32, UTF-32BE and UTF-32LE encodings are all fixed-length 32-bit (4-byte) Unicode character encodings.
- Output byte streams of UTF-32 encoding may have 3 valid formats: Big-Endian without BOM,
Big-Endian with BOM, and Little-Endian with BOM.
- UTF-32BE encoding is identical to the Big-Endian without BOM format of UTF-32 encoding.
- UTF-32LE encoding is identical to the Little-Endian with BOM format of UTF-32 encoding, except that the BOM is omitted.
- "Unicode Standard Annex #19 - UTF-32" at
unicode.org/reports/tr19/tr19-9.html
gives quick and precise definitions of UTF-32, UTF-32BE and UTF-32LE encodings.
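
The conclusions above can be illustrated with a minimal Python sketch. The helper names `utf32_encode` and `utf32_decode` are hypothetical and written only for this illustration; they are not part of any library. The sketch assumes the input contains only valid Unicode scalar values, and that a byte stream without a BOM is Big-Endian, as stated in the conclusions above.

```python
BOM = 0xFEFF  # Byte Order Mark code point


def utf32_encode(text, byteorder="big", with_bom=False):
    """Encode each code point as one fixed-length 4-byte unit (hypothetical helper)."""
    out = bytearray()
    if with_bom:
        out += BOM.to_bytes(4, byteorder)
    for ch in text:
        out += ord(ch).to_bytes(4, byteorder)
    return bytes(out)


def utf32_decode(data):
    """Decode a UTF-32 stream: honor a leading BOM, otherwise assume Big-Endian."""
    if len(data) % 4 != 0:
        raise ValueError("UTF-32 data length must be a multiple of 4 bytes")
    byteorder, start = "big", 0
    if data[:4] == b"\x00\x00\xfe\xff":    # BOM in Big-Endian byte order
        start = 4
    elif data[:4] == b"\xff\xfe\x00\x00":  # BOM in Little-Endian byte order
        byteorder, start = "little", 4
    return "".join(
        chr(int.from_bytes(data[i:i + 4], byteorder))
        for i in range(start, len(data), 4)
    )


# The three valid UTF-32 output formats for the same text:
be_no_bom = utf32_encode("A")                                    # same bytes as UTF-32BE
be_bom = utf32_encode("A", byteorder="big", with_bom=True)
le_bom = utf32_encode("A", byteorder="little", with_bom=True)

# UTF-32LE is the Little-Endian format with the BOM left out:
le_no_bom = utf32_encode("A", byteorder="little")
```

All three UTF-32 formats decode back to the same text with `utf32_decode`. The `le_no_bom` bytes should only be decoded as UTF-32LE, because a stream without a BOM carries no indication of its byte order.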