Unicode Tutorials - Herong's Tutorial Notes
Dr. Herong Yang, Version 5.00

Using Notepad as a Unicode Text Editor

This chapter provides notes and tutorial examples on using Notepad as a Unicode text editor. Topics include opening Unicode text files in 3 encodings: UTF-8, UTF-16BE, and UTF-16LE; and saving and opening Unicode text files with the BOM character.

What Is Notepad?

Opening UTF-8 Text Files

Opening UTF-16BE Text Files

Opening UTF-16LE Text Files

Saving Files in UTF-8 Option

Byte Order Mark (BOM) - FEFF - EFBBBF

Saving Files in "Unicode Big Endian" Option

Saving Files in "Unicode" Option

Supported Save and Open File Formats

Conclusions:

  • Notepad can be used to edit Unicode text files.
  • Notepad allows you to save Unicode text files in UTF-8 encoding, but it prepends the BOM (Byte Order Mark) character to the file. This is unnecessary, because UTF-8 has no byte order ambiguity to resolve.
  • Notepad allows you to save Unicode text files in UTF-16 encoding in 2 formats: Big-Endian with BOM and Little-Endian with BOM.
  • Notepad can open Unicode text files in UTF-8 and UTF-16LE encodings without the BOM character.
  • Notepad cannot correctly open Unicode text files in UTF-16BE encoding format.
  • The BOM character is the "ZERO WIDTH NO-BREAK SPACE" character, U+FEFF, in the Unicode character set.
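
The BOM byte sequences mentioned in the conclusions can be checked with a short script. The following is a minimal sketch in Python, not part of the original tutorial. The helper name decode_text and the fallback to UTF-8 when no BOM is found are assumptions made for illustration only. The sketch does not try to recognize UTF-32 BOMs, so a UTF-32LE file (which starts with FF FE 00 00) would be misread as UTF-16LE.

```python
import codecs
import unicodedata

BOM = "\ufeff"  # the BOM character, U+FEFF

# Official Unicode name of U+FEFF.
bom_name = unicodedata.name(BOM)

# Bytes produced when U+FEFF is encoded in each of the 3 encodings.
bom_bytes = {
    "UTF-8": BOM.encode("utf-8"),
    "UTF-16BE": BOM.encode("utf-16-be"),
    "UTF-16LE": BOM.encode("utf-16-le"),
}

# BOM byte sequences paired with the codec to use once the BOM is removed.
_KNOWN_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def decode_text(data: bytes, default: str = "utf-8") -> str:
    """Decode bytes, using a leading BOM (if any) to pick the encoding.

    Hypothetical helper: if no BOM is present, the 'default' encoding
    is assumed, which is only a guess about the file's real encoding.
    """
    for bom, encoding in _KNOWN_BOMS:
        if data.startswith(bom):
            # Drop the BOM bytes so U+FEFF does not appear in the text.
            return data[len(bom):].decode(encoding)
    return data.decode(default)


if __name__ == "__main__":
    print(bom_name)
    for label, raw in bom_bytes.items():
        print(label, raw.hex().upper())
```

For example, decode_text(b"\xfe\xff\x00H\x00i") returns "Hi" because the leading FE FF bytes select UTF-16BE. Without those two bytes the same data would be decoded with the assumed default, which shows why a file without a BOM forces an editor to guess its encoding.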

Dr. Herong Yang, updated in 2009