Unicode Tutorials - Herong's Tutorial Notes
Dr. Herong Yang, Version 5.00

Supported Save and Open File Formats

This section provides a quick summary of the Unicode file formats that Word can save and open. Word saves Unicode text files with the BOM (Byte Order Mark) character prepended, and it opens such files correctly. Word can also open Unicode files that lack the BOM character, although for UTF-16 files the correct encoding must be selected manually.

We have now learned that Word can save Unicode text files in 3 encoding formats:

  • Unicode (UTF-8) format - Text files saved as UTF-8 byte sequences with the BOM, 0xEFBBBF, prepended.
  • Unicode (Big-Endian) format - Text files saved as UTF-16 byte sequences in Big-Endian order with the BOM, 0xFEFF, prepended.
  • Unicode format - Text files saved as UTF-16 byte sequences in Little-Endian order with the BOM, 0xFFFE, prepended.
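
The byte layout of these three save formats can be illustrated outside of Word. The following is a minimal Python sketch, not Word's own implementation: the helper name save_with_bom and the mapping from Word's option names to Python codecs are assumptions made for illustration only.

```python
import codecs

def save_with_bom(text: str, word_option: str) -> bytes:
    """Hypothetical helper: build the bytes of a BOM-prefixed text file."""
    # Assumed mapping from Word's save option names to (BOM, Python codec).
    options = {
        "Unicode (UTF-8)": (codecs.BOM_UTF8, "utf-8"),
        "Unicode (Big-Endian)": (codecs.BOM_UTF16_BE, "utf-16-be"),
        "Unicode": (codecs.BOM_UTF16_LE, "utf-16-le"),
    }
    bom, codec = options[word_option]
    # The "-be" and "-le" codecs never emit a BOM themselves,
    # so the BOM is prepended explicitly.
    return bom + text.encode(codec)

for option in ("Unicode (UTF-8)", "Unicode (Big-Endian)", "Unicode"):
    data = save_with_bom("A", option)
    print(f"{option}: {data.hex()}")
# Unicode (UTF-8): efbbbf41
# Unicode (Big-Endian): feff0041
# Unicode: fffe4100
```

In each case the first 2 or 3 bytes are the BOM, followed by the encoded letter "A".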

Word can open Unicode text files in 6 encoding formats:

  • UTF-8 format (without BOM) - Text files opened with encoding format automatically detected.
  • UTF-8 with BOM format - Text files opened with encoding format automatically detected.
  • UTF-16 (Big-Endian with BOM) - Text files opened with encoding format automatically detected.
  • UTF-16 (Little-Endian with BOM) - Text files opened with encoding format automatically detected.
  • UTF-16BE format (without BOM) - Text files can be opened if you select the "Unicode (Big-Endian)" encoding option manually.
  • UTF-16LE format (without BOM) - Text files can be opened if you select the "Unicode" encoding option manually.
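
The open behavior summarized above can be approximated with a BOM check. The following Python sketch is only an illustration under stated assumptions: Word's actual detection logic is not documented here, and the function names detect_bom and open_text, as well as the UTF-8 fallback for files without a BOM, are hypothetical.

```python
import codecs
from typing import Optional

def detect_bom(data: bytes) -> Optional[str]:
    """Return a Python codec name if the data starts with a known BOM."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # decodes UTF-8 and strips the BOM
    if data.startswith(codecs.BOM_UTF16_BE) or data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16"      # reads the BOM to pick the byte order
    return None

def open_text(data: bytes, manual_encoding: Optional[str] = None) -> str:
    """Decode file bytes, preferring the BOM over a manual selection."""
    encoding = detect_bom(data)
    if encoding is None:
        # No BOM: use the manually selected encoding if given,
        # otherwise assume UTF-8 (an assumption of this sketch).
        encoding = manual_encoding or "utf-8"
    return data.decode(encoding)

# Files with a BOM are handled automatically.
print(open_text(b"\xef\xbb\xbfA"))         # A
print(open_text(b"\xfe\xff\x00A"))         # A
print(open_text(b"\xff\xfeA\x00"))         # A
# UTF-16 files without a BOM need a manual choice.
print(open_text(b"\x00A", "utf-16-be"))    # A
print(open_text(b"A\x00", "utf-16-le"))    # A
```

The manual_encoding argument plays the role of the encoding option selected by hand in Word's file conversion dialog.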

Sections in This Chapter

What Is Microsoft Word?

Opening UTF-8 Text Files

Opening UTF-16BE Text Files

Opening UTF-16LE Text Files

Saving Files in "Unicode (UTF-8)" Option

Saving Files in "Unicode (Big-Endian)" Option

Saving Files in Unicode Option

Supported Save and Open File Formats

Dr. Herong Yang, updated in 2009