This section provides a tutorial example on how to save text files with Notepad by selecting the UTF-8 encoding option in the "Save As" dialog box.
After testing the Notepad open function, I now want to test its save function with the UTF-8 encoding.
1. Run Notepad and open hello.utf-8 correctly with the UTF-8 encoding option selected.
2. Click the File > "Save As" menu. The "Save As" dialog box comes up.
3. Enter notepad_utf-8 as the new file name and select the UTF-8 option in the Encoding field.
4. Click the Save button. Notepad saves the text to a new file named notepad_utf-8.txt.
5. To see how my text is saved by Notepad, I need to run my hex dump program on notepad_utf-8.txt:
C:\herong\unicode>java HexWriter notepad_utf-8.txt notepad_utf-8.hex
Number of input bytes: 107
C:\herong\unicode>type notepad_utf-8.hex
EFBBBF48656C6C6F20636F6D70757465
7221202D20456E676C6973680D0AE794
B5E88491E4BDA0E5A5BDEFBC81202D20
53696D706C6966696564204368696E65
73650D0AE99BBBE885A6E4BDA0E5A5BD
EFB997202D20547261646974696F6E61
6C204368696E6573650D0A
6. To compare the UTF-8 text file created by Notepad with my original UTF-8 file, I need to run my hex dump program on hello.utf-8:
C:\herong\unicode>java HexWriter hello.utf-8 hello_utf-8.hex
Number of input bytes: 104
C:\herong\unicode>type hello_utf-8.hex
48656C6C6F20636F6D70757465722120
2D20456E676C6973680D0AE794B5E884
91E4BDA0E5A5BDEFBC81202D2053696D
706C6966696564204368696E6573650D
0AE99BBBE885A6E4BDA0E5A5BDEFB997
202D20547261646974696F6E616C2043
68696E6573650D0A
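The HexWriter program used in the steps above is not listed in this section. For readers who want to reproduce the dumps, here is a minimal sketch of a similar tool. The class name HexDumpSketch and its toHex method are hypothetical; the output layout (uppercase hex, 16 bytes per line) is assumed from the dumps shown above, and this sketch prints to the console instead of writing to an output file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical stand-in for a hex dump tool; not the author's HexWriter.
public class HexDumpSketch {

    // Formats bytes as uppercase hex digits, 16 bytes per line.
    public static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            // Mask with 0xFF so negative byte values print as two hex digits.
            sb.append(String.format("%02X", data[i] & 0xFF));
            // Break the line after every 16th byte and after the last byte.
            if ((i + 1) % 16 == 0 || i == data.length - 1) {
                sb.append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java HexDumpSketch <input file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("Number of input bytes: " + data.length);
        System.out.print(toHex(data));
    }
}
```

Assuming the class is compiled in the current directory, it could be run as "java HexDumpSketch notepad_utf-8.txt".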
The UTF-8 text file saved by Notepad is identical to my original UTF-8 text file except for 3 extra bytes at the beginning, "EFBBBF". This accounts for the difference in size: 107 bytes versus 104 bytes.
If we ignore "EFBBBF", we can say that Notepad saves UTF-8 text files correctly.
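The comparison above was done by reading the two hex dumps side by side. It can also be checked programmatically. The following is a minimal sketch, assuming both files fit in memory; the class name PrefixCompareSketch and its method names are hypothetical and are not part of the original tutorial programs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: compares two files while ignoring a leading EF BB BF.
public class PrefixCompareSketch {

    // Returns true if the data starts with the 3 bytes EF BB BF.
    public static boolean hasPrefix(byte[] data) {
        return data.length >= 3
            && (data[0] & 0xFF) == 0xEF
            && (data[1] & 0xFF) == 0xBB
            && (data[2] & 0xFF) == 0xBF;
    }

    // Returns the data without a leading EF BB BF, or unchanged if absent.
    public static byte[] stripPrefix(byte[] data) {
        return hasPrefix(data) ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    // Returns true if the two byte arrays match once any leading EF BB BF is removed.
    public static boolean sameIgnoringPrefix(byte[] a, byte[] b) {
        return Arrays.equals(stripPrefix(a), stripPrefix(b));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java PrefixCompareSketch <file 1> <file 2>");
            return;
        }
        byte[] first = Files.readAllBytes(Paths.get(args[0]));
        byte[] second = Files.readAllBytes(Paths.get(args[1]));
        System.out.println(args[0] + " starts with EFBBBF: " + hasPrefix(first));
        System.out.println(args[1] + " starts with EFBBBF: " + hasPrefix(second));
        System.out.println("Same content ignoring EFBBBF: "
            + sameIgnoringPrefix(first, second));
    }
}
```

Assuming the class is compiled in the current directory, it could be run as "java PrefixCompareSketch notepad_utf-8.txt hello.utf-8".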
So what is this "EFBBBF" and why is it added? See the next section for a brief explanation.