Unicode Tutorials - Herong's Tutorial Notes
Dr. Herong Yang, Version 5.00

Examples of CP1252 and ISO-8859-1 Encodings

This section provides examples of encoded byte sequences of the JVM default encoding, CP1252 encoding and ISO-8859-1 encoding. All 3 seem to be behaving in the same way.

Running the testing program, EncodingSampler.java, provided in the previous section without any argument will use the JVM's default encoding:

Default (Cp1252) encoding:
Char, String, Writer, Charset, Encoder
0000, 00, 00, 00, 00
003F, 3F, 3F, 3F, 3F
0040, 40, 40, 40, 40
007F, 7F, 7F, 7F, 7F
0080, 3F, 3F, 3F, 00
00BF, BF, BF, BF, BF
00C0, C0, C0, C0, C0
00FF, FF, FF, FF, FF
0100, 3F, 3F, 3F, 00
3FFF, 3F, 3F, 3F, 00
4000, 3F, 3F, 3F, 00
7FFF, 3F, 3F, 3F, 00
8000, 3F, 3F, 3F, 00
BFFF, 3F, 3F, 3F, 00
C000, 3F, 3F, 3F, 00
EFFF, 3F, 3F, 3F, 00
F000, 3F, 3F, 3F, 00
FFFF, 3F, 3F, 3F, 00

The results shows that:

  • The default encoding of the String class seems to be the same as OutputStreamWriter: Cp1252.
  • There are a number of characters that can not be encoded by Cp1252. The String, OutputStreamWriter, and Charset classes are returning 0x3F for those non-encodable characters.
  • It's obvious that Cp1252 works on a character set in the 0x0000 - 0x00FF range.

Running the program again with 'CP1252' as argument should give us the same output as the previous run:

CP1252 encoding:
Char, String, Writer, Charset, Encoder
0000, 00, 00, 00, 00
003F, 3F, 3F, 3F, 3F
0040, 40, 40, 40, 40
007F, 7F, 7F, 7F, 7F
0080, 3F, 3F, 3F, 00
00BF, BF, BF, BF, BF
00C0, C0, C0, C0, C0
00FF, FF, FF, FF, FF
0100, 3F, 3F, 3F, 00
3FFF, 3F, 3F, 3F, 00
4000, 3F, 3F, 3F, 00
7FFF, 3F, 3F, 3F, 00
8000, 3F, 3F, 3F, 00
BFFF, 3F, 3F, 3F, 00
C000, 3F, 3F, 3F, 00
EFFF, 3F, 3F, 3F, 00
F000, 3F, 3F, 3F, 00
FFFF, 3F, 3F, 3F, 00

Let's try another encoding, ISO-8859-1:

ISO-8859-1 encoding:
Char, String, Writer, Charset, Encoder
0000, 00, 00, 00, 00
003F, 3F, 3F, 3F, 3F
0040, 40, 40, 40, 40
007F, 7F, 7F, 7F, 7F
0080, 80, 80, 80, 80
00BF, BF, BF, BF, BF
00C0, C0, C0, C0, C0
00FF, FF, FF, FF, FF
0100, 3F, 3F, 3F, 00
3FFF, 3F, 3F, 3F, 00
4000, 3F, 3F, 3F, 00
7FFF, 3F, 3F, 3F, 00
8000, 3F, 3F, 3F, 00
BFFF, 3F, 3F, 3F, 00
C000, 3F, 3F, 3F, 00
EFFF, 3F, 3F, 3F, 00
F000, 3F, 3F, 3F, 00
FFFF, 3F, 3F, 3F, 00

It appears to be the same as CP1252.

Sections in This Chapter

What Is Character Encoding?

Supported Character Encodings in JDK 1.4.1

EncodingSampler.java - Testing encode() Methods

Examples of CP1252 and ISO-8859-1 Encodings

Examples of US-ASCII, UTF-8, UTF-16 and UTF-16BE Encodings

Examples of GB18030 Encoding

Testing decode() Methods

Dr. Herong Yang, updated in 2009
Examples of CP1252 and ISO-8859-1 Encodings