JDK (Java Development Kit) Tutorials
Dr. Herong Yang, Version 5.00

Running EncodingSampler.java with ISO-8859-1 and US-ASCII

This section provides a tutorial example on how to run the character encoding sample program with ISO-8859-1 and US-ASCII encodings. Character encoding US-ASCII is a subset of ISO-8859-1.

Now, let's run my sample program, EncodingSampler.java, again with another encoding, ISO-8859-1:

ISO-8859-1 encoding:
Char, String, Writer, Charset, Encoder
0000, 00, 00, 00, 00
003F, 3F, 3F, 3F, 3F
0040, 40, 40, 40, 40
007F, 7F, 7F, 7F, 7F
0080, 80, 80, 80, 80
00BF, BF, BF, BF, BF
00C0, C0, C0, C0, C0
00FF, FF, FF, FF, FF
0100, 3F, 3F, 3F, 00
3FFF, 3F, 3F, 3F, 00
4000, 3F, 3F, 3F, 00
7FFF, 3F, 3F, 3F, 00
8000, 3F, 3F, 3F, 00
BFFF, 3F, 3F, 3F, 00
C000, 3F, 3F, 3F, 00
EFFF, 3F, 3F, 3F, 00
F000, 3F, 3F, 3F, 00
FFFF, 3F, 3F, 3F, 00
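
EncodingSampler.java itself is not listed in this section. The following is only a minimal sketch of how output rows like the ones above could be produced, assuming the four columns correspond to String.getBytes(), OutputStreamWriter, Charset.encode() and CharsetEncoder.encode(). The class name FourWaysSketch, its helper methods, and the choice to print "00" when the encoder rejects a character are assumptions made for illustration, not the original program.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the original EncodingSampler.java.
class FourWaysSketch {
    // Formats bytes as uppercase hex digits, two per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Copies the remaining bytes of a buffer and formats them as hex.
    static String hex(ByteBuffer buffer) {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return hex(bytes);
    }

    // Way 1: String.getBytes() replaces unmappable characters.
    static String viaString(char c, Charset cs) {
        return hex(String.valueOf(c).getBytes(cs));
    }

    // Way 2: OutputStreamWriter also replaces unmappable characters.
    static String viaWriter(char c, Charset cs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (OutputStreamWriter writer = new OutputStreamWriter(out, cs)) {
            writer.write(c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hex(out.toByteArray());
    }

    // Way 3: Charset.encode() is a convenience method that replaces too.
    static String viaCharset(char c, Charset cs) {
        return hex(cs.encode(String.valueOf(c)));
    }

    // Way 4: a fresh CharsetEncoder reports unmappable characters by
    // throwing. Printing "00" in that case is an assumption made here
    // to mirror the Encoder column in the listing.
    static String viaEncoder(char c, Charset cs) {
        try {
            return hex(cs.newEncoder().encode(CharBuffer.wrap(String.valueOf(c))));
        } catch (CharacterCodingException e) {
            return "00";
        }
    }

    // Builds one output row in the same layout as the listing.
    static String row(char c, Charset cs) {
        return String.format("%04X, %s, %s, %s, %s", (int) c,
            viaString(c, cs), viaWriter(c, cs), viaCharset(c, cs), viaEncoder(c, cs));
    }

    public static void main(String[] args) {
        char[] samples = {0x007F, 0x0080, 0x00FF, 0x0100};
        Charset[] charsets = {StandardCharsets.ISO_8859_1, StandardCharsets.US_ASCII};
        for (Charset cs : charsets) {
            System.out.println(cs.name() + " encoding:");
            System.out.println("Char, String, Writer, Charset, Encoder");
            for (char c : samples) {
                System.out.println(row(c, cs));
            }
        }
    }
}
```

The first three ways substitute the replacement byte 0x3F ("?") for a character the encoding cannot represent, while a newly created CharsetEncoder reports the problem as an exception unless it is configured otherwise.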

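Based on this test result, ISO-8859-1 appears to be the same as CP1252. The two encodings are not identical, however: CP1252 assigns printable characters, such as the euro sign, to byte values 0x80 - 0x9F, where ISO-8859-1 has control codes. The sample characters used here do not reveal that difference.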
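
A comparison limited to a few sample characters can hide differences between two encodings. As a small sketch, the euro sign (U+20AC) is one character that CP1252 can encode and ISO-8859-1 cannot. The class name Cp1252VersusLatin1 and the helper firstByte() are made up for this illustration, and the code assumes the running JDK supports the charset name "windows-1252".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch comparing two encodings on selected characters.
class Cp1252VersusLatin1 {
    // Encodes one character and returns its first byte as an unsigned value.
    static int firstByte(char c, Charset cs) {
        return String.valueOf(c).getBytes(cs)[0] & 0xFF;
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        // Assumption: "windows-1252" is available in this JDK.
        Charset cp1252 = Charset.forName("windows-1252");

        System.out.println("Char, ISO-8859-1, CP1252");
        char[] samples = {'\u00FF', '\u20AC'};
        for (char c : samples) {
            System.out.printf("%04X, %02X, %02X%n",
                (int) c, firstByte(c, latin1), firstByte(c, cp1252));
        }
    }
}
```

For U+00FF both encodings produce FF. For U+20AC, ISO-8859-1 falls back to 3F ("?"), while CP1252 produces 80.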

What is the difference between ISO-8859-1 and US-ASCII? Here is the test result of US-ASCII:

US-ASCII encoding:
Char, String, Writer, Charset, Encoder
0000, 00, 00, 00, 00
003F, 3F, 3F, 3F, 3F
0040, 40, 40, 40, 40
007F, 7F, 7F, 7F, 7F
0080, 3F, 3F, 3F, 00
00BF, 3F, 3F, 3F, 00
00C0, 3F, 3F, 3F, 00
00FF, 3F, 3F, 3F, 00
0100, 3F, 3F, 3F, 00
3FFF, 3F, 3F, 3F, 00
4000, 3F, 3F, 3F, 00
7FFF, 3F, 3F, 3F, 00
8000, 3F, 3F, 3F, 00
BFFF, 3F, 3F, 3F, 00
C000, 3F, 3F, 3F, 00
EFFF, 3F, 3F, 3F, 00
F000, 3F, 3F, 3F, 00
FFFF, 3F, 3F, 3F, 00

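The result shows that encoding US-ASCII has a smaller character set: 0x0000 - 0x007F. Encoding ISO-8859-1 has a larger character set: 0x0000 - 0x00FF. In other words, US-ASCII is a subset of ISO-8859-1. Any character outside the supported range comes out as 0x3F, the question mark "?", in the String, Writer and Charset columns.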
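
The ranges can also be checked over every char value instead of a handful of samples. The sketch below uses CharsetEncoder.canEncode() for that purpose. The class name EncodableRange and its methods are hypothetical names chosen for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch that measures which char values a charset can encode.
class EncodableRange {
    // Counts the char values in 0x0000 - 0xFFFF that the charset can encode.
    static int countEncodable(Charset cs) {
        CharsetEncoder encoder = cs.newEncoder();
        int count = 0;
        for (int c = 0; c <= 0xFFFF; c++) {
            if (encoder.canEncode((char) c)) {
                count++;
            }
        }
        return count;
    }

    // Returns the highest encodable char value, or -1 if there is none.
    static int highestEncodable(Charset cs) {
        CharsetEncoder encoder = cs.newEncoder();
        for (int c = 0xFFFF; c >= 0; c--) {
            if (encoder.canEncode((char) c)) {
                return c;
            }
        }
        return -1;
    }

    // Checks that every char the first charset can encode is also
    // encodable by the second one, and to the same bytes.
    static boolean isSubset(Charset small, Charset large) {
        CharsetEncoder encoder = small.newEncoder();
        for (int c = 0; c <= 0xFFFF; c++) {
            if (!encoder.canEncode((char) c)) {
                continue;
            }
            String s = String.valueOf((char) c);
            if (!java.util.Arrays.equals(s.getBytes(small), s.getBytes(large))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Charset ascii = StandardCharsets.US_ASCII;
        Charset latin1 = StandardCharsets.ISO_8859_1;
        System.out.printf("US-ASCII: %d characters, highest %04X%n",
            countEncodable(ascii), highestEncodable(ascii));
        System.out.printf("ISO-8859-1: %d characters, highest %04X%n",
            countEncodable(latin1), highestEncodable(latin1));
        System.out.println("US-ASCII subset of ISO-8859-1: " + isSubset(ascii, latin1));
    }
}
```

Running it reports 128 characters up to 007F for US-ASCII and 256 characters up to 00FF for ISO-8859-1, which agrees with the ranges read from the two listings.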

Last update: 2006.

Dr. Herong Yang, updated in 2008