Unicode Tutorials - Herong's Tutorial Notes
Dr. Herong Yang, Version 5.00

Supported Character Encodings in JDK 1.4.1

This section provides a list of the character encodings supported by JDK 1.4.1.

JDK uses the java.nio.charset.Charset class to represent a character encoding. Each Charset object offers both an encode() method and a decode() method. The class also provides a static method, availableCharsets(), which returns all supported encodings as a sorted map from canonical name to Charset object. Here is a program to display all the supported character encodings:

/**
 * Encodings.java
 * Copyright (c) 2002 by Dr. Herong Yang
 */
import java.nio.charset.*;
import java.util.*;
class Encodings {
   public static void main(String[] arg) {
      // Map of canonical name -> Charset, sorted by name
      SortedMap m = Charset.availableCharsets();
      Set k = m.keySet();
      System.out.println("Canonical name, Display name,"
         +" Can encode, Aliases");
      Iterator i = k.iterator();
      while (i.hasNext()) {
         String n = (String) i.next();
         Charset e = (Charset) m.get(n);
         // Human-readable name for the default locale
         String d = e.displayName();
         // True if this charset supports encoding
         boolean c = e.canEncode();
         System.out.print(n+", "+d+", "+c);
         // Append every alias registered for this charset
         Set s = e.aliases();
         Iterator j = s.iterator();
         while (j.hasNext()) {
            String a = (String) j.next();
            System.out.print(", "+a);
         }
         System.out.println("");
      }
   }
}

Here is the output:

Canonical name, Display name, Can encode, Aliases
Big5, Big5, true, csBig5
Big5-HKSCS, Big5-HKSCS, true, big5-hkscs, Big5_HKSCS, big5hkscs
EUC-CN, EUC-CN, true
EUC-JP, EUC-JP, true, eucjis, x-eucjp, csEUCPkdFmtjapanese, eucjp, 
   Extended_UNIX_Code_Packed_Format_for_Japanese, x-euc-jp, euc_jp
euc-jp-linux, euc-jp-linux, true, euc_jp_linux
EUC-KR, EUC-KR, true, ksc5601, 5601, ksc5601_1987, ksc_5601, 
   ksc5601-1987, euc_kr, ks_c_5601-1987, euckr, csEUCKR
EUC-TW, EUC-TW, true, cns11643, euc_tw, euctw
GB18030, GB18030, true, gb18030-2000
GBK, GBK, true, GBK
ISCII91, ISCII91, true, iscii, ST_SEV_358-88, iso-ir-153, 
   csISO153GOST1976874
ISO-2022-CN-CNS, ISO-2022-CN-CNS, true, ISO2022CN_CNS
ISO-2022-CN-GB, ISO-2022-CN-GB, true, ISO2022CN_GB
ISO-2022-KR, ISO-2022-KR, true, ISO2022KR, csISO2022KR
ISO-8859-1, ISO-8859-1, true, iso-ir-100, 8859_1, ISO_8859-1, 
   ISO8859_1, 819, csISOLatin1, IBM-819, ISO_8859-1:1987, latin1, 
   cp819, ISO8859-1, IBM819, ISO_8859_1, l1
ISO-8859-13, ISO-8859-13, true
ISO-8859-15, ISO-8859-15, true, 8859_15, csISOlatin9, IBM923, cp923,
   923, L9, IBM-923, ISO8859-15, LATIN9, ISO_8859-15, LATIN0, 
   csISOlatin0, ISO8859_15_FDIS, ISO-8859-15
ISO-8859-2, ISO-8859-2, true
ISO-8859-3, ISO-8859-3, true
ISO-8859-4, ISO-8859-4, true
ISO-8859-5, ISO-8859-5, true
ISO-8859-6, ISO-8859-6, true
ISO-8859-7, ISO-8859-7, true
ISO-8859-8, ISO-8859-8, true
ISO-8859-9, ISO-8859-9, true
JIS0201, JIS0201, true, X0201, JIS_X0201, csHalfWidthKatakana
JIS0208, JIS0208, true, JIS_C6626-1983, csISO87JISX0208, x0208, 
   JIS_X0208-1983, iso-ir-87
JIS0212, JIS0212, true, jis_x0212-1990, x0212, iso-ir-159, 
   csISO159JISC02121990
Johab, Johab, true, ms1361, ksc5601_1992, ksc5601-1992
KOI8-R, KOI8-R, true
Shift_JIS, Shift_JIS, true, shift-jis, x-sjis, ms_kanji, shift_jis, 
   csShiftJIS, sjis, pck
TIS-620, TIS-620, true
US-ASCII, US-ASCII, true, IBM367, ISO646-US, ANSI_X3.4-1986, cp367,
   ASCII, iso_646.irv:1983, 646, us, iso-ir-6, csASCII, 
   ANSI_X3.4-1968, ISO_646.irv:1991
UTF-16, UTF-16, true, UTF_16
UTF-16BE, UTF-16BE, true, X-UTF-16BE, UTF_16BE, ISO-10646-UCS-2
UTF-16LE, UTF-16LE, true, UTF_16LE, X-UTF-16LE
UTF-8, UTF-8, true, UTF8
windows-1250, windows-1250, true
windows-1251, windows-1251, true
windows-1252, windows-1252, true, cp1252
windows-1253, windows-1253, true
windows-1254, windows-1254, true
windows-1255, windows-1255, true
windows-1256, windows-1256, true
windows-1257, windows-1257, true
windows-1258, windows-1258, true
windows-936, windows-936, true, ms936, ms_936
windows-949, windows-949, true, ms_949, ms949
windows-950, windows-950, true, ms950
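
The aliases in the output above can be used anywhere a charset name is expected. Here is a minimal sketch, not part of the original tutorial, showing how Charset.forName() resolves an alias to its canonical name, and how encode() and decode() work together on a single Charset object. The class name CharsetLookup and its two helper methods are hypothetical names chosen for this illustration. The sketch assumes that "latin1" and "UTF8" are registered as aliases, as they are in the list above; the alias sets may differ on other JDK versions.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

// Hypothetical helper class for illustration only
class CharsetLookup {
   // Resolve a canonical name or an alias to the canonical name.
   // Charset.forName() throws UnsupportedCharsetException
   // if the name is not known to this JDK.
   public static String canonicalName(String nameOrAlias) {
      return Charset.forName(nameOrAlias).name();
   }

   // Encode a string into bytes with the named charset,
   // then decode those bytes back into a string.
   public static String roundTrip(String text, String charsetName) {
      Charset cs = Charset.forName(charsetName);
      ByteBuffer bytes = cs.encode(text);
      CharBuffer chars = cs.decode(bytes);
      return chars.toString();
   }

   public static void main(String[] arg) {
      // Assumes "latin1" is an alias of ISO-8859-1
      System.out.println(canonicalName("latin1"));
      // Assumes "UTF8" is an alias of UTF-8
      System.out.println(canonicalName("UTF8"));
      // Check a name without risking an exception
      System.out.println(Charset.isSupported("UTF-8"));
      // ASCII letters survive a UTF-16BE round trip unchanged
      System.out.println(roundTrip("Hello", "UTF-16BE"));
   }
}
```

With the aliases listed above, this should print ISO-8859-1, UTF-8, true and Hello, one per line. Charset.isSupported() is a convenient guard when the charset name comes from user input or a configuration file.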

Dr. Herong Yang, updated in 2009