Unicode Tutorials - Herong's Tutorial Examples - v5.32, by Herong Yang
Unicode Signs in Different Encodings
This section provides a tutorial example on how to write sample programs to create some Unicode signs in various encodings and view them in a Web browser.
I wanted to try the utility programs from this chapter one more time, this time with some Unicode signs. So I copied UnicodeHello.java and modified it to create UnicodeSign.java:
/* UnicodeSign.java
 * Copyright (c) 2019 HerongYang.com. All Rights Reserved.
 *
 * This program is a simple tool to allow you to enter several lines of
 * text, and write them into a file using the specified encoding
 * (charset name). The input text lines use Java string conventions,
 * which allow you to enter ASCII characters directly, and any
 * non-ASCII characters with escape sequences.
 *
 * This version of the program writes out some interesting signs.
 */
import java.io.*;
class UnicodeSign {
   public static void main(String[] a) {
      // The following array contains the text to be saved into the
      // output file. To enter your own text, just replace this array.
      String[] text = {
         "U+005C(\\)REVERSE SOLIDUS", //\u005C is '\', cannot be entered directly
         "U+007E(\u007E)TILDE",
         "U+00A2(\u00A2)CENT SIGN",
         "U+00A3(\u00A3)POUND SIGN",
         "U+00A5(\u00A5)YEN SIGN",
         "U+00A6(\u00A6)BROKEN BAR",
         "U+00A7(\u00A7)SECTION SIGN",
         "U+00A9(\u00A9)COPYRIGHT SIGN",
         "U+00AC(\u00AC)NOT SIGN",
         "U+00AE(\u00AE)REGISTERED SIGN",
         "U+2022(\u2022)BULLET",
         "U+2023(\u2023)TRIANGULAR BULLET",
         "U+203B(\u203B)REFERENCE MARK",
         "U+2043(\u2043)HYPHEN BULLET",
         "U+FF04(\uFF04)FULLWIDTH DOLLAR SIGN",
         "U+FF05(\uFF05)FULLWIDTH PERCENT SIGN",
         "U+FF08(\uFF08)FULLWIDTH LEFT PARENTHESIS",
         "U+FF09(\uFF09)FULLWIDTH RIGHT PARENTHESIS",
         "U+FF10(\uFF10)FULLWIDTH DIGIT ZERO",
         "U+FF11(\uFF11)FULLWIDTH DIGIT ONE",
         "U+FF21(\uFF21)FULLWIDTH LATIN CAPITAL LETTER A",
         "U+FF22(\uFF22)FULLWIDTH LATIN CAPITAL LETTER B",
         "U+FF41(\uFF41)FULLWIDTH LATIN SMALL LETTER A",
         "U+FF42(\uFF42)FULLWIDTH LATIN SMALL LETTER B",
         "U+FFE0(\uFFE0)FULLWIDTH CENT SIGN",
         "U+FFE1(\uFFE1)FULLWIDTH POUND SIGN",
         "U+FFE5(\uFFE5)FULLWIDTH YEN SIGN"
      };
      // Output file name and charset name default to UTF-16BE;
      // both can be overridden on the command line.
      String outFile = "sign.utf-16be";
      if (a.length>0) outFile = a[0];
      String outCharsetName = "utf-16be";
      if (a.length>1) outCharsetName = a[1];
      String crlf = System.getProperty("line.separator");
      // try-with-resources closes the writer even if a write fails
      try (OutputStreamWriter out = new OutputStreamWriter(
            new FileOutputStream(outFile), outCharsetName)) {
         for (int i=0; i<text.length; i++) {
            out.write(text[i]);
            out.write(crlf);
         }
      } catch (IOException e) {
         System.out.println(e.toString());
      }
   }
}
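To get a feel for what UnicodeSign.java writes for a given charset name, the following minimal sketch prints the bytes produced for a single sign in a few Unicode encodings. The class name SignBytes and the helper toHex are hypothetical names introduced here for illustration; they are not part of the utility programs in this chapter.

```java
import java.nio.charset.Charset;

public class SignBytes {
    // Encode the text with the named charset and return the resulting
    // bytes as space-separated uppercase hex values.
    static String toHex(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String cent = "\u00A2"; // CENT SIGN
        for (String name : new String[] {"utf-16be", "utf-16le", "utf-8"}) {
            System.out.println(name + ": " + toHex(cent, name));
        }
    }
}
```

For the CENT SIGN, this prints 00 A2 for utf-16be, A2 00 for utf-16le, and C2 A2 for utf-8, which shows that the same sign is stored as different byte sequences depending on the encoding.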
Then I ran this program, and converted the output file with different encodings:
javac UnicodeSign.java
java UnicodeSign sign.utf-16be utf-16be

java EncodingConverter sign.utf-16be utf-16be sign.utf-8 utf-8
java EncodingHtml sign.utf-8 utf-8

java EncodingConverter sign.utf-16be utf-16be sign.gbk gbk
java EncodingHtml sign.gbk gbk

java EncodingConverter sign.utf-16be utf-16be sign.shift_jis shift_jis
java EncodingHtml sign.shift_jis shift_jis

java EncodingConverter sign.utf-16be utf-16be sign.johab johab
java EncodingHtml sign.johab johab
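Not every sign in the list necessarily has a mapping in a legacy encoding such as GBK, Shift_JIS or Johab. If a conversion writes through an OutputStreamWriter, as UnicodeSign.java does, characters without a mapping are replaced by the charset's replacement bytes (typically "?") rather than reported as errors. The following is a minimal sketch, not part of the chapter's utility programs, that checks in advance which signs a charset can encode. The class name SignEncodability and the helper canEncode are hypothetical, and which charset names are available depends on the Java runtime in use.

```java
import java.nio.charset.Charset;

public class SignEncodability {
    // Return true if the named charset has a mapping for the character.
    static boolean canEncode(String charsetName, char c) {
        return Charset.forName(charsetName).newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        // A few of the signs written by UnicodeSign.java
        char[] signs = {'\u00A2', '\u00A5', '\u2022', '\uFF04', '\uFFE5'};
        String[] charsets = {"utf-8", "gbk", "shift_jis", "johab"};
        for (String name : charsets) {
            // Skip charsets that this Java runtime does not provide
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not supported by this Java runtime");
                continue;
            }
            StringBuilder sb = new StringBuilder(name + ":");
            for (char c : signs) {
                sb.append(String.format(" U+%04X=%s", (int) c,
                    canEncode(name, c) ? "yes" : "no"));
            }
            System.out.println(sb);
        }
    }
}
```

A sign reported as "no" for a charset would not survive a conversion into that charset, so it would not display correctly when the converted file is viewed in a browser.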
Then I viewed the differently encoded test files with IE, and noticed that: