JDK (Java Development Kit) Tutorials
Dr. Herong Yang, Version 5.00

Viewing Encoded Text Files in Web Browsers

This section provides a tutorial example on how to view text files in different encodings with the Web browser Internet Explorer (IE). Each encoded text file must first be wrapped in proper HTML tags using the sample program EncodingHtml.java.

Now we have our greeting messages saved in many different encodings. The next question is how to display them as glyphs of the corresponding languages on the screen. One way I have used in the past is to view the text files in a multi-language enabled Web browser like IE. To do this, we have to mark up the text as an HTML file, using a program like this one:

/**
 * EncodingHtml.java
 * Copyright (c) 2002 by Dr. Herong Yang
 * 
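 * This program marks up a text file as an HTML file. The output is
 * written in the same encoding as the input, and that encoding is
 * declared in a meta tag so that the browser can detect it.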
 */
import java.io.*;
class EncodingHtml {
   public static void main(String[] a) {
      String inFile = a[0];
      String inCharsetName = a[1];
      String outFile = inFile + ".html";
      try {
         InputStreamReader in = new InputStreamReader(
            new FileInputStream(inFile), inCharsetName);
         OutputStreamWriter out = new OutputStreamWriter(
            new FileOutputStream(outFile), inCharsetName);
         writeHead(out, inCharsetName);         
         int c = in.read();
         int n = 0;
         while (c!=-1) {
            out.write(c);
            n++;
            c = in.read();
         }
         writeTail(out);
         in.close();
         out.close();
         System.out.println("Number of characters: "+n);
      } catch (IOException e) {
         System.out.println(e.toString());
      }
   }
   public static void writeHead(OutputStreamWriter out, String cs)
      throws IOException {
      out.write("<html><head>\n");
      out.write("<meta http-equiv=\"Content-Type\""+
         " content=\"text/html; charset="+cs+"\">\n");
      out.write("</head><body><pre>");
   }
   public static void writeTail(OutputStreamWriter out) 
      throws IOException {
      out.write("</pre></body></html>\n");
   }
}
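One limitation worth noting: the program copies every character into the <pre> element unchanged, so a text file containing "<", ">" or "&" could be interpreted as markup by the browser. The following is a minimal sketch of an escaping helper, not part of the original program; the class name HtmlEscape and its method names are hypothetical.

```java
// Hypothetical helper, not part of EncodingHtml.java: escapes the three
// characters that have special meaning inside HTML text content.
class HtmlEscape {
    // Escapes one character, given as the int returned by Reader.read().
    static String escapeChar(int c) {
        switch (c) {
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '&': return "&amp;";
            default:  return String.valueOf((char) c);
        }
    }

    // Escapes a whole string by escaping each character in turn.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            sb.append(escapeChar(text.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a<b & c>d"));
    }
}
```

Assuming this helper is on the class path, the copy loop in EncodingHtml.java could call out.write(HtmlEscape.escapeChar(c)) instead of out.write(c).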

Now, let's compile this program and run it with hello.utf-8:

javac EncodingHtml.java
java EncodingHtml hello.utf-8 utf-8

If you have installed IE with Chinese language support, you should be able to open the output file, hello.utf-8.html, and read the messages in English, Simplified Chinese, and Traditional Chinese.

Then run EncodingHtml with the other encodings:

java EncodingHtml hello.gbk gbk
java EncodingHtml hello.big5 big5
java EncodingHtml hello.shift_jis shift_jis
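These commands only work if the JVM supports the named encodings. As a rough sketch, and assuming JDK 1.4 or later for the java.nio.charset package, a small check like the one below can confirm this before running the conversions. The class name CharsetCheck and its method are hypothetical and are not used elsewhere in this tutorial.

```java
import java.nio.charset.Charset;

// Hypothetical helper: reports whether the running JVM supports the
// charset names given on the command line.
class CharsetCheck {
    // Returns true if the JVM supports the named charset; returns false
    // for unknown names and for syntactically illegal names.
    static boolean supported(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for null or illegal charset names.
            return false;
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < args.length; i++) {
            System.out.println(args[i] + ": " + supported(args[i]));
        }
    }
}
```

For example, it could be run as: java CharsetCheck gbk big5 shift_jis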

View the output files with IE, and compare the results:

  • hello.utf-8.html - IE automatically sets View/Encoding to utf-8. All messages display correctly.
  • hello.gbk.html - IE automatically sets View/Encoding to gb2312. All messages display correctly.
  • hello.big5.html - IE automatically sets View/Encoding to big5. The Simplified Chinese message has two bad characters.
  • hello.shift_jis.html - IE automatically sets View/Encoding to shift_jis. Both the Simplified and Traditional Chinese messages have bad characters.

If you manually change the View/Encoding setting to a different encoding, IE will no longer be able to show the messages with the right glyphs.
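A likely reason for the bad characters is that an encoding such as big5 or shift_jis has no code for some of the Chinese characters in the messages, so they cannot be stored in that encoding in the first place. The sketch below shows one way to find such characters, assuming JDK 1.5 or later; the class name EncodableCheck and the sample string are hypothetical, and the sample is not the actual content of the hello files.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper: finds the characters of a string that a given
// charset cannot encode.
class EncodableCheck {
    // Returns the characters of text, in order, that the named charset
    // has no encoding for. Works on individual UTF-16 code units, which
    // is sufficient for characters in the Basic Multilingual Plane.
    static String unencodable(String text, String charsetName) {
        CharsetEncoder enc = Charset.forName(charsetName).newEncoder();
        StringBuilder bad = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (!enc.canEncode(ch)) {
                bad.append(ch);
            }
        }
        return bad.toString();
    }

    public static void main(String[] args) {
        // Assumed sample text: "Hello" followed by two Chinese characters.
        String sample = "Hello \u4f60\u597d";
        String[] names = {"UTF-8", "US-ASCII"};
        for (int i = 0; i < names.length; i++) {
            int n = unencodable(sample, names[i]).length();
            System.out.println(names[i] + ": " + n + " unencodable");
        }
    }
}
```

Passing "Big5" or "Shift_JIS" as the charset name, where the JVM supports them, applies the same check to the encodings used above.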

Last update: 2006.

Dr. Herong Yang, updated in 2008