Building Chinese Web Sites using PHP
Dr. Herong Yang, Version 2.11

HTML Document Travel Path

This section describes how HTML documents travel through various applications from you, the publisher, to the user viewing the Web page.

To create and serve HTML documents in Chinese correctly, you need to understand how an HTML document is created and transferred through various applications to reach the viewer's screen. Here is a simplified diagram that shows the steps and applications used to create an HTML document and deliver it to the viewer's screen:

H1. Key Sequences from keyboard
      |
      |- Text editor
      v
H2. HTML Document
      |
      |- Web server
      v
H3. HTTP Response
      |
      |- Internet TCP/IP Connection
      v
H4. HTTP Response
      |
      |- Web browser
      v
H5. Visual characters on the screen
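
The steps in the diagram can be modeled as a chain of small functions. The following is only a minimal sketch in Python, not a real editor, server, or browser: all function names are hypothetical, the fallback charset is an assumption, and a real browser also considers other hints such as meta tags in the document.

```python
# Hypothetical model of the travel path H1-H5; all names are illustrative.

def text_editor_save(keyed_text: str, encoding: str) -> bytes:
    """H1 -> H2: the editor turns typed characters into bytes on disk."""
    return keyed_text.encode(encoding)

def web_server_respond(document: bytes, charset: str) -> bytes:
    """H2 -> H3: the server prepends headers; the body bytes are unchanged."""
    headers = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: text/html; charset={charset}\r\n"
        f"Content-Length: {len(document)}\r\n"
        "\r\n"
    ).encode("ascii")
    return headers + document

def network_transfer(response: bytes) -> bytes:
    """H3 -> H4: TCP/IP delivers the bytes as they are."""
    return bytes(response)

def web_browser_render(response: bytes) -> str:
    """H4 -> H5: the browser reads the declared charset and decodes the body."""
    head, _, body = response.partition(b"\r\n\r\n")
    charset = "iso-8859-1"  # assumed fallback when no charset is declared
    for line in head.decode("ascii").split("\r\n"):
        if line.lower().startswith("content-type:") and "charset=" in line:
            charset = line.split("charset=", 1)[1].strip()
    # Undecodable bytes become U+FFFD instead of raising an error.
    return body.decode(charset, errors="replace")

html = "<html><body><p>中文网页</p></body></html>"
document = text_editor_save(html, "utf-8")
shown = web_browser_render(network_transfer(web_server_respond(document, "utf-8")))
print(shown == html)
```

In this model the characters survive only when the charset declared by the server matches the encoding the editor used to save the document. Passing "big5" to web_server_respond() for the same UTF-8 document makes the final text differ from what was typed.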

If you decide to use UTF-8 encoding to enter Chinese characters in your HTML documents, you need to make sure that every application in the above diagram handles UTF-8 correctly. Otherwise, characters could be corrupted during the transfer process and displayed incorrectly on the viewer's screen.

Fortunately, most editors, Web servers, TCP/IP interfaces, and Web browsers support UTF-8 well.

But if you decide to use GB or Big5 encoding to enter Chinese characters in your HTML documents, you may need to verify that those applications support GB or Big5 encoding.
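
To see why every application must agree on the encoding, it may help to look at the bytes themselves. The short Python sketch below is an illustration only; the sample text and variable names are assumptions chosen for the example. It encodes the same two characters in UTF-8, GB18030, and Big5, then decodes the UTF-8 bytes with the wrong decoder.

```python
text = "中文"

# The same two characters become different byte sequences in each encoding.
encoded = {name: text.encode(name) for name in ("utf-8", "gb18030", "big5")}
for name, data in encoded.items():
    print(f"{name:8} {data.hex(' ')}")

# Decoding with the encoding used to write the bytes restores the text.
restored = {name: data.decode(name) for name, data in encoded.items()}
print(all(value == text for value in restored.values()))

# Decoding UTF-8 bytes as GB18030 raises no error here; it quietly
# produces different characters, which is how corrupted pages appear.
garbled = encoded["utf-8"].decode("gb18030", errors="replace")
print(garbled == text)
```

Note that the wrong decoder does not necessarily fail. It can succeed and return unrelated characters, so a corrupted page often shows no error message at any step.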

The HTML document travel path diagram can also help you troubleshoot problems in displaying Chinese Web pages. The best strategy is to use a diagnostic tool to capture the HTML document at different steps and review it to see if it has been damaged.
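
Once you have captured the document at several steps, comparing the raw bytes shows where the damage was introduced. The following Python sketch assumes the captures are already available as byte strings; the helper names, step labels, and the simulated damage (UTF-8 bytes misread as Latin-1 and encoded again) are all hypothetical.

```python
from typing import List, Optional, Tuple

def first_damaged_step(captures: List[Tuple[str, bytes]]) -> Optional[str]:
    """Return the label of the first capture that differs from the one before it."""
    for (_, before), (label, after) in zip(captures, captures[1:]):
        if before != after:
            return label
    return None

def describe(data: bytes, encoding: str) -> str:
    """Show the raw bytes in hex and whether they decode cleanly."""
    try:
        data.decode(encoding)
        status = "valid"
    except UnicodeDecodeError as err:
        status = f"invalid at byte {err.start}"
    return f"{data.hex(' ')} ({status} {encoding})"

original = "中文".encode("utf-8")
# Simulated damage: some step misread the bytes as Latin-1 and
# converted them to UTF-8 a second time.
double_encoded = original.decode("latin-1").encode("utf-8")

captures = [
    ("H2 document on disk", original),
    ("H3 response body at server", original),
    ("H4 response body at browser", double_encoded),
]
for label, data in captures:
    print(label, describe(data, "utf-8"))
print(first_damaged_step(captures))
```

The double-encoded copy is still valid UTF-8, so a validity check alone would not catch it. Comparing the bytes between neighboring steps is what points to the application that changed them.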

Sections in This Chapter

Chinese Character Set Encoding Options

HTML Document Travel Path

Chinese Web Pages with UTF-8 Encoding

Chinese Web Pages with GB18030 Encoding

Chinese Web Pages with Big5 Encoding

UTF-8 Encoding Pages with GB18030 Characters

Dr. Herong Yang, updated in 2007