Building Chinese Web Sites using PHP
Dr. Herong Yang, Version 2.11

String Literal Travel Path

This section describes how string literals in PHP scripts travel through various applications, from you as the publisher to a user as the viewer of a Web page.

To enter Chinese characters in PHP scripts and output them correctly in Web pages, you need to understand how a character string is created in a PHP script, written into a Web page, and passed through various applications until it reaches the viewer's screen. Here is a simplified diagram that shows the steps and applications involved in creating a PHP string and delivering it to the viewer's screen:

P1. Key Sequences from keyboard
      |
      |- Text editor
      v
P2. PHP Script
      |
      |- PHP-CGI
      v
P3. HTML Document
      |
      |- Web server
      v
P4. HTTP Response
      |
      |- Internet TCP/IP Connection
      v
P5. HTTP Response
      |
      |- Web browser
      v
P6. Visual characters on the screen

If you decide to use UTF-8 encoding to enter Chinese characters in your PHP scripts and HTML documents, you need to make sure that every application in the above diagram handles UTF-8 correctly. Otherwise, characters could be corrupted somewhere along the path and displayed incorrectly on the viewer's screen.
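As an illustration of steps P2 and P3, here is a minimal sketch of a PHP script that carries a UTF-8 string literal into an HTML document. It assumes the script file itself was saved by the text editor as UTF-8 without a byte order mark. The function name render_page and the sample text are chosen only for this example, and the type declarations assume PHP 7 or later.

```php
<?php
// Assumption: this file is saved as UTF-8 (no BOM) by the text editor (P1 -> P2).

// Hypothetical helper: wrap a text string in a small HTML document that
// declares UTF-8 as its character encoding.
function render_page(string $text): string {
    return "<html><head>"
         . "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\"/>"
         . "</head><body><p>"
         . htmlspecialchars($text, ENT_QUOTES, 'UTF-8')
         . "</p></body></html>";
}

// Declare the encoding in the HTTP response header as well, so the browser
// knows how to decode the bytes it receives (P5 -> P6).
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// The PHP engine treats this literal as a plain sequence of bytes and
// copies those bytes to the output unchanged (P2 -> P3).
$greeting = "你好";

echo render_page($greeting);
```

Because PHP passes the bytes of the literal through as-is, the encoding declared in the header and in the meta tag must match the encoding the editor used when saving the file.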

Fortunately, most text editors, PHP-CGI, Web servers, TCP/IP connections, and Web browsers support UTF-8 well.
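Support for UTF-8 does not mean that UTF-8 is selected automatically. The browser still needs to be told which encoding the HTTP response uses. The following sketch shows one way to read the declared character set from a Content-Type header value captured at step P4 or P5. The helper name charset_from_content_type is hypothetical, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: extract the charset parameter from a Content-Type
// header value. Returns the charset in lower case, or null if none is declared.
function charset_from_content_type(string $value): ?string {
    if (preg_match('/;\s*charset\s*=\s*"?([A-Za-z0-9._-]+)"?/i', $value, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// Example header values, as they might appear in a captured HTTP response.
var_dump(charset_from_content_type('text/html; charset=UTF-8'));
var_dump(charset_from_content_type('text/html'));
```

A null result means the response did not declare a character set, in which case the browser has to rely on the meta tag in the HTML document or on its own guess.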

The PHP string literal travel path diagram can also help you troubleshoot problems in displaying Chinese Web pages. The best strategy is to use a diagnostic tool to capture the HTML document at different steps and check whether the characters have been damaged.
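One simple way to review a captured string is to look at its raw bytes. The sketch below is only an illustration of that idea, not a complete diagnostic tool. The helper name hex_bytes is hypothetical. The expected bytes are written with escape sequences so that they do not depend on how the script file was saved.

```php
<?php
// Hypothetical helper: show the raw bytes of a string as hex pairs
// separated by spaces.
function hex_bytes(string $s): string {
    return implode(' ', str_split(bin2hex($s), 2));
}

// UTF-8 bytes of the two characters U+4F60 and U+597D, written as escapes.
$expected = "\xE4\xBD\xA0\xE5\xA5\xBD";

// The bytes of this literal depend on the encoding the text editor used
// when it saved this file (step P2).
$literal = "你好";

echo hex_bytes($literal), "\n";
echo ($literal === $expected)
    ? "The literal is stored as UTF-8.\n"
    : "The literal is not stored as UTF-8.\n";
```

If the script file was saved as UTF-8, the first line of output is "e4 bd a0 e5 a5 bd". The same helper can be applied to bytes captured at later steps to find where they start to differ.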

Sections in This Chapter

String Data Type, Literals and Functions

String Literal Travel Path

Chinese Character String with UTF-8 Encoding

Chinese Character String with GB18030 Encoding

Chinese Character String with Big5 Encoding

UTF-8 Encoding Pages with Big5 Characters

Dr. Herong Yang, updated in 2007