Building Chinese Web Sites using PHP
Dr. Herong Yang, Version 2.11

Processing Chinese Input on Web Forms in UTF-8

This section describes how to display a Web form and process Chinese input data from that form in UTF-8.

Since UTF-8 encoding can handle both simplified and traditional Chinese characters, I wrote the first test PHP script for processing Chinese input text with some interesting features:

  • An HTML <meta> tag in the page header sets charset=utf-8, so the Web page is declared as UTF-8 encoded.
  • A default input text is provided with some simplified and traditional Chinese characters in UTF-8 encoding.
  • The text received from the $_REQUEST array is displayed back on the returned Web page as encoded characters. It is also displayed as hex values, for comparison with the hex values of the default text.
<?php #Web-Form-Input-Chinese-UTF8.php
# Copyright (c) 2007 by Dr. Herong Yang, http://www.herongyang.com/
#
  print('<html><head>');
  print('<meta http-equiv="Content-Type"'.
    ' content="text/html; charset=utf-8"/>');
  print('</head><body>'."\n");

# Default input text
  $input = '电视机/電視機';
  $input_hex = 'E794B5E8A786E69CBA2FE99BBBE8A696E6A99F'; 

# Form reply determination
  $reply = isset($_REQUEST["Submit"]);

# Process form input data
  if ($reply) {
    if (isset($_REQUEST["Input"])) {
      $input = $_REQUEST["Input"];
    }
  }

# Display form
  print('<form>');
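# Escape the text so quotes or tags in user input cannot break the page
  print('<input type="Text" size="40" maxlength="64"'
   . ' name="Input" value="'
   . htmlspecialchars($input, ENT_QUOTES, 'UTF-8').'"/><br/>');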
  print('<input type="Submit" name="Submit" value="Submit"/>');
  print('</form>'."\n");

# Display reply
  if ($reply) {
    print('<pre>'."\n");
    print('Content-Type:'."\n");
    print('  text/html; charset=utf-8'."\n");
    print('You have submitted:'."\n");
    print('  Text = '.htmlspecialchars($input, ENT_QUOTES, 'UTF-8')."\n");
    print('  Text in HEX = '.strtoupper(bin2hex($input))."\n");
    print('  Default HEX = '.$input_hex."\n");
    print('</pre>'."\n");
  } 

  print('</body></html>');
?>
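
The hex comparison made in the reply section can also be tried outside a Web page. The following is only a minimal sketch, assuming PHP 7 or later and a source file saved in UTF-8; the helper name utf8_hex() is hypothetical and is not part of the script above.

```php
<?php
// Hypothetical helper (not in the original script): return the bytes
// of a string as upper-case hex digits, the same way the reply
// section does with strtoupper(bin2hex(...)).
function utf8_hex(string $text): string {
    return strtoupper(bin2hex($text));
}

// Same default text and expected hex value as in the script above.
// The comparison only holds if this file itself is saved in UTF-8.
$default     = '电视机/電視機';
$default_hex = 'E794B5E8A786E69CBA2FE99BBBE8A696E6A99F';

$hex = utf8_hex($default);
print($hex . "\n");
print(($hex === $default_hex ? 'match' : 'mismatch') . "\n");
```

Each Chinese character in this text takes 3 bytes in UTF-8 and the slash takes 1 byte, which is why the hex string has 38 digits for 19 bytes.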

After moving this PHP script file to the Apache server document directory, I tested it with Internet Explorer (IE) using this URL: http://localhost/Web-Form-Input-Chinese-UTF8.php. I saw a Web page with a form that contained the suggested input text and a submit button.

The default input Chinese characters were displayed correctly.

After clicking the submit button, I saw a returning Web page with the same form and a reply section. The Chinese input characters were received by PHP correctly:
[Screenshot: Processing Web Form Chinese Input in UTF-8]

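It is interesting to note that the returned Web page has a special URL which contains the input text inside the query string. This happens because the <form> tag has no method attribute, so the browser submits it with the GET method. The Chinese characters are included as percent-encoded hex values of their UTF-8 byte sequences: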

http://localhost/Web-Form-Input-Chinese-UTF8.php
  ?Input=%E7%94%B5%E8%A7%86%E6%9C%BA%2F%E9%9B%BB%E8%A6%96%E6%A9%9F
  &Submit=Submit
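
The query string above is what a browser produces when it percent-encodes the UTF-8 bytes of the form fields. As a rough sketch, the same string can be rebuilt and decoded with PHP's built-in http_build_query() and parse_str() functions. The variable names are illustrative only, and the source file is assumed to be saved in UTF-8.

```php
<?php
// Illustrative field values matching the form in this section.
$fields = array('Input' => '电视机/電視機', 'Submit' => 'Submit');

// Build the query string; every byte outside the unreserved ASCII
// set, including each UTF-8 byte of the Chinese text, becomes %XX.
$query = http_build_query($fields, '', '&');
print($query . "\n");

// Decode it again, roughly as PHP does when it fills $_REQUEST.
parse_str($query, $decoded);
print(strtoupper(bin2hex($decoded['Input'])) . "\n");
```

Decoding the query string gives back the original UTF-8 bytes, so the second line printed can be compared with the default hex value used in the script.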

Conclusion: IE handles Chinese input text in UTF-8 encoding correctly. PHP receives Chinese input text in UTF-8 encoding from Web forms correctly.
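
The conclusion above rests on comparing hex values by eye. If a script needs to confirm in code that the bytes it received form valid UTF-8, one possible sketch is shown below. It relies on the PCRE u modifier, which makes preg_match() fail on malformed UTF-8 input. The function name is_valid_utf8() is a hypothetical helper and is not part of the original script; PHP 7 or later is assumed.

```php
<?php
// Hypothetical helper: report whether a byte string is valid UTF-8.
// With the "u" modifier, preg_match() returns false on malformed
// UTF-8 input, and an empty pattern matches any valid string once.
function is_valid_utf8(string $text): bool {
    return preg_match('//u', $text) === 1;
}

// The default text from this section, assuming a UTF-8 source file.
var_dump(is_valid_utf8('电视机/電視機'));

// Two bytes that are not a valid UTF-8 sequence, for example text
// sent in a legacy double-byte encoding.
var_dump(is_valid_utf8("\xB5\xE7"));
```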


Dr. Herong Yang, updated in 2007