Building Chinese Web Sites using PHP
Dr. Herong Yang, Version 2.12

Inputting Chinese Text to MySQL Database in UTF-8

This section describes how to take Chinese text from a Web form and store it in a MySQL database in UTF-8 encoding.

After testing how to store and retrieve Chinese text in a MySQL database, I went on to test how to take Chinese input text directly from Web forms and save it to the database. The first test used UTF-8 as the character encoding. I needed to remember three important settings:

  • I need to use a <meta> tag to set Content-Type to "text/html; charset=utf-8" for the Web page HTML document. This tells the Web browser to display the page content in UTF-8 encoding and to submit form input text in UTF-8 encoding.
  • On the MySQL side, I need to set two session control variables, character_set_client=utf8 and character_set_connection=utf8, when saving input text to the database table. This tells the MySQL server that my SQL statement is encoded in UTF-8 and should be kept as UTF-8 when the statement is executed.
  • When retrieving text data from MySQL, I need to set one session control variable: character_set_results=utf8. This tells the MySQL server that the result set must be sent back in UTF-8 encoding.
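
The two MySQL settings above can be collected in one place. The following is a minimal sketch, not part of the original test script: the helper name charset_session_statements is hypothetical, and the scalar type declarations assume PHP 7 or later. It only builds the SQL text; each statement would still have to be sent over the database connection.

```php
<?php
// Hypothetical helper, not part of the original script: returns the SQL
// statements that switch one MySQL session to the given character set.
function charset_session_statements(string $charset): array
{
    // Accept only plain identifiers such as utf8, gbk, or big5, because
    // the name is concatenated into SQL text.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException("Unexpected charset: $charset");
    }
    return [
        "SET character_set_client=$charset",     // encoding of incoming SQL text
        "SET character_set_connection=$charset", // encoding kept while executing
        "SET character_set_results=$charset",    // encoding of returned rows
    ];
}

// Print the statements needed for the UTF-8 test.
foreach (charset_session_statements('utf8') as $statement) {
    print($statement . "\n");
}
```

MySQL also offers a single SET NAMES statement that assigns the same three variables at once, so "SET NAMES utf8" could be sent instead of the three statements printed here.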

Here is my PHP script for the test Web page:

<?php #MySQL-Input-Chinese-UTF8.php
# Copyright (c) 2007 by Dr. Herong Yang, http://www.herongyang.com/
#
  print('<html><head>');
  print('<meta http-equiv="Content-Type"'.
    ' content="text/html; charset=utf-8"/>');
  print('</head><body>'."\n");

# Default input text
  $input = '电视机/電視機';
  $input_hex = 'E794B5E8A786E69CBA2FE99BBBE8A696E6A99F'; 

# Form submit detection
  $submit = isset($_REQUEST["Submit"]);

# Process form input data
  if ($submit) {
    if (isset($_REQUEST["Input"])) {
      $input = $_REQUEST["Input"];
    }
    $con = mysql_connect("localhost", "Herong", "TopSecret");
    $ok = mysql_select_db("HerongDB", $con);
    $test_name = "Input Chinese UTF-8";

#   Set character_set_client and character_set_connection
    mysql_query("SET character_set_client=utf8", $con);
    mysql_query("SET character_set_connection=utf8", $con);

#   Delete the record
    $sql = "DELETE FROM Comment_Mixed WHERE Test_Name ='$test_name'";
    mysql_query($sql, $con);

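#   Build the SQL INSERT statement, escaping the input so that quotes
#   in the text cannot break the statement (assumes magic_quotes_gpc
#   is off; otherwise the text would be escaped twice)
    $input_sql = mysql_real_escape_string($input, $con);
    $sql = <<<END_OF_MESSAGE
INSERT INTO Comment_Mixed (Test_name, String_ASCII,
    String_Latin1, String_UTF8, String_GBK, String_Big5)
  VALUES ('$test_name', null, null, '$input_sql', null, null);
END_OF_MESSAGE;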

#   Run the SQL statement
    mysql_query($sql, $con);

    mysql_close($con); 
  }

# Display form
  print('<form>');
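# Escape the value so quotes or markup in the input cannot break the form
  print('<input type="Text" size="40" maxlength="64"'
   . ' name="Input" value="'
   . htmlspecialchars($input, ENT_QUOTES, 'UTF-8').'"/><br/>');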
  print('<input type="Submit" name="Submit" value="Submit"/>');
  print('</form>'."\n");

# Generate reply
  if ($submit) {
    $con = mysql_connect("localhost", "Herong", "TopSecret");
    $ok = mysql_select_db("HerongDB", $con);

#   Set character_set_results
    mysql_query("SET character_set_results=utf8", $con);

    $sql = "SELECT * FROM Comment_Mixed"
      . " WHERE Test_Name = '$test_name'";
    $res = mysql_query($sql, $con);
    $output = 'SELECT failed.';
    if ($row = mysql_fetch_array($res)) {
      $output = $row['String_UTF8'];
    }  
    mysql_free_result($res);

    print('<pre>'."\n");
    print('Content-Type:'."\n");
    print('  text/html; charset=utf-8'."\n");
    print('You have submitted:'."\n");
    print('  Text = '.$input."\n");
    print('  Text in HEX = '.strtoupper(bin2hex($input))."\n");
    print('  Default HEX = '.$input_hex."\n");
    print('Saved and retrieved from database:'."\n");
    print('  Text = '.$output."\n");
    print('  Text in HEX = '.strtoupper(bin2hex($output))."\n");
    print('</pre>'."\n");

    mysql_close($con); 
  } 

  print('</body></html>');
?>
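
The mysql_* functions used in the script above were removed in PHP 7.0. For readers on a newer PHP version, here is a rough sketch of the same save and retrieve steps using the mysqli extension with prepared statements. It is an illustration under assumptions, not the author's tested code: the function names save_utf8_comment and load_utf8_comment are hypothetical, PHP 7.1 or later with mysqli is assumed, and the table and column names are taken from the script above.

```php
<?php
// Sketch only. Assumes PHP 7.1+ with the mysqli extension and the
// Comment_Mixed table used by the original script. Both function
// names are hypothetical.

// Replace the record for $testName with the submitted text.
function save_utf8_comment(mysqli $con, string $testName, string $input): bool
{
    // Sets the client, connection, and results character sets in one call.
    if (!$con->set_charset('utf8')) {
        return false;
    }

    // Placeholders keep quotes in the text from altering the SQL.
    $del = $con->prepare('DELETE FROM Comment_Mixed WHERE Test_Name = ?');
    if ($del === false) {
        return false;
    }
    $del->bind_param('s', $testName);
    $del->execute();
    $del->close();

    $ins = $con->prepare(
        'INSERT INTO Comment_Mixed (Test_Name, String_UTF8) VALUES (?, ?)'
    );
    if ($ins === false) {
        return false;
    }
    $ins->bind_param('ss', $testName, $input);
    $ok = $ins->execute();
    $ins->close();
    return $ok;
}

// Read the stored text back, or return null if no record exists.
function load_utf8_comment(mysqli $con, string $testName): ?string
{
    if (!$con->set_charset('utf8')) {
        return null;
    }
    $sel = $con->prepare(
        'SELECT String_UTF8 FROM Comment_Mixed WHERE Test_Name = ?'
    );
    if ($sel === false) {
        return null;
    }
    $sel->bind_param('s', $testName);
    $sel->execute();
    $sel->bind_result($text);
    $found = $sel->fetch();   // true when a row was fetched
    $sel->close();
    return $found === true ? $text : null;
}
```

With a connection opened as new mysqli("localhost", "Herong", "TopSecret", "HerongDB"), the page could call save_utf8_comment($con, "Input Chinese UTF-8", $input) and then load_utf8_comment($con, "Input Chinese UTF-8"). The sketch keeps the utf8 character set name used in this section; MySQL's utf8mb4 may be a better fit if characters outside the Basic Multilingual Plane are expected.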

After moving this PHP script file to the Apache server document directory, I tested it in Internet Explorer (IE) with this URL: http://localhost/MySQL-Input-Chinese-UTF8.php. I saw a Web page with a form containing the suggested input text and a submit button.

The default input Chinese characters were displayed correctly.

After clicking the submit button, I saw a returned Web page with the same form and a reply section. The Chinese input characters were received by PHP correctly. They were stored in the MySQL database and retrieved correctly:
[Figure: Inputting Chinese Text to MySQL Database in UTF-8]
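
The "Text in HEX" lines of the reply section can be checked without a database. The sketch below is an illustration only: the helper name hex_dump is hypothetical, and the \u{...} string escapes assume PHP 7 or later (they are used so the example does not depend on how the source file itself is saved). It rebuilds the default input text and compares its bytes with the Default HEX value from the script.

```php
<?php
// Hypothetical helper: upper-case hex dump of a string's raw bytes,
// formatted the same way as the test page's reply section.
function hex_dump(string $text): string
{
    return strtoupper(bin2hex($text));
}

// The default input text, written with Unicode code point escapes so
// that PHP itself produces the UTF-8 bytes.
$input = "\u{7535}\u{89C6}\u{673A}/\u{96FB}\u{8996}\u{6A5F}";

// Default HEX value copied from the test script.
$expected = 'E794B5E8A786E69CBA2FE99BBBE8A696E6A99F';

// Six Chinese characters at 3 bytes each plus 1 byte for the slash.
print('Bytes = ' . strlen($input) . "\n");           // prints "Bytes = 19"
print('HEX   = ' . hex_dump($input) . "\n");
print(hex_dump($input) === $expected ? "match\n" : "mismatch\n");
```

If the hex value printed for the text retrieved from the database differs from the value printed for the submitted text, one of the three encoding settings is probably missing or set to a different character set.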

Conclusion: Chinese text can be entered on Web forms, received by PHP scripts, stored in a MySQL database, and retrieved back to Web pages correctly in UTF-8 encoding.

Dr. Herong Yang, updated in 2011