Inputting Chinese Text to MySQL Database in UTF-8

This section describes how to take Chinese text from a Web form and store it in a MySQL database in UTF-8 encoding.

After testing how to store and retrieve Chinese text in a MySQL database, I went on to test how to take Chinese input text directly from Web forms and save it to the database. The first test used UTF-8 as the character encoding. I need to remember three important settings:

  • I need to use a <meta> tag to set Content-Type = "text/html; charset=utf-8" for the Web page HTML document. This tells the Web browser to display the page content in UTF-8 encoding and to send form input text in UTF-8 encoding.
  • On the MySQL side, I need to set two session control variables, character_set_client=utf8 and character_set_connection=utf8, before saving input text to the database table. This tells the MySQL server that my SQL statement is encoded as UTF-8 and must be kept as UTF-8 while the statement is executed.
  • When retrieving text data from MySQL, I need to set one session control variable: character_set_results=utf8. This tells the MySQL server that the result set must be sent back in UTF-8 encoding.
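The two MySQL settings above can be sketched as a small lookup of session statements. This is only an illustration, assuming PHP 7 or later; charset_statements() is a hypothetical helper name, not part of the test script, and each returned string would still have to be sent to the server through the query function in use.

```php
<?php
// Hypothetical helper: returns the session statements described above
// for one direction of traffic between the script and the MySQL server.
function charset_statements(string $direction, string $charset = 'utf8'): array
{
    // Allow only simple charset names so the value is safe to embed in SQL.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException("Invalid charset name: $charset");
    }
    switch ($direction) {
        case 'save':      // script -> server: encoding of the SQL statement
            return [
                "SET character_set_client=$charset",
                "SET character_set_connection=$charset",
            ];
        case 'retrieve':  // server -> script: encoding of the result set
            return ["SET character_set_results=$charset"];
        case 'both':      // SET NAMES sets all three variables at once
            return ["SET NAMES $charset"];
        default:
            throw new InvalidArgumentException("Unknown direction: $direction");
    }
}

// Print the statements needed before saving input text.
foreach (charset_statements('save') as $sql) {
    print($sql . "\n");
}
```

The 'both' case relies on the documented behavior of SET NAMES, which assigns character_set_client, character_set_connection and character_set_results in one statement.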

Here is my PHP script for the test Web page:

<?php #MySQL-Input-Chinese-UTF8.php
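# Copyright (c) 2007 by Dr. Herong Yang
# Note: the mysql_* functions used below were removed in PHP 7.0,
# so this script needs PHP 5.x with the mysql extension.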
  print('<meta http-equiv="Content-Type"'.
    ' content="text/html; charset=utf-8"/>');

# Default input text
  $input = '电视机/電視機';
  $input_hex = 'E794B5E8A786E69CBA2FE99BBBE8A696E6A99F'; 

# Form submit detection
  $submit = isset($_REQUEST["Submit"]);

# Process form input data
  if ($submit) {
    if (isset($_REQUEST["Input"])) {
      $input = $_REQUEST["Input"];
    }

    $con = mysql_connect("localhost", "Herong", "TopSecret");
    $ok = mysql_select_db("HerongDB", $con);
    $test_name = "Input Chinese UTF-8";

#   Set character_set_client and character_set_connection
    mysql_query("SET character_set_client=utf8", $con);
    mysql_query("SET character_set_connection=utf8", $con);

#   Delete the record
    $sql = "DELETE FROM Comment_Mixed WHERE Test_Name ='$test_name'";
    mysql_query($sql, $con);

#   Build the SQL INSERT statement
#   (input is escaped first; assumes magic_quotes_gpc is off)
    $safe_input = mysql_real_escape_string($input, $con);
    $sql = <<<END_OF_MESSAGE
INSERT INTO Comment_Mixed (Test_name, String_ASCII, 
    String_Latin1, String_UTF8, String_GBK, String_Big5)
  VALUES ('$test_name', null, null, '$safe_input', null, null);
END_OF_MESSAGE;

#   Run the SQL statement
    mysql_query($sql, $con);

    mysql_close($con);
  }


# Display form
  print('<form>');
  print('<input type="Text" size="40" maxlength="64"'
   . ' name="Input" value="'.$input.'"/><br/>');
  print('<input type="Submit" name="Submit" value="Submit"/>');
  print('</form>'."\n");

# Generate reply
  if ($submit) {
    $con = mysql_connect("localhost", "Herong", "TopSecret");
    $ok = mysql_select_db("HerongDB", $con);

#   Set character_set_results
    mysql_query("SET character_set_results=utf8", $con);

    $sql = "SELECT * FROM Comment_Mixed"
      . " WHERE Test_Name = '$test_name'";
    $res = mysql_query($sql, $con);
    $output = 'SELECT failed.';
#   Keep the default message if the query failed or returned no row
    if ($res && ($row = mysql_fetch_array($res))) {
      $output = $row['String_UTF8'];
    }

    print('<pre>'."\n");
    print('Content-Type:'."\n");
    print('  text/html; charset=utf-8'."\n");
    print('You have submitted:'."\n");
    print('  Text = '.$input."\n");
    print('  Text in HEX = '.strtoupper(bin2hex($input))."\n");
    print('  Default HEX = '.$input_hex."\n");
    print('Saved and retrieved from database:'."\n");
    print('  Text = '.$output."\n");
    print('  Text in HEX = '.strtoupper(bin2hex($output))."\n");
    print('</pre>'."\n");

    mysql_close($con);
  }
?>
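The script writes $input and $output into the page as received, so a value containing a double quote or a "<" character would break the generated HTML. One possible way to handle this is to escape the text before printing it. The sketch below assumes PHP 7 or later; h() and render_input() are hypothetical helper names that do not appear in the original script.

```php
<?php
// Hypothetical helper: escape a UTF-8 string for use in HTML text or
// attribute values. Multi-byte characters pass through unchanged.
function h(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: build the text field used by the test form,
// with the current value escaped.
function render_input(string $value): string
{
    return '<input type="Text" size="40" maxlength="64"'
        . ' name="Input" value="' . h($value) . '"/>';
}

// The Chinese characters are kept as-is; only the quotes are escaped.
print(render_input('电视机/電視機 "TV"') . "\n");
```

Passing 'UTF-8' explicitly keeps the escaping consistent with the charset declared in the <meta> tag, regardless of the PHP default_charset setting.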



After moving this PHP script file to the Apache server document directory, I opened it in Internet Explorer (IE) at this URL: http://localhost/MySQL-Input-Chinese-UTF8.php. I saw a Web page with a form that had the suggested input text and a submit button.

The default input Chinese characters were displayed correctly.

After clicking the submit button, I saw a returned Web page with the same form and a reply section. The Chinese input characters were received by PHP correctly. They were stored in the MySQL database and retrieved correctly.

Conclusion: Chinese text can be entered on Web forms, received by PHP scripts, stored in a MySQL database, and retrieved back to Web pages correctly in UTF-8 encoding.
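The HEX comparison used in the reply section can also be run on its own, without a Web server or database. The following is a minimal sketch, assuming PHP 7 or later and a source file saved in UTF-8; to_hex() and is_valid_utf8() are hypothetical helper names introduced only for this illustration.

```php
<?php
// Hypothetical helper: upper-case hex dump of a string's bytes,
// the same expression the reply section uses.
function to_hex(string $text): string
{
    return strtoupper(bin2hex($text));
}

// Hypothetical helper: true if the bytes form valid UTF-8.
// With the /u modifier, preg_match() fails on malformed UTF-8 subjects.
function is_valid_utf8(string $text): bool
{
    return preg_match('//u', $text) === 1;
}

$input = '电视机/電視機';  // this file itself must be saved as UTF-8
$input_hex = 'E794B5E8A786E69CBA2FE99BBBE8A696E6A99F';

print(to_hex($input) === $input_hex ? "HEX matches\n" : "HEX differs\n");
print(is_valid_utf8($input) ? "valid UTF-8\n" : "not valid UTF-8\n");
```

If the text passed through a different encoding at any step, the hex string would no longer match the default value, which is what makes the comparison a useful check.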

