This section describes how to take Chinese text from a Web form and store it in a MySQL database in UTF-8 encoding.
After testing how to store and retrieve Chinese text in a MySQL database, I went on to test how to take Chinese input text directly from Web forms
and save it to the database. The first test used UTF-8 as the character encoding. I need to remember three important settings:
I need to use a <meta> tag to set Content-Type = "text/html; charset=utf-8"
for the Web page HTML document. This tells the Web browser to display the page content in UTF-8 encoding
and to send form input text in UTF-8 encoding.
On the MySQL side, I need to set two session variables, character_set_client=utf8
and character_set_connection=utf8, when saving input text to the database table.
This tells the MySQL server that my SQL statement is encoded in UTF-8
and should be kept in UTF-8 when the statement is executed.
When retrieving text data from MySQL, I need to set one session variable: character_set_results=utf8.
This tells the MySQL server that the result set must be sent back in UTF-8 encoding.
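The three MySQL settings above can be collected in one place. The following is only a minimal sketch: the helper name charset_session_statements() is hypothetical and not part of the test script, and the check on the charset name is an added precaution rather than something MySQL requires.

```php
<?php
// Hypothetical helper (not in the original script): builds the three
// SET statements described above for a given character set name.
function charset_session_statements(string $charset = 'utf8'): array {
    // Accept only plain charset names so the value is safe to embed in SQL.
    if (preg_match('/^[A-Za-z0-9_]+$/', $charset) !== 1) {
        throw new InvalidArgumentException("Invalid charset name: $charset");
    }
    return [
        // Encoding of the statements sent by the client
        "SET character_set_client=$charset",
        // Encoding the server converts statements to before executing them
        "SET character_set_connection=$charset",
        // Encoding of the result sets sent back to the client
        "SET character_set_results=$charset",
    ];
}

// Print the statements; in a real page each one would be sent to the server.
foreach (charset_session_statements('utf8') as $sql) {
    print($sql . "\n");
}
```

Each returned string could be passed to mysql_query($sql, $con) on an open connection. MySQL also offers a single statement, SET NAMES utf8, that sets the same three session variables at once.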
Here is my PHP script for the test Web page:
<?php #MySQL-Input-Chinese-UTF8.php
# Copyright (c) 2007 by Dr. Herong Yang, http://www.herongyang.com/
#
  print('<html><head>');
  print('<meta http-equiv="Content-Type"'.
    ' content="text/html; charset=utf-8"/>');
  print('</head><body>'."\n");

# Default input text
  $input = '电视机/電視機';
  $input_hex = 'E794B5E8A786E69CBA2FE99BBBE8A696E6A99F';

# Form submit detection
  $submit = isset($_REQUEST["Submit"]);

# Process form input data
  if ($submit) {
    if (isset($_REQUEST["Input"])) {
      $input = $_REQUEST["Input"];
    }
    $con = mysql_connect("localhost", "Herong", "TopSecret");
    $ok = mysql_select_db("HerongDB", $con);
    $test_name = "Input Chinese UTF-8";

# Set character_set_client and character_set_connection
    mysql_query("SET character_set_client=utf8", $con);
    mysql_query("SET character_set_connection=utf8", $con);

# Delete the record
    $sql = "DELETE FROM Comment_Mixed WHERE Test_Name ='$test_name'";
    mysql_query($sql, $con);

# Escape quotes and backslashes in the input before embedding it in SQL
    $input_sql = mysql_real_escape_string($input, $con);

# Build the SQL INSERT statement
    $sql = <<<END_OF_MESSAGE
INSERT INTO Comment_Mixed (Test_name, String_ASCII,
String_Latin1, String_UTF8, String_GBK, String_Big5)
VALUES ('$test_name', null, null, '$input_sql', null, null);
END_OF_MESSAGE;

# Run the SQL statement
    mysql_query($sql, $con);
    mysql_close($con);
  }

# Display form, with the input text escaped for use in an HTML attribute
  print('<form>');
  print('<input type="Text" size="40" maxlength="64"'
    . ' name="Input" value="'
    . htmlspecialchars($input, ENT_QUOTES, 'UTF-8').'"/><br/>');
  print('<input type="Submit" name="Submit" value="Submit"/>');
  print('</form>'."\n");

# Generate reply
  if ($submit) {
    $con = mysql_connect("localhost", "Herong", "TopSecret");
    $ok = mysql_select_db("HerongDB", $con);

# Set character_set_results
    mysql_query("SET character_set_results=utf8", $con);
    $sql = "SELECT * FROM Comment_Mixed"
      . " WHERE Test_Name = '$test_name'";
    $res = mysql_query($sql, $con);
    $output = 'SELECT failed.';
    if ($res) {
      if ($row = mysql_fetch_array($res)) {
        $output = $row['String_UTF8'];
      }
      mysql_free_result($res);
    }

# Text is HTML-escaped for display; HEX values show the raw bytes
    print('<pre>'."\n");
    print('Content-Type:'."\n");
    print(' text/html; charset=utf-8'."\n");
    print('You have submitted:'."\n");
    print(' Text = '.htmlspecialchars($input, ENT_QUOTES, 'UTF-8')."\n");
    print(' Text in HEX = '.strtoupper(bin2hex($input))."\n");
    print(' Default HEX = '.$input_hex."\n");
    print('Saved and retrieved from database:'."\n");
    print(' Text = '.htmlspecialchars($output, ENT_QUOTES, 'UTF-8')."\n");
    print(' Text in HEX = '.strtoupper(bin2hex($output))."\n");
    print('</pre>'."\n");
    mysql_close($con);
  }
  print('</body></html>');
?>
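The script above uses the mysql_* functions, which were deprecated in PHP 5.5 and removed in PHP 7.0. For newer PHP versions, the save step might be sketched with the mysqli extension as shown below. This is an assumption-laden sketch, not the tested script: the function name save_utf8_comment() is hypothetical, and it assumes the columns of Comment_Mixed that are left out of the INSERT accept NULL by default.

```php
<?php
// Hypothetical helper (not in the original script): stores one UTF-8
// string in Comment_Mixed using mysqli and a prepared statement.
function save_utf8_comment(mysqli $con, string $test_name, string $text): bool {
    // Sets character_set_client, character_set_connection and
    // character_set_results for this connection in one call.
    if (!mysqli_set_charset($con, 'utf8')) {
        return false;
    }
    // Assumption: the columns not listed here default to NULL.
    $stmt = mysqli_prepare($con,
        'INSERT INTO Comment_Mixed (Test_Name, String_UTF8) VALUES (?, ?)');
    if ($stmt === false) {
        return false;
    }
    // Both placeholders are bound as strings ("ss"), so the input text
    // is never pasted into the SQL statement itself.
    mysqli_stmt_bind_param($stmt, 'ss', $test_name, $text);
    $ok = mysqli_stmt_execute($stmt);
    mysqli_stmt_close($stmt);
    return $ok;
}

// Example call, commented out because it needs a running MySQL server:
// $con = mysqli_connect('localhost', 'Herong', 'TopSecret', 'HerongDB');
// save_utf8_comment($con, 'Input Chinese UTF-8', '电视机/電視機');
```

With a prepared statement the submitted text travels to the server separately from the SQL, so quotes in the input cannot change the statement.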
After moving this PHP script file to the Apache server document directory, I tested it with Internet Explorer (IE)
using this URL: http://localhost/MySQL-Input-Chinese-UTF8.php. I saw a Web page with a form that had the
suggested input text and a submit button.
The default Chinese input characters were displayed correctly.
After clicking the submit button, I saw a returned Web page with the same form and a reply section.
The Chinese input characters were received by PHP correctly. They were stored in the MySQL database and retrieved
back correctly.
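The reply section makes this check by printing the text as hexadecimal bytes. The same comparison can be done in code, as in the minimal sketch below. The helper names hex_dump() and is_valid_utf8() are hypothetical, and the sketch assumes the PHP source file itself is saved in UTF-8.

```php
<?php
// Hypothetical helper: upper-case hex dump of a string's raw bytes,
// in the same form the test page prints.
function hex_dump(string $text): string {
    return strtoupper(bin2hex($text));
}

// Hypothetical helper: true when the bytes form valid UTF-8.
// With the /u modifier, preg_match() returns false on malformed UTF-8.
function is_valid_utf8(string $text): bool {
    return preg_match('//u', $text) === 1;
}

// Assumes this source file is saved in UTF-8 encoding.
$input = '电视机/電視機';
$expected = 'E794B5E8A786E69CBA2FE99BBBE8A696E6A99F';

// Compare the actual bytes against the expected UTF-8 byte sequence.
var_dump(hex_dump($input) === $expected);
var_dump(is_valid_utf8($input));
```

If the text retrieved from the database produces the same hex string as the submitted text, no conversion damage happened on the way in or out.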
Conclusion: Chinese text can be entered on Web forms, received by PHP scripts, stored in a MySQL database,
and retrieved back to Web pages correctly in UTF-8 encoding.