Sending Non-ASCII Text to MySQL in Multiple Encoding

This section provides a test on sending non-ASCII character strings to MySQL server in multiple encoding.

As you can see from previous tests, when character_set_client and character_set_connection are set to a specific encoding, MySQL expects you to send non-ASCII text data with that encoding.

If you have a table of multiple columns with different encodings, you can not included non-ASCII text data for all columns in a single SQL INSERT statement and expect MySQL to handle it correctly on all columns. One way to solve this problem is to send non-ASCII text with multiple SQL UPDATE statements: one for each column. Before each UPDATE statement, reset character_set_client and character_set_connection to match the encoding of the next statement.

What I did in the next PHP script is to break the INSERT statement into multiple UPDATE statements. Each UPDATE statement sends non-ASCII text data only for one column with proper settings.

<?php 
#- MySQL-Send-Non-ASCII-Mixed.php
#- Copyright (c) 2015, HerongYang.com, All Rights Reserved.
#
  $con = mysql_connect("localhost", "Herong", "TopSecret");
  $ok = mysql_select_db("HerongDB", $con);
  $test_name = "Send Non-ASCII Mixed";

# Set character_set_results
  mysql_query("SET character_set_results=utf8", $con);

# Delete the record
  $sql = "DELETE FROM Comment_Mixed WHERE Test_Name ='$test_name'";
  mysql_query($sql, $con);
  print("\nNumber of rows deleted: ".mysql_affected_rows()."\n");

# Insert a record for ASCII column only
  $sql = <<<END_OF_MESSAGE
INSERT INTO Comment_Mixed (Test_name, String_ASCII)
  VALUES ('$test_name', 'Television')
END_OF_MESSAGE;
  mysql_query($sql, $con);
  print("\nNumber of rows inserted: ".mysql_affected_rows()."\n");

# Update the record for Latin1 column only
  mysql_query("SET character_set_client=latin1", $con);
  mysql_query("SET character_set_connection=latin1", $con);
  $sql = <<<END_OF_MESSAGE
UPDATE Comment_Mixed SET String_Latin1  = 'T?l?vision'
  WHERE Test_name = '$test_name'
END_OF_MESSAGE;
  mysql_query($sql, $con);
  print("\nNumber of rows updated: ".mysql_affected_rows()."\n");

# Update the record for UTF-8 column only
  mysql_query("SET character_set_client=utf8", $con);
  mysql_query("SET character_set_connection=utf8", $con);
  $sql = <<<END_OF_MESSAGE
UPDATE Comment_Mixed SET String_UTF8  = '电视机/電視機'
  WHERE Test_name = '$test_name'
END_OF_MESSAGE;
  mysql_query($sql, $con);
  print("\nNumber of rows updated: ".mysql_affected_rows()."\n");

# Update the record for GBK column only
  mysql_query("SET character_set_client=gbk", $con);
  mysql_query("SET character_set_connection=gbk", $con);
  $sql = <<<END_OF_MESSAGE
UPDATE Comment_Mixed SET String_GBK  = '???'
  WHERE Test_name = '$test_name'
END_OF_MESSAGE;
  mysql_query($sql, $con);
  print("\nNumber of rows updated: ".mysql_affected_rows()."\n");

# Update the record for Big5 column only
  mysql_query("SET character_set_client=big5", $con);
  mysql_query("SET character_set_connection=big5", $con);
  $sql = <<<END_OF_MESSAGE
UPDATE Comment_Mixed SET String_Big5  = '???'
  WHERE Test_name = '$test_name'
END_OF_MESSAGE;
  mysql_query($sql, $con);
  print("\nNumber of rows updated: ".mysql_affected_rows()."\n");

# Get the recod back
  $sql = "SELECT * FROM Comment_Mixed"
    . " WHERE Test_Name = '$test_name'";
  $res = mysql_query($sql, $con);
  if ($row = mysql_fetch_array($res)) {
    print("\nTest Name = ".$row['Test_Name']."\n");
    print("   String_ASCII: 0x".bin2hex($row['String_ASCII'])."\n");
    print("   String_Latin1: 0x".bin2hex($row['String_Latin1'])."\n");
    print("   String_UTF8: 0x".bin2hex($row['String_UTF8'])."\n");
    print("   String_GBK: 0x".bin2hex($row['String_GBK'])."\n");
    print("   String_Big5: 0x".bin2hex($row['String_Big5'])."\n");
  }  
  mysql_free_result($res);

  mysql_close($con); 
?>

Note that string literals in the source code above will not be displayed properly, because this book uses UTF-8 encoding.

The result is perfect:

C:\>\local\php\php MySQL-Send-Non-ASCII-Mixed.php

Number of rows deleted: 0
Number of rows inserted: 1
Number of rows updated: 1
Number of rows updated: 1
Number of rows updated: 1
Number of rows updated: 1

Test Name = Send Non-ASCII Mixed
   String_ASCII: 0x54656c65766973696f6e
   String_Latin1: 0x54c3a96cc3a9766973696f6e
   String_UTF8: 0xe794b5e8a786e69cba2fe99bbbe8a696e6a99f
   String_GBK: 0xe794b5e8a786e69cba
   String_Big5: 0xe99bbbe8a696e6a99f

Last update: 2015.

Table of Contents

 About This Book

 PHP Installation on Windows Systems

 Integrating PHP with Apache Web Server

 charset="*" - Encodings on Chinese Web Pages

 Chinese Characters in PHP String Literals

 Multibyte String Functions in UTF-8 Encoding

 Input Text Data from Web Forms

 Input Chinese Text Data from Web Forms

 MySQL - Installation on Windows

 MySQL - Connecting PHP to Database

 MySQL - Character Set and Encoding

MySQL - Sending Non-ASCII Text to MySQL

 Sending Non-ASCII Text to MySQL in Latin1 Encoding

 Sending Non-ASCII Text to MySQL in UTF8 Encoding

 Sending Non-ASCII Text to MySQL in GBK Encoding

 Sending Non-ASCII Text to MySQL in Big5 Encoding

Sending Non-ASCII Text to MySQL in Multiple Encoding

 Summary - Sending and Fetching Non-ASCII Text

 Retrieving Chinese Text from Database to Web Pages

 Input Chinese Text Data to MySQL Database

 References

 PDF Printing Version