Building Chinese Web Sites using PHP
Dr. Herong Yang, Version 2.11

Transmitting Non-ASCII Characters between PHP and MySQL

This section provides information on how non-ASCII characters are transmitted between PHP and MySQL.

To understand why the test in the previous tutorial failed, we need to understand how data where transmitted between PHP and MySQL.

Based on the manual, MySQL communicates with client programs, like PHP scripts or SQL Monitor, in character mode and applies encoding conversions. Here is a simple diagram that shows how MySQL manages the character conversion while communicating with client programs:

Client program sends SQL statement
   |
   | Encoding: A, defined as "character_set_client"
   v
MySQL server - Convertion from encoding A to encoding B
   |
   | Encoding: B, defined as "character_set_connection"
   v
MySQL server - Execution to store data
MySQL server - Conversion from encoding B to encoding C
   |
   | Encoding: C, defined by text column encoding 
   v
MySQL server - Storage
...
MySQL server - Storage
   |
   | Encoding: C, defined by text column encoding
   v
MySQL server - Execution to fetch data
MySQL server - Convertion from encoding C to encoding D
   |  
   | Encoding: D, defined as "character_set_results"
   v
Client program receives result set

As you can see from this diagram, there are 4 character set encodings, 3 conversion processes. involved from the point where a client program sends a SQL statement to the point where the client program receives the result data:

  • Encoding A - The character set encoding used by the client program to encode the SQL statement string. The client program should tell MySQL server what is encoding A by setting the MySQL session variable, "character_set_client".
  • Encoding B - The character set encoding used by MySQL to encode the received SQL statement before execution. The client program should suggest MySQL server what encoding B to use by setting the MySQL session variable, "character_set_connection".
  • Conversion from encoding A to encoding B - MySQL performs this conversion on the received SQL statement before execution.
  • Encoding C - The character set encoding used by MySQL to encode the text data for a given table column. The client program defines what encoding C to use when defining table columns.
  • Conversion from encoding B to encoding C - MySQL performs this conversion on any text data included in the SQL statement when storing it to a given table column.
  • Encoding D - The character set encoding used by MySQL to encode the result set returned the SQL statement. The client program should tell MySQL server what is encoding D by setting the MySQL session variable, "character_set_results".
  • Conversion from encoding C to encoding D - MySQL performs this conversion on any text data fetch from a given table column to the returning result set string.

Obviously, if you want to ensure your text data to correctly transmitted, stored, and fetched into and from MySQL server, you must properly set encodings A, B, C, and D. See the next section for detail discussions.

Sections in This Chapter

Specifying Character Set for Text Columns

Creating a Table with Multiple Character Sets

Checking Character Set Setting

Storing ASCII Characters in Non-ASCII Columns

Storing Non-ASCII Characters with Encoded Bytes

Transmitting Non-ASCII Characters between PHP and MySQL

Viewing Character Set Variables

Non-ASCII Test Result Analysis

Fetching Non-ASCII Text from MySQL

Dr. Herong Yang, updated in 2007
Transmitting Non-ASCII Characters between PHP and MySQL