PHP Tutorials - Herong's Tutorial Notes
Dr. Herong Yang, Version 2.21

Non ASCII Characters with MySQL

This chapter explains:

  • Storing Non ASCII Characters in Database
  • Transmitting Non ASCII Characters to the Server
  • MySqlUnicode.php - UTF-8 Sample Script

Storing Non ASCII Characters in Database

MySQL can store non ASCII characters in a database in a number of encodings, which MySQL calls character sets:

+----------+-----------------------------+
| Charset  | Description                 |
+----------+-----------------------------+
| big5     | Big5 Traditional Chinese    |
| dec8     | DEC West European           |
| cp850    | DOS West European           |
| hp8      | HP West European            |
| koi8r    | KOI8-R Relcom Russian       |
| latin1   | ISO 8859-1 West European    |
| latin2   | ISO 8859-2 Central European |
| swe7     | 7bit Swedish                |
| ascii    | US ASCII                    |
| ujis     | EUC-JP Japanese             |
| sjis     | Shift-JIS Japanese          |
| cp1251   | Windows Cyrillic            |
| hebrew   | ISO 8859-8 Hebrew           |
| tis620   | TIS620 Thai                 |
| euckr    | EUC-KR Korean               |
| koi8u    | KOI8-U Ukrainian            |
| gb2312   | GB2312 Simplified Chinese   |
| greek    | ISO 8859-7 Greek            |
| cp1250   | Windows Central European    |
| gbk      | GBK Simplified Chinese      |
| latin5   | ISO 8859-9 Turkish          |
| armscii8 | ARMSCII-8 Armenian          |
| utf8     | UTF-8 Unicode               |
| ucs2     | UCS-2 Unicode               |
| cp866    | DOS Russian                 |
| keybcs2  | DOS Kamenicky Czech-Slovak  |
| macce    | Mac Central European        |
| macroman | Mac West European           |
| cp852    | DOS Central European        |
| latin7   | ISO 8859-13 Baltic          |
| cp1256   | Windows Arabic              |
| cp1257   | Windows Baltic              |
| binary   | Binary pseudo charset       |
| geostd8  | GEOSTD8 Georgian            |
+----------+-----------------------------+
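
A list like the one above can be retrieved from a running server with the SHOW CHARACTER SET statement. The sketch below assumes the mysqli extension is available and prints one line per character set. The function names formatCharsetRow and printCharsets are hypothetical and not part of the original notes.

```php
<?php
// Format one row of SHOW CHARACTER SET output as a text line.
// "Charset", "Description" and "Maxlen" are column names
// returned by that statement.
function formatCharsetRow(array $row): string {
    return sprintf("%-10s %s (max %d bytes/char)",
        $row["Charset"], $row["Description"], $row["Maxlen"]);
}

// Print all character sets supported by the connected server.
// $con is an open mysqli connection supplied by the caller.
function printCharsets(mysqli $con): void {
    $res = $con->query("SHOW CHARACTER SET");
    if ($res === false) {
        throw new RuntimeException("Query failed: " . $con->error);
    }
    while ($row = $res->fetch_assoc()) {
        echo formatCharsetRow($row), "\n";
    }
    $res->free();
}
```

A call such as printCharsets(new mysqli("localhost", "user", "password")) would print the list, where the host, user name, and password are placeholders to be replaced with real connection settings.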

To store non ASCII characters in a database column, the column must be defined with a character set that can represent them. You can specify a character set at three levels: database, table, or column. For example:

   CREATE DATABASE db_name CHARACTER SET utf8;
   CREATE TABLE tbl_name (...) CHARACTER SET utf8;
   CREATE TABLE tbl_name (col_name CHAR(80) CHARACTER SET utf8, ...);

  • If a character set is specified at the database level, it is the default for all CHAR, VARCHAR, and TEXT columns of all tables in this database.
  • If a character set is specified at the table level, it is the default for all CHAR, VARCHAR, and TEXT columns in this table.
  • If a character set is specified at the column level, it is applied to this column only.
  • A setting at a lower level overrides the setting at a higher level: column over table, table over database.
  • The column length specified in the table creation statement is counted in characters, not in encoded bytes.
  • If "utf8" is used on a CHAR(n) column, this column can require up to 3*n bytes of storage, because MySQL's "utf8" uses at most 3 bytes per character.
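
The difference between character length and byte length can be checked without a database. The sketch below counts the characters and bytes of a UTF-8 string in plain PHP and computes the worst-case storage of a CHAR(n) column, assuming at most 3 bytes per character as in MySQL's "utf8" character set. The function names utf8CharCount and utf8CharColumnBytes are hypothetical, and the \u{...} escapes require PHP 7 or later.

```php
<?php
// Count characters in a UTF-8 string by counting every byte that is
// not a continuation byte (continuation bytes have the form 10xxxxxx).
function utf8CharCount(string $s): int {
    $count = 0;
    $len = strlen($s);   // strlen() returns the number of bytes
    for ($i = 0; $i < $len; $i++) {
        if ((ord($s[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

// Worst-case bytes needed by a CHAR($n) column,
// assuming at most 3 bytes per character.
function utf8CharColumnBytes(int $n): int {
    return 3 * $n;
}

// "PHP " followed by two Chinese characters (U+4F60, U+597D),
// each of which takes 3 bytes in UTF-8.
$text = "PHP \u{4F60}\u{597D}";
echo "Characters: ", utf8CharCount($text), "\n";   // prints 6
echo "Bytes: ", strlen($text), "\n";               // prints 10
echo "CHAR(80) worst case: ", utf8CharColumnBytes(80), " bytes\n";   // prints 240
```

The sample string has 6 characters, so it fits in a CHAR(6) column even though it occupies 10 bytes when encoded.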


Dr. Herong Yang, updated in 2006