Non-ASCII Characters with MySQL
This chapter explains:
- Storing Non-ASCII Characters in a Database
- Transmitting Non-ASCII Characters to the Server
- MySqlUnicode.php - UTF-8 Sample Script
Storing Non-ASCII Characters in a Database
MySQL can store non-ASCII characters in a number of encodings, which MySQL calls character sets:
+----------+-----------------------------+
| Charset  | Description                 |
+----------+-----------------------------+
| big5     | Big5 Traditional Chinese    |
| dec8     | DEC West European           |
| cp850    | DOS West European           |
| hp8      | HP West European            |
| koi8r    | KOI8-R Relcom Russian       |
| latin1   | ISO 8859-1 West European    |
| latin2   | ISO 8859-2 Central European |
| swe7     | 7bit Swedish                |
| ascii    | US ASCII                    |
| ujis     | EUC-JP Japanese             |
| sjis     | Shift-JIS Japanese          |
| cp1251   | Windows Cyrillic            |
| hebrew   | ISO 8859-8 Hebrew           |
| tis620   | TIS620 Thai                 |
| euckr    | EUC-KR Korean               |
| koi8u    | KOI8-U Ukrainian            |
| gb2312   | GB2312 Simplified Chinese   |
| greek    | ISO 8859-7 Greek            |
| cp1250   | Windows Central European    |
| gbk      | GBK Simplified Chinese      |
| latin5   | ISO 8859-9 Turkish          |
| armscii8 | ARMSCII-8 Armenian          |
| utf8     | UTF-8 Unicode               |
| ucs2     | UCS-2 Unicode               |
| cp866    | DOS Russian                 |
| keybcs2  | DOS Kamenicky Czech-Slovak  |
| macce    | Mac Central European        |
| macroman | Mac West European           |
| cp852    | DOS Central European        |
| latin7   | ISO 8859-13 Baltic          |
| cp1256   | Windows Arabic              |
| cp1257   | Windows Baltic              |
| binary   | Binary pseudo charset       |
| geostd8  | GEOSTD8 Georgian            |
+----------+-----------------------------+
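The set of available character sets depends on the server version and build, so the list above may not match your installation exactly. As a small sketch, the following statements ask a running server what it supports; the LIKE pattern is only an illustration.

```sql
-- List every character set the connected server supports.
SHOW CHARACTER SET;

-- Narrow the list to the Unicode-related entries whose names start with "utf8".
SHOW CHARACTER SET LIKE 'utf8%';
```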
To store non-ASCII characters in a database column, you need to define that column with
a character set that can represent them. You can specify a character set at three levels: database, table, and column.
For example:
CREATE DATABASE db_name CHARACTER SET utf8
CREATE TABLE tbl_name (...) CHARACTER SET utf8
CREATE TABLE tbl_name (col_name CHAR(80) CHARACTER SET utf8, ...)
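To see the three levels working together, here is a minimal sketch. The names demo_db, demo_tbl, title, and note are hypothetical and do not come from this chapter. The table-level setting overrides the database default, and the column-level setting overrides the table default.

```sql
-- Hypothetical names: demo_db, demo_tbl, title, note.
CREATE DATABASE demo_db CHARACTER SET utf8;
USE demo_db;

-- The table default is latin1; the note column overrides it with utf8.
CREATE TABLE demo_tbl (
  title CHAR(80),
  note  CHAR(80) CHARACTER SET utf8
) CHARACTER SET latin1;

-- Inspect the result. The Collation column of SHOW FULL COLUMNS starts
-- with the name of the character set that each column actually uses.
SHOW CREATE TABLE demo_tbl;
SHOW FULL COLUMNS FROM demo_tbl;
```

Recent MySQL versions may report utf8 under the name utf8mb3 in this output.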
- If a character set is specified at the database level, it becomes the default for all CHAR, VARCHAR,
and TEXT columns of tables created in this database, unless a table or column specifies its own.
- If a character set is specified at the table level, it becomes the default for all CHAR, VARCHAR,
and TEXT columns in this table, unless a column specifies its own.
- If a character set is specified at the column level, it is applied to this column only.
- The column length specified in the table creation statement is counted in characters,
not in encoded bytes.
- If "utf8" is used on a CHAR(n) column, this column may require up to 3*n bytes of storage,
because MySQL's "utf8" encodes each character in at most 3 bytes.
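The difference between character count and byte count can be checked directly with the CHAR_LENGTH() and LENGTH() functions. The sketch below uses a hexadecimal literal with the _utf8 introducer, so the result does not depend on how the client connection encodes typed text. The bytes C3 A9 encode "é" and E2 82 AC encode "€"; both characters are chosen only for illustration.

```sql
-- Two characters stored as five bytes in utf8:
-- "é" takes 2 bytes and "€" takes 3 bytes.
SELECT CHAR_LENGTH(_utf8 x'C3A9E282AC') AS chars,  -- 2
       LENGTH(_utf8 x'C3A9E282AC')      AS bytes;  -- 5
```

A CHAR(2) column with the utf8 character set can hold this value, because the declared length counts characters, even though the value occupies 5 bytes.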