Building Chinese Web Sites using PHP
Dr. Herong Yang, Version 2.11

php_mbstring.dll - Multibyte String Functions

This section describes how to configure PHP to load php_mbstring.dll to support multibyte string functions for UTF-8 encoding.

PHP's built-in string functions do allow you to store Chinese character strings in PHP string variables using UTF-8, GB18030, or Big5 encoding. But if you want to manipulate them (trim, split, take substrings, count, etc.), you need to treat them as binary strings of encoded bytes, not as strings of characters.
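As a rough illustration of the byte-oriented behavior described above, here is a minimal sketch. The variable names are made up for this example, and the test string is written as explicit UTF-8 bytes so that the result does not depend on how the script file itself is saved. It also assumes the built-in functions are not overloaded by any mbstring setting.

```php
<?php
// "中文" (two Chinese characters) written as explicit UTF-8 bytes,
// so the example does not depend on the encoding of this source file.
$text = "\xE4\xB8\xAD\xE6\x96\x87";

// strlen() counts bytes, not characters: 2 characters x 3 bytes each.
$byteCount = strlen($text);

// substr() also works on bytes, so asking for a length of 1 cuts the
// first character apart and returns a single, incomplete byte.
$firstByte = substr($text, 0, 1);

// Asking for 3 bytes happens to return the whole first character.
$firstThreeBytes = substr($text, 0, 3);

echo $byteCount, "\n";                 // prints 6
echo bin2hex($firstByte), "\n";        // prints e4
echo bin2hex($firstThreeBytes), "\n";  // prints e4b8ad
```

The script has to know that each of these characters takes 3 bytes in UTF-8 to get a usable substring, which is exactly the bookkeeping that byte-oriented functions leave to you.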

If you want to manipulate Chinese character strings as characters, you need to load the PHP extension module, php_mbstring.dll. This tutorial shows you how to load and configure php_mbstring.dll for UTF-8 encoding.
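To show what character-oriented manipulation looks like once the extension is available, here is a small sketch. It assumes php_mbstring.dll is already loaded and passes 'UTF-8' explicitly to each call, so it does not rely on any php.ini encoding setting. The variable names are illustrative only.

```php
<?php
// Stop early if the mbstring extension has not been loaded.
if (!extension_loaded('mbstring')) {
    exit("mbstring is not loaded\n");
}

// "中文" written as explicit UTF-8 bytes (3 bytes per character).
$text = "\xE4\xB8\xAD\xE6\x96\x87";

// mb_strlen() counts characters in the given encoding.
$charCount = mb_strlen($text, 'UTF-8');

// mb_substr() takes its offset and length in characters, so a length
// of 1 returns the complete first character.
$firstChar = mb_substr($text, 0, 1, 'UTF-8');

echo $charCount, "\n";           // prints 2
echo bin2hex($firstChar), "\n";  // prints e4b8ad
```

Passing the encoding as an argument keeps the example independent of configuration. With the internal encoding set to UTF-8, the last argument can usually be omitted.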

1. Check for the PHP configuration file, \local\php\php.ini. If it does not exist, copy it from \local\php\php.ini-dist:

C:\> copy \local\php\php.ini-dist \local\php\php.ini

2. Open \local\php\php.ini in a text editor, like Notepad.

3. Change the settings to allow PHP to load php_mbstring.dll from the \local\php\ext directory:

...
;extension_dir = "./"
extension_dir = "./ext"
...
;extension=php_mbstring.dll
extension=php_mbstring.dll
...

4. Change the settings to set the default encoding to UTF-8 for all languages:

...
;mbstring.language = Japanese
mbstring.language = Neutral
...
;mbstring.internal_encoding = EUC-JP
mbstring.internal_encoding = UTF-8
...

There are many other mbstring settings in the configuration file. You can leave them as they are, because they do not affect the basic multibyte functions. We will review them later in this book.
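One way to confirm that the edited php.ini was picked up is a short check script like the sketch below. It only reports what PHP sees at run time. The exact values printed depend on your installation, so no expected output is shown, and the variable name is made up for this example.

```php
<?php
// Report whether php_mbstring.dll was loaded from php.ini.
$loaded = extension_loaded('mbstring');
echo 'mbstring loaded: ', ($loaded ? 'yes' : 'no'), "\n";

// Directory PHP searches for extension modules, as configured.
echo 'extension_dir: ', ini_get('extension_dir'), "\n";

if ($loaded) {
    // Language and internal encoding that mbstring is currently using.
    echo 'mb_language(): ', mb_language(), "\n";
    echo 'mb_internal_encoding(): ', mb_internal_encoding(), "\n";
}
```

If the first line reports "no", the usual suspects are the extension_dir value and a leftover semicolon in front of the extension=php_mbstring.dll line.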

Sections in This Chapter

php_mbstring.dll - Multibyte String Functions

mb_strlen() - Counting Multibyte Characters

List of Multibyte String Functions

Dr. Herong Yang, updated in 2007