This section provides information on handling Chinese character string literals in Big5 encoding.
Chinese character strings should normally use UTF-8 encoding. But if you have to use Big5 encoding for some reason,
you can treat PHP strings as binary strings and store the Big5 bytes of your Chinese characters in them.
In order to output Chinese characters to Web pages and display them correctly, you need to:
Enter Chinese characters in string literals in PHP scripts in Big5 encoding.
Handle Chinese character strings with normal string functions.
Output Chinese character strings to Web pages with the echo or print statement.
Set charset=big5 in the HTML document header.
Make sure that PHP script files are saved in Big5 encoding.
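As a minimal sketch of what "normal string functions" means for Big5 data, the snippet below treats a Big5 string as plain bytes. It assumes that the four bytes A4 A4 A4 E5 are the Big5 encoding of the two characters 中文; they are written as hex escapes so that the listing stays ASCII-only and does not depend on how the file is saved. The variable names are illustrative only.

```php
<?php
// Assumption: "\xA4\xA4\xA4\xE5" is the Big5 byte sequence for the two
// characters U+4E2D U+6587. Hex escapes keep this listing ASCII-only.
$text = "\xA4\xA4\xA4\xE5";

// strlen() counts bytes, not characters. Each Chinese character in
// Big5 takes 2 bytes, so two characters give a length of 4.
$byteCount = strlen($text);

// substr() also works on byte offsets, so cut on 2-byte boundaries
// to avoid splitting a character in the middle.
$firstChar = substr($text, 0, 2);

// Concatenation is safe because it never splits a character.
$line = $text . '<br/>';

print($byteCount . "\n");
print(bin2hex($firstChar) . "\n");
```

Because PHP does not track the encoding of a string, it is up to the script to keep offsets and lengths aligned with character boundaries whenever it cuts or measures Big5 text.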
Here is a simple test I did on my local system:
1. Run my Chinese text editor that supports Big5 encoding.
2. Enter the following PHP script file:
<?php #String-Big5.php
# Copyright (c) 2007 by Dr. Herong Yang, http://www.herongyang.com/
#
$help = '?????????????';
print('<html>');
print('<meta http-equiv="Content-Type"'.
' content="text/html; charset=big5"/>');
print('<body>');
print('<b>Chinese string in Big5 in PHP</b><br/>');
print($help.'<br/>');
print('</body>');
print('</html>');
?>
The question marks (?) in the source code listed above stand in for the Chinese characters:
this book uses UTF-8 encoding, so Big5 encoded characters cannot be included here.
3. Save the file as String-Big5.php in Big5 encoding.
In my Chinese text editor, I had to select the Big5 option as the "Save as type" to ensure my document was saved in Big5 encoding.
Like many other Chinese text editors, it supports multiple encodings. If you are not careful, the document could be saved in the wrong encoding.
4. Copy String-Big5.php to \local\apache\htdocs.
5. Now run Internet Explorer (IE) with http://localhost/String-Big5.php.
You should see the Chinese characters displayed correctly.
This shows that the editor (notepad), the CGI program (PHP CGI), the Web server (Apache), and the Web browser (IE)
all worked correctly with Chinese characters in Big5 encoding.
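If you want to repeat this test without a Big5-capable editor, one possible variation is to write the Chinese string with hex escapes, so the script file itself contains only ASCII. The sketch below is not the original String-Big5.php: the helper name build_big5_page() is hypothetical, and the bytes A4 A4 A4 E5 are assumed to be the Big5 encoding of 中文. It also sends the charset in the HTTP header, since browsers generally give an HTTP Content-Type charset priority over the meta tag.

```php
<?php
// Hypothetical helper (not part of the original script): builds the
// test page as a byte string. $big5Text must already be Big5 bytes.
function build_big5_page($big5Text) {
    return '<html><head>'
        . '<meta http-equiv="Content-Type"'
        . ' content="text/html; charset=big5"/>'
        . '</head><body>'
        . '<b>Chinese string in Big5 in PHP</b><br/>'
        . $big5Text . '<br/>'
        . '</body></html>';
}

// Assumption: these four bytes are the Big5 encoding of U+4E2D U+6587.
$help = "\xA4\xA4\xA4\xE5";

// Declare the charset in the HTTP header too, but only while headers
// can still be sent (that is, before any output has started).
if (!headers_sent()) {
    header('Content-Type: text/html; charset=big5');
}
print(build_big5_page($help));
```

With this approach the "save in Big5 encoding" step no longer matters for the script file, at the cost of string literals that are harder to read.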