PHP Tutorials - Herong's Tutorial Notes
Dr. Herong Yang, Version 2.21

Non ASCII Characters as String Literals

(Continued from previous part...)

Chinese Characters in String Literals - GB2312 Encoding

I think we are now ready to test Chinese characters in PHP scripts with the GB2312 encoding scheme.

1. This time, we cannot use Notepad, because Notepad is not compatible with GB2312 encoding: it converts GB2312-encoded text to UTF-8. So don't use Notepad.

You need another text editor, such as Jext, that lets you enter and save Chinese characters in GB2312 encoding.

2. In a good text editor, enter the following PHP script:

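<?php #HelpGb2312Chinese.php
# Copyright (c) 2005 by Dr. Herong Yang, http://www.herongyang.com/
#
# This file must be saved in GB2312 encoding to match the charset declared below.
   print('<html>');
   # Tell the browser that the page is encoded in GB2312.
   print('<meta http-equiv="Content-Type"'.
      ' content="text/html; charset=gb2312"/>');
   print('<body>');
   print('<b>说明</b><br/>');                  # "Instructions"
   print("这是一份非常简单的说明书…<br/>");    # "This is a very simple manual..."
   print('</body>');
   print('</html>');
?>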

Be careful: when you read the above code in this book, the Chinese characters may not be displayed correctly. The reason, again, is that my book is written in ISO-8859-1 encoding.

3. Entering Chinese characters in GB2312 encoding also requires a Chinese input tool. If you don't have one, you can go to my GB2312 page, http://www.herongyang.com/gb2312_gb/, open the source code of the page, copy some Chinese characters, and paste them into the editor. That page is encoded in GB2312. Warning: do not copy Chinese characters from the IE browser window. The browser's copy function assumes UTF-8 encoding and will corrupt the copied characters.

4. Select menu File > Save as. Enter the file name as HelpGb2312Chinese.php and click the Save button.

5. Copy HelpGb2312Chinese.php to c:\inetpub\wwwroot. Make sure Internet Information Services (IIS) is running the local default Web site.

6. Now run Internet Explorer (IE) with http://localhost/HelpGb2312Chinese.php. You should see the Chinese characters displayed correctly.

7. In the IE window, select menu View > Encoding. You should see that GB2312 is selected.
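
The script above declares its encoding with a meta tag. Another common way, which the original script does not use, is to send the same information in the HTTP Content-Type response header through PHP's header() function. The sketch below only illustrates that alternative; the helper name contentTypeHeader is hypothetical.

```php
<?php
// Hypothetical helper: builds the Content-Type header line for a charset.
function contentTypeHeader(string $charset): string {
    return 'Content-Type: text/html; charset=' . $charset;
}

// header() must be called before any output is sent.
// It is skipped on the command line, where there is no HTTP response.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header(contentTypeHeader('gb2312'));
}
```

If both the header and the meta tag are present, they should name the same charset.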

Still not hard to do, right? The key point is to use an editor that is compatible with GB2312.
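
If you are not sure whether your editor really saved the file in GB2312, you can inspect the bytes. The sketch below is a rough check, not a full validator. It assumes the usual two-byte form of GB2312 (lead byte 0xA1 to 0xF7, trail byte 0xA1 to 0xFE) and tests byte structure only. The function name looksLikeGb2312 is hypothetical.

```php
<?php
// Returns true when every byte is ASCII or part of a two-byte pair in the
// ranges used by GB2312 (lead 0xA1-0xF7, trail 0xA1-0xFE).
// Structure check only: it does not confirm each code is an assigned character.
function looksLikeGb2312(string $bytes): bool {
    $n = strlen($bytes);
    for ($i = 0; $i < $n; $i++) {
        $b = ord($bytes[$i]);
        if ($b < 0x80) {
            continue;                 // plain ASCII byte
        }
        if ($b < 0xA1 || $b > 0xF7 || $i + 1 >= $n) {
            return false;             // invalid lead byte or truncated pair
        }
        $t = ord($bytes[++$i]);
        if ($t < 0xA1 || $t > 0xFE) {
            return false;             // invalid trail byte
        }
    }
    return true;
}

// The two characters from the bold title in the script, written as byte
// escapes so this example does not depend on how it is saved.
$gb2312 = "\xCB\xB5\xC3\xF7";          // GB2312 bytes
$utf8   = "\xE8\xAF\xB4\xE6\x98\x8E";  // the same characters in UTF-8

var_dump(looksLikeGb2312($gb2312));    // bool(true)
var_dump(looksLikeGb2312($utf8));      // bool(false)
```

To check a saved script, pass it the file contents, for example looksLikeGb2312(file_get_contents('HelpGb2312Chinese.php')). Some UTF-8 text can pass this test by coincidence, so treat a true result as a hint only.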

Characters of Multiple Languages in String Literals

After going through the above examples, you should now feel comfortable handling non-ASCII characters of any single language. You have a choice between UTF-8 and a language-specific encoding.
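
To make that choice concrete, here is a small sketch showing that the same character becomes different bytes in different encodings. It uses the letter "é" in ISO-8859-1 and in UTF-8 as an illustration; the variable names are my own.

```php
<?php
// The letter "é" written as explicit byte escapes, so the example does not
// depend on the encoding this file is saved in.
$latin1 = "\xE9";        // ISO-8859-1: one byte
$utf8   = "\xC3\xA9";    // UTF-8: two bytes

// PHP strings are byte sequences, so strlen() counts bytes, not characters.
printf("ISO-8859-1: %s (%d byte)\n", bin2hex($latin1), strlen($latin1));
printf("UTF-8:      %s (%d bytes)\n", bin2hex($utf8), strlen($utf8));
```

This is why the charset declared to the browser must match the encoding the script file was actually saved in.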

If you want to have characters of multiple languages in a single PHP script, then you have to use UTF-8 encoding. Here are the steps you can follow to make a PHP script in UTF-8 for a number of languages.

1. On a Windows system, run Start > All Programs > Accessories > Notepad.

2. In Notepad, enter the following PHP script:

<?php #HelpUtf8MultiLanguages.php
# Copyright (c) 2005 by Dr. Herong Yang, http://www.herongyang.com/
#
   print('<html>');
   print('<meta http-equiv="Content-Type"'.
      ' content="text/html; charset=utf-8"/>');
   print('<body>');
   print('<b>Test</b><br/>');
   print('English: Hello world!<br/>');
   print('Spanish: ola mundo!<br/>');
   print('Korean: ???? ?? !<br/>');
   print('Chinese: ????!<br/>');
   print('</body>');
   print('</html>');
?>

Again, you will see some "?" characters in the above source code in this book. This is because my book uses ISO-8859-1 encoding.
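
Before testing a script saved as UTF-8, it can help to check its raw bytes. The sketch below assumes two things that the steps above do not cover: that the editor may have written a byte order mark (BOM) at the start of the file, as some Windows editors do when saving as UTF-8, and that preg_match() with the /u modifier is an acceptable way to test UTF-8 validity. The function name inspectUtf8 is hypothetical.

```php
<?php
// Checks raw bytes for a leading UTF-8 byte order mark (EF BB BF)
// and for overall UTF-8 validity.
function inspectUtf8(string $bytes): array {
    $hasBom = strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
    // With the /u modifier, preg_match() fails on input that is not valid UTF-8.
    $isValid = preg_match('//u', $bytes) === 1;
    return ['bom' => $hasBom, 'valid_utf8' => $isValid];
}

$withBom = "\xEF\xBB\xBF" . "Spanish: \xC2\xA1Hola mundo!";  // BOM plus UTF-8 text
$gbBytes = "\xCB\xB5\xC3\xF7";                               // GB2312 bytes, not valid UTF-8

var_dump(inspectUtf8($withBom));
var_dump(inspectUtf8($gbBytes));
```

For a real script, pass in file_get_contents('HelpUtf8MultiLanguages.php'). A BOM in front of the opening PHP tag is sent to the browser as output, which matters if the script later needs to send header lines.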

(Continued on next part...)

Dr. Herong Yang, updated in 2006