PHP Tutorials - Herong's Tutorial Notes
Dr. Herong Yang, Version 2.21

Receiving Non ASCII Characters from Input Forms



(Continued from previous part...)

Now enter the following input strings on InputIsoGetDecoded.php to see what happens:

English ASCII: Hello world!
Spanish UTF-8: ¡Hola mundo!
Korean UTF-8: 여보세요 세계 !
Chinese UTF-8: 你好世界!
Chinese GB2312: ÊÀ½çÄãºÃ£¡

If you click the submit button, you will get:

Input strings before decoding:
English = (Hello world!)
Spanish = (¡Hola mundo!)
Korean = (&#50668;&#48372;&#49464;&#50836; &#49464;&#44228; !)
ChineseUtf8 = (&#20320;&#22909;&#19990;&#30028;!)
ChineseGb2312 = (ÊÀ½çÄãºÃ£¡)
submit = (Submit)
------
Input strings after decoding:
English = (Hello world!)
Spanish = (¡Hola mundo!)
Korean = (여보세요 세계 !)
ChineseUtf8 = (你好世界!)
ChineseGb2312 = (ÊÀ½çÄãºÃ£¡)
submit = (Submit)

The first section shows the input strings exactly as they are received, with HTML entity encoding still in place. The second section shows the same strings after they are decoded from HTML entity encoding to UTF-8 encoding.
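The decoding step described above can be sketched in PHP as follows. This is only a minimal sketch, not the InputIsoGetDecoded.php script itself: the function name decodeIsoFormInput() is hypothetical, and the code assumes that the form page was served as ISO-8859-1 and that the mbstring extension is available.

```php
<?php
// Hypothetical helper for illustration; not taken from the tutorial's script.
// Assumption: the raw input came from a form page served as ISO-8859-1.
function decodeIsoFormInput(string $raw): string
{
    // Bytes above 0x7F are ISO-8859-1 characters; re-encode them as UTF-8 first.
    $utf8 = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');

    // Then turn numeric entities such as &#50668; into UTF-8 characters.
    return html_entity_decode($utf8, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

// Numeric entities for the Korean test string, as a browser would send them.
$received = "&#50668;&#48372;&#49464;&#50836; &#49464;&#44228; !";
echo decodeIsoFormInput($received), "\n";
```

Note that html_entity_decode() also decodes entities the user typed literally, such as "&amp;", so the decoded string is not always identical to what the user entered.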

Conclusion

  • How non ASCII characters entered on a Web page are encoded depends on the "charset" setting of the page.
  • URL encoding is applied when input strings are transferred to the server.
  • The PHP CGI module applies URL decoding when parsing input strings into $_REQUEST.
  • My suggestion is to use "charset=utf-8" for your input pages. Then there is no need to worry about HTML entity conversion.
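The URL encoding and decoding steps in the list above can be tried out with a short script. This is a minimal sketch, assuming the input page uses "charset=utf-8", so the browser sends the UTF-8 bytes of each character. The field name Spanish and the variable names are illustrative only, and parse_str() is used here as a stand-in for the parsing that PHP performs itself when it fills $_REQUEST.

```php
<?php
// UTF-8 bytes of the Spanish test string; \xC2\xA1 is the inverted exclamation mark.
$input = "\xC2\xA1Hola mundo!";

// Roughly what a browser does when it submits a form field with method GET.
$query = 'Spanish=' . urlencode($input);
echo $query, "\n";            // Spanish=%C2%A1Hola+mundo%21

// Stand-in for PHP's own parsing of the query string into $_REQUEST.
parse_str($query, $vars);
echo $vars['Spanish'], "\n";  // the original UTF-8 string again
```

Because the page and the script both use UTF-8, the value comes back unchanged after URL decoding, and no HTML entity step is involved.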


Dr. Herong Yang, updated in 2006