Receiving Non-ASCII Characters from Input Forms
(Continued from previous part...)
Decoding HTML Entities
As you saw earlier in this chapter, if a page has "charset=iso-8859-1", Unicode characters are
received as HTML entities in $_REQUEST. How can we convert them back to Unicode characters?
I tried "urldecode()" and "rawurldecode()". They work fine on single-byte characters,
but they do not work with multi-byte characters: both functions only reverse URL
percent-encoding (%XX) and leave HTML entities of the form "&#NNNNN;" untouched.
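To make the point concrete, here is a minimal sketch of what happens when an entity string is passed through the two URL decoding functions. The sample value is an assumption for illustration: it is the Korean syllable U+C548 written as the numeric character reference a browser would typically submit from an iso-8859-1 page.

```php
<?php
// Hypothetical sample value: U+C548 as a numeric character reference
// (50504 decimal = 0xC548). This is an illustration, not captured input.
$received = '&#50504;';

// urldecode() and rawurldecode() only reverse percent-encoding,
// so the entity text passes through unchanged.
$afterUrlDecode    = urldecode($received);
$afterRawUrlDecode = rawurldecode($received);

echo ($afterUrlDecode === $received) ? "urldecode: unchanged\n" : "urldecode: changed\n";
echo ($afterRawUrlDecode === $received) ? "rawurldecode: unchanged\n" : "rawurldecode: changed\n";

// A percent-encoded single-byte character, by contrast, is decoded.
$decodedByte = urldecode('%41');
echo "urldecode('%41') = $decodedByte\n";
```

Both calls report the value as unchanged, which is why a different function is needed for entities.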
PHP has a special function "html_entity_decode()" to decode HTML entities back into characters, including multi-byte characters.
Here is the syntax of html_entity_decode():
html_entity_decode(string[, quote_style[, charset]])
where "string" is the HTML entity encoded string; "quote_style" specifies how quotes should be handled;
and "charset" specifies which character set to use. Supported character sets include: ISO-8859-1,
UTF-8, cp1251, GB2312, and Shift_JIS.
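As a quick illustration of the three parameters, the sketch below decodes a hypothetical entity string into UTF-8 bytes and shows how the "quote_style" argument affects the "&quot;" entity. The sample strings are assumptions chosen for the example, not values from the test script.

```php
<?php
// Hypothetical input: two Korean syllables (U+C548, U+B155) written as
// numeric character references.
$encoded = '&#50504;&#45397;';

// Decode into UTF-8. Each syllable becomes a 3-byte sequence.
$decoded = html_entity_decode($encoded, ENT_COMPAT, 'UTF-8');
$hex     = bin2hex($decoded);
echo "UTF-8 bytes: $hex\n";          // prints: UTF-8 bytes: ec9588eb8595

// quote_style: ENT_COMPAT converts &quot; to a double quote,
// ENT_NOQUOTES leaves it as is.
$withCompat   = html_entity_decode('&quot;hi&quot;', ENT_COMPAT, 'UTF-8');
$withNoQuotes = html_entity_decode('&quot;hi&quot;', ENT_NOQUOTES, 'UTF-8');
echo "ENT_COMPAT:   $withCompat\n";
echo "ENT_NOQUOTES: $withNoQuotes\n";
```

Passing the flags and charset explicitly is a deliberate choice here, since the defaults for both arguments have changed between PHP versions.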
To show you how to use html_entity_decode(), I modified InputIsoGet.php to InputIsoGetDecoded.php:
<?php # InputIsoGetDecoded.php
# Copyright (c) 2005 by Dr. Herong Yang, http://www.herongyang.com/
#
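#- Promoting CGI values to local variables
#  (import_request_variables() was removed in PHP 5.4,
#  so the request values are copied explicitly)
$r_English = isset($_REQUEST['English']) ? $_REQUEST['English'] : '';
$r_Spanish = isset($_REQUEST['Spanish']) ? $_REQUEST['Spanish'] : '';
$r_Korean = isset($_REQUEST['Korean']) ? $_REQUEST['Korean'] : '';
$r_ChineseUtf8 = isset($_REQUEST['ChineseUtf8']) ? $_REQUEST['ChineseUtf8'] : '';
$r_ChineseGb2312 = isset($_REQUEST['ChineseGb2312']) ? $_REQUEST['ChineseGb2312'] : '';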
#- Generating HTML document
print("<html>");
print('<meta http-equiv="Content-Type"'
   .' content="text/html; charset=utf-8"/>');
print("<body>\n");
print("<form action=InputIsoGetDecoded.php method=get>");
print("English ASCII: <input name=English"
   ." value='$r_English' size=16><br>\n");
print("Spanish UTF-8: <input name=Spanish"
   ." value='$r_Spanish' size=16><br>\n");
print("Korean UTF-8: <input name=Korean"
   ." value='$r_Korean' size=16><br>\n");
print("Chinese UTF-8: <input name=ChineseUtf8"
   ." value='$r_ChineseUtf8' size=16><br>\n");
print("Chinese GB2312: <input name=ChineseGb2312"
   ." value='$r_ChineseGb2312' size=16><br>\n");
print("<input type=submit name=submit value=Submit>\n");
print("</form>\n");
#- Outputting input strings back to HTML document
print("<hr>");
print("<pre>");
print("Input strings before decoding:\n");
foreach ($_GET as $k => $v) {
   print "$k = ($v)\n";
}
print("</pre>");
#- Outputting input strings back to HTML document - decoded
print("<hr>");
print("<pre>");
print("Input strings after decoding:\n");
foreach ($_GET as $k => $v) {
   print("$k = (".html_entity_decode($v,ENT_COMPAT,"UTF-8").")\n");
}
print("</pre>");
print("</body>");
print("</html>");
#- Dumping input strings to a file
$file = fopen("\\temp\\InputIsoGet.txt", 'ab');
$str = "------\n";
fwrite($file, $str, strlen($str));
if (array_key_exists('QUERY_STRING',$_SERVER)) {
   $str = $_SERVER['QUERY_STRING'];
} else {
   # Use an empty string so that strlen() always receives a string
   $str = '';
}
fwrite($file, $str, strlen($str));
$str = "------\n";
fwrite($file, $str, strlen($str));
foreach ($_REQUEST as $k => $v) {
   $str = "$k = ($v)\n";
   fwrite($file, $str, strlen($str));
}
fclose($file);
?>
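The decoding loop in the script handles one flat array of strings. If you want to reuse the same idea elsewhere, it could be wrapped in a small helper like the sketch below. Note that decode_entities_deep() is a hypothetical name introduced here for illustration, not a PHP built-in, and the sample array is made up; the helper also descends into nested arrays, which the original script does not need.

```php
<?php
// decode_entities_deep() is a hypothetical helper, not part of PHP.
// It applies html_entity_decode() to a string, or to every string
// inside a (possibly nested) array of request values.
function decode_entities_deep($value, $charset = 'UTF-8')
{
    if (is_array($value)) {
        $out = array();
        foreach ($value as $k => $v) {
            $out[$k] = decode_entities_deep($v, $charset);
        }
        return $out;
    }
    return html_entity_decode((string) $value, ENT_COMPAT, $charset);
}

// Made-up sample standing in for $_GET.
$sample = array(
    'English' => 'Hello',
    'Korean'  => '&#50504;&#45397;',
);
$decodedSample = decode_entities_deep($sample);

foreach ($decodedSample as $k => $v) {
    echo "$k = (" . bin2hex($v) . ")\n";
}
```

In the test script, a call such as decode_entities_deep($_GET) would then replace the per-value call inside the second foreach loop.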
(Continued on next part...)