Managing Non ASCII Character Strings
(Continued from previous part...)
For approach #1, you need to turn off HTTP input and output encoding conversion with these php.ini settings:
mbstring.language = Neutral
mbstring.internal_encoding = UTF-8
mbstring.http_input = pass
mbstring.http_output = pass
mbstring.encoding_translation = Off
While writing your script, you must always remember that you are dealing with UTF-8 encoded strings.
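As a small illustration of why this matters, the sketch below compares the byte-oriented string functions with their mbstring counterparts on a UTF-8 string. It assumes the mbstring extension is loaded; the sample string and variable names are only for illustration.

```php
<?php
// Minimal sketch; assumes the mbstring extension is loaded.
mb_internal_encoding("UTF-8");

// Two Chinese characters (U+4E2D U+6587) written as UTF-8 bytes,
// so the example does not depend on how this file is saved.
$text = "\xE4\xB8\xAD\xE6\x96\x87";

$byteCount = strlen($text);      // counts bytes
$charCount = mb_strlen($text);   // counts characters in the internal encoding

// substr() cuts at byte offsets and can split a character;
// mb_substr() cuts at character offsets.
$firstByte = substr($text, 0, 1);
$firstChar = mb_substr($text, 0, 1);

print("bytes: $byteCount, characters: $charCount\n");
print("substr: " . bin2hex($firstByte) . ", mb_substr: " . bin2hex($firstChar) . "\n");
```

Each of these characters takes 3 bytes in UTF-8, so the byte count and the character count differ, and substr() returns only the first byte of the first character.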
Approach #2 is useful if you want your Web page to be GB2312 encoded while using UTF-8 as your script
internal encoding, and you want your script to control the HTTP input and output conversion process.
Because the script performs the conversion itself, the php.ini settings are the same as for approach #1:
mbstring.language = Neutral
mbstring.internal_encoding = UTF-8
mbstring.http_input = pass
mbstring.http_output = pass
mbstring.encoding_translation = Off
Approach #3 is useful if you want your Web page to be GB2312 encoded while using UTF-8 as your script
internal encoding, and you trust the PHP engine to do HTTP input and output encoding conversion.
Here are the php.ini settings:
mbstring.language = Neutral
mbstring.internal_encoding = UTF-8
mbstring.http_input = GB2312
mbstring.http_output = GB2312
mbstring.encoding_translation = On
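To check which of these configurations is actually in effect, a script can read the settings back at run time. The following is a minimal sketch: the function name report_mbstring_settings is hypothetical, and it relies only on the standard ini_get() function.

```php
<?php
// Hypothetical helper: collects the five mbstring settings listed above.
function report_mbstring_settings() {
    $names = array(
        'mbstring.language',
        'mbstring.internal_encoding',
        'mbstring.http_input',
        'mbstring.http_output',
        'mbstring.encoding_translation',
    );
    $settings = array();
    foreach ($names as $name) {
        // ini_get() returns false when the directive is unknown,
        // for example when the mbstring extension is not loaded.
        $settings[$name] = ini_get($name);
    }
    return $settings;
}

foreach (report_mbstring_settings() as $name => $value) {
    print("$name = " . var_export($value, true) . "\n");
}
```

The printed values depend on your php.ini, so no fixed output is shown here.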
Since approach #2 is more challenging than the others, I wrote the following script to give you
some ideas:
<?php # MbStringHttp.php
# Copyright (c) 2006 by Dr. Herong Yang, http://www.herongyang.com/
#
mb_internal_encoding("UTF-8");
#- Taking care of HTTP input conversion
$myRequest['English'] = "";
$myRequest['ChineseUtf8'] = "";
$myRequest['ChineseGb2312'] = "";
foreach ($_REQUEST as $k => $v) {
   $myRequest[$k] = mb_convert_encoding($v, "UTF-8", "GB2312");
}
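#- Escaping the values so that quotes in the input cannot break the
#- value='...' attributes of the form below
$r_English = htmlspecialchars($myRequest['English'], ENT_QUOTES, "UTF-8");
$r_ChineseUtf8 = htmlspecialchars($myRequest['ChineseUtf8'], ENT_QUOTES, "UTF-8");
$r_ChineseGb2312 = htmlspecialchars($myRequest['ChineseGb2312'], ENT_QUOTES, "UTF-8");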
#- Taking care of HTTP output conversion
mb_http_output("GB2312");
ob_start("mb_output_handler");
#- Generating HTML document
print("<html>");
print('<meta http-equiv="Content-Type"'
.' content="text/html; charset=gb2312"/>');
print("<body>\n");
print("<form action=MbStringHttp.php method=get>");
print("English ASCII: <input name=English"
." value='$r_English' size=16><br>\n");
print("Chinese UTF-8: <input name=ChineseUtf8"
." value='$r_ChineseUtf8' size=16><br>\n");
print("Chinese GB2312: <input name=ChineseGb2312"
." value='$r_ChineseGb2312' size=16><br>\n");
print("<input type=submit name=submit value=Submit>\n");
print("</form>\n");
#- Outputting input strings back to HTML document
print("<hr>");
print("<pre>");
print("{$myRequest['English']}\n");
print("{$myRequest['ChineseUtf8']}\n");
print("{$myRequest['ChineseGb2312']}\n");
print("</pre>");
print("</body>");
print("</html>");
#- Dumping input strings to a file
$file = fopen("\\temp\\MbStringHttp.txt", 'ab');
$str = "--- Query String ---\n";
fwrite($file, $str, strlen($str));
if (array_key_exists('QUERY_STRING',$_SERVER)) {
   $str = $_SERVER['QUERY_STRING'];
} else {
   # Empty string rather than NULL, so that strlen() and fwrite()
   # always receive a string
   $str = "";
}
fwrite($file, $str, strlen($str));
$str = "--- Raw request input ---\n";
fwrite($file, $str, strlen($str));
foreach ($_REQUEST as $k => $v) {
   $str = "$k = ($v)\n";
   fwrite($file, $str, strlen($str));
}
$str = "--- Converted request input ---\n";
fwrite($file, $str, strlen($str));
foreach ($myRequest as $k => $v) {
   $str = "$k = ($v)\n";
   fwrite($file, $str, strlen($str));
}
fclose($file);
?>
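The conversion step at the heart of this script can also be tried on its own from the command line. The sketch below is not part of the original script; it assumes the mbstring extension is loaded and uses a hard-coded GB2312 byte string in place of real HTTP input.

```php
<?php
// Minimal sketch; assumes the mbstring extension is loaded.
// GB2312 bytes for the two characters U+4E2D U+6587, written as hex
// escapes so that this file's own encoding does not matter.
$gb = "\xD6\xD0\xCE\xC4";

// Same call as in the input loop: (string, to-encoding, from-encoding).
$utf8 = mb_convert_encoding($gb, "UTF-8", "GB2312");

// The reverse direction, roughly what mb_output_handler applies to the
// page body when the internal encoding is UTF-8 and the output is GB2312.
$back = mb_convert_encoding($utf8, "GB2312", "UTF-8");

print("GB2312 bytes: " . bin2hex($gb) . "\n");
print("UTF-8 bytes:  " . bin2hex($utf8) . "\n");
print(($back === $gb ? "round trip OK" : "round trip FAILED") . "\n");
```

The two characters take 4 bytes in GB2312 and 6 bytes in UTF-8, which is an easy way to confirm from the hex dump that a conversion really took place.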
(Continued on next part...)