This section describes how to display a Web form and process form input data in UTF-8.
My next test was to enter UTF-8 characters as Web form input.
I wrote a new test PHP script with some interesting features:
An HTML <meta> tag in the page header declares charset=utf-8,
so the browser treats the Web page as UTF-8 encoded.
The default input text is a French word in UTF-8 encoding.
To avoid any encoding conversion of the script file, I wrote the UTF-8 encoded bytes
as HTML numeric entities in hex. Note that each accented French character (é) is encoded in two bytes.
The text received from the $_REQUEST array is displayed back on the returned Web page
as characters. It is also displayed as hex values, for comparison with the hex values
of the default text.
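As a quick check of the two-byte claim above, here is a minimal sketch that builds the French word from explicit byte escapes and prints its hex form. The variable names are illustrative and not part of the test script; the byte escapes are used so the result does not depend on how the source file is saved.

```php
<?php
// "Télévision" built from explicit UTF-8 bytes: each é is the pair C3 A9.
$word = "T\xC3\xA9l\xC3\xA9vision";

// bin2hex() works on raw bytes, so it shows the UTF-8 encoding directly.
$hex = strtoupper(bin2hex($word));

print('Text in HEX = ' . $hex . "\n");
// strlen() counts bytes, not characters: 10 characters, 12 bytes.
print('Byte length = ' . strlen($word) . "\n");
```

The hex string printed here matches the $input_hex value used in the script below.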
<?php #Web-Form-Input-UTF8.php
# Copyright (c) 2007 by Dr. Herong Yang, http://www.herongyang.com/
#
print('<html><head>');
print('<meta http-equiv="Content-Type"'.
' content="text/html; charset=utf-8"/>');
print('</head><body>'."\n");
# Default input text
$input =
   'T&#xC3;&#xA9;l&#xC3;&#xA9;vis'
   .'ion';
$input_hex = '54C3A96CC3A9766973696F6E';
# Form reply determination
$reply = isset($_REQUEST["Submit"]);
# Process form input data
if ($reply) {
   if (isset($_REQUEST["Input"])) {
      $input = $_REQUEST["Input"];
   }
}
# Display form
print('<form>');
print('<input type="Text" size="40" maxlength="64"'
. ' name="Input" value="'.$input.'"/><br/>');
print('<input type="Submit" name="Submit" value="Submit"/>');
print('</form>'."\n");
# Display reply
if ($reply) {
   print('<pre>'."\n");
   print('Content-Type:'."\n");
   print(' text/html; charset=utf-8'."\n");
   print('You have submitted:'."\n");
   print(' Text = '.$input."\n");
   print(' Text in HEX = '.strtoupper(bin2hex($input))."\n");
   print(' Default HEX = '.$input_hex."\n");
   print('</pre>'."\n");
}
print('</body></html>');
?>
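The script above writes the submitted text straight into the value attribute, so a double quote or angle bracket in the input would break the form markup. The following is a minimal sketch of one way to escape it while leaving UTF-8 bytes untouched. The helper name form_value() is hypothetical and not part of the original script.

```php
<?php
// Hypothetical helper (not in the original script): escape text for use
// inside a double-quoted HTML attribute, treating the bytes as UTF-8.
function form_value(string $text): string {
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// Sample submitted text: UTF-8 bytes plus characters that are special in HTML.
$submitted = "T\xC3\xA9l\xC3\xA9vision \"<b>\"";

print('<input type="Text" name="Input" value="'
    . form_value($submitted) . '"/>' . "\n");
```

If this idea were applied to the test script, only the text received from $_REQUEST should be escaped. The default text is written as HTML entities on purpose, and escaping it would turn each & into &amp;.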
After moving this PHP script file to the Apache server document directory, I tested it with Internet Explorer (IE)
using this URL: http://localhost/Web-Form-Input-UTF8.php. I saw a Web page with a form containing the
suggested input text and a submit button.
However, the French characters in the default text, encoded in UTF-8, were not displayed correctly.
After clicking the submit button, I saw a returned Web page with the same form and a reply section.
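Since the default text was not displayed correctly, the PHP script received incorrect UTF-8 byte sequences.
The likely cause is that an HTML numeric entity such as &#xC3; names the Unicode character U+00C3, not the byte 0xC3,
so the browser showed and submitted two characters (Ã and ©) in place of each é.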
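The mismatch can be reproduced without a browser. The sketch below is a simulation, not the output of the original test: it assumes that PHP's html_entity_decode() resolves numeric entities the same way the browser's HTML parser did, and the variable names are illustrative.

```php
<?php
// The default text as sent to the browser: each UTF-8 byte of "é" is
// written as a separate numeric entity.
$html_value = 'T&#xC3;&#xA9;l&#xC3;&#xA9;vision';

// Resolve the entities: each one becomes a Unicode character (U+00C3,
// U+00A9), which is then stored as its own two-byte UTF-8 sequence.
$decoded = html_entity_decode($html_value, ENT_QUOTES | ENT_HTML5, 'UTF-8');

$decoded_hex  = strtoupper(bin2hex($decoded));
$intended_hex = strtoupper(bin2hex("T\xC3\xA9l\xC3\xA9vision"));

print('Decoded HEX  = ' . $decoded_hex . "\n");
print('Intended HEX = ' . $intended_hex . "\n");
```

In the decoded hex string, each intended C3A9 pair appears as C383C2A9, which is the UTF-8 encoding of the two separate characters Ã and ©.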
It is interesting to note that the returned Web page has a URL that contains the input text
in its query string, because the form uses the default GET method.
The special characters appear there as percent-encoded hex values of their UTF-8 byte sequences.
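To illustrate how UTF-8 bytes appear in a query string, here is a minimal sketch using the correctly encoded word rather than the text actually submitted in the test. The field names match the form above, and the variable names are illustrative.

```php
<?php
// The intended word, as UTF-8 bytes.
$text = "T\xC3\xA9l\xC3\xA9vision";

// Build a query string like the one a GET form submission produces.
// Each non-ASCII byte becomes a %XX escape.
$query = http_build_query(array('Input' => $text, 'Submit' => 'Submit'), '', '&');
print($query . "\n");

// Decoding the query string gives back the original bytes.
parse_str($query, $fields);
print(strtoupper(bin2hex($fields['Input'])) . "\n");
```

Here each é contributes %C3%A9 to the query string, one escape per UTF-8 byte.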