Chinese Web Sites Using PHP - v2.24, by Herong Yang
Processing Web Form Input in Latin1 Encoding Error
This section provides a tutorial example demonstrating a Web form whose default text is in Latin1 encoding. The browser converts it to UTF-8 encoding, which results in the form data being submitted in UTF-8 encoding.
The next test I did was to try to enter Latin1 characters as Web form input. I wrote a new test PHP script with some interesting features:
<?php #- Web-Form-Input-Latin1.php
#- Copyright (c) 2005 HerongYang.com. All Rights Reserved.
#
  print('<html>');
  print('<body>'."\n");

# Default input text, written as HTML entities
  $input = 'T&eacute;l&eacute;vision';
  $input_hex = '54E96CE9766973696F6E';

# Form reply determination
  $reply = isset($_REQUEST["Submit"]);

# Process form input data
  if ($reply) {
    if (isset($_REQUEST["Input"])) {
      $input = $_REQUEST["Input"];
    }
  }

# Display form
  print('<form>');
  print('<input type="Text" size="40" maxlength="64"'
    . ' name="Input" value="'.$input.'"/><br/>');
  print('<input type="Submit" name="Submit" value="Submit"/>');
  print('</form>'."\n");

# Display reply
  if ($reply) {
    print('<pre>'."\n");
    print('You have submitted:'."\n");
    print(' Text = '.$input."\n");
    print(' Text in HEX = '.strtoupper(bin2hex($input))."\n");
    print(' Default HEX = '.$input_hex."\n");
    print('</pre>'."\n");
  }

  print('</body></html>');
?>
After moving this PHP script file to the Apache server document directory, I tested it with Internet Explorer (IE) using this URL: http://localhost/Web-Form-Input-Latin1.php. I saw a Web page with a form that had the suggested input text and a submit button.
The suggested Latin1 input characters were displayed correctly, even though they were generated by my script as HTML entities.
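As a quick check on the default value, the following sketch decodes the entity form of the text into Latin1 bytes and prints them in hex. It assumes the default text was written as T&eacute;l&eacute;vision. The variable names are illustrative and not part of the original script.

```php
<?php
# Entity form of the default text (assumed spelling of the entities)
  $entities = 'T&eacute;l&eacute;vision';

# Decode the entities into ISO-8859-1 (Latin1) bytes
  $latin1 = html_entity_decode($entities, ENT_QUOTES, 'ISO-8859-1');

# Each é becomes the single byte E9
  $latin1_hex = strtoupper(bin2hex($latin1));
  print($latin1_hex."\n");   # prints 54E96CE9766973696F6E
```

The printed value matches the hard-coded $input_hex in the test script, which is what a pure Latin1 round trip would be expected to return.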
After clicking the submit button, I saw a returned Web page with the same form and a reply section. However, the hex value of the French text I received from the form was not the same as the one I created as the default:
  Text = Télévision
  Text in HEX = 54C3A96CC3A9766973696F6E
  Default HEX = 54E96CE9766973696F6E
It looks like my French text started in Latin1 encoding and ended up in UTF-8 encoding: each é was expected as the single Latin1 byte E9, but came back as the two UTF-8 bytes C3 A9. For some reason, the browser was able to manage this difference.
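The relationship between the two hex strings can be reproduced outside the browser. The sketch below re-encodes the Latin1 bytes of the default text as UTF-8 by hand. The helper latin1_to_utf8() is a hypothetical name introduced only for this illustration, and this is not a claim about how the browser does the conversion internally.

```php
<?php
# Hypothetical helper: re-encode an ISO-8859-1 (Latin1) byte string as UTF-8
  function latin1_to_utf8($latin1) {
    $utf8 = '';
    foreach (str_split($latin1) as $char) {
      $byte = ord($char);
      if ($byte < 0x80) {
        # ASCII bytes are identical in both encodings
        $utf8 .= $char;
      } else {
        # Bytes 0x80-0xFF become a two-byte UTF-8 sequence
        $utf8 .= chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F));
      }
    }
    return $utf8;
  }

# Start from the default hex value used in the test script
  $latin1 = hex2bin('54E96CE9766973696F6E');
  $utf8_hex = strtoupper(bin2hex(latin1_to_utf8($latin1)));
  print($utf8_hex."\n");   # prints 54C3A96CC3A9766973696F6E
```

The output is the same as the "Text in HEX" value in the reply, which supports the reading that the submitted text was UTF-8 encoded.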
It is interesting to note that the returned Web page has a URL that contains the input text inside the query string. All characters in the input text are ASCII characters except the two é characters, whose UTF-8 bytes are presented as percent-encoded hex values (%C3%A9) in the URL:
http://localhost/Web-Form-Input-latin1.php?Input=T%C3%A9l%C3%A9vision&Submit=Submit
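To see how the query string maps back to the bytes reported by the script, here is a minimal sketch that parses the same query string with parse_str() and prints the decoded value in hex. The variable names are illustrative. PHP performs an equivalent decoding when it fills $_REQUEST.

```php
<?php
# Query string copied from the URL of the returned page
  $query = 'Input=T%C3%A9l%C3%A9vision&Submit=Submit';

# parse_str() splits the fields and decodes the percent-encoded bytes
  parse_str($query, $fields);

# Hex dump of the decoded input text
  $decoded_hex = strtoupper(bin2hex($fields['Input']));
  print($decoded_hex."\n");   # prints 54C3A96CC3A9766973696F6E

# Encoding the bytes again gives back the form seen in the URL
  $encoded = rawurlencode($fields['Input']);
  print($encoded."\n");       # prints T%C3%A9l%C3%A9vision
```

Each %C3%A9 pair in the URL is one é in UTF-8, so the query string already carries the UTF-8 bytes before the PHP script sees them.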
See the next tutorial on how to troubleshoot and fix the issue.