This section describes an error case where a UTF-8 encoded page contains GB18030 characters.
The most common errors on Chinese Web pages occur when some characters use an encoding different from the encoding declared for the page.
For example, a Web page is set with charset=utf-8, but some characters are entered in GB18030 encoding.
In this case, those GB18030 characters will not be displayed correctly, because the browser decodes their bytes as UTF-8.
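The mismatch can be reproduced outside a browser. The following is a minimal Python sketch, not part of the original test; the variable names are my own, and the lenient decode only approximates what a browser does with bytes that are not valid UTF-8.

```python
# Sketch: the same six characters encoded two ways, then decoded as UTF-8.
text = "简体中文网页"

utf8_bytes = text.encode("utf-8")    # 3 bytes per character for this text
gb_bytes = text.encode("gb18030")    # 2 bytes per character for this text

# Decoding the UTF-8 bytes as UTF-8 round-trips cleanly.
ok = utf8_bytes.decode("utf-8")

# Decoding the GB18030 bytes as UTF-8 fails in strict mode.
try:
    gb_bytes.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# In lenient mode, invalid bytes become U+FFFD replacement characters,
# which is roughly what a browser shows for them.
garbled = gb_bytes.decode("utf-8", errors="replace")

print(len(utf8_bytes), len(gb_bytes), strict_failed)
print(garbled)
```

The byte values themselves are fine; they are simply read with the wrong decoder, so the original characters cannot be recovered from the displayed text.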
To demonstrate this problem, I created the following test Web page.
The page is set with charset=utf-8 and most Chinese characters are entered in UTF-8 encoding.
But some Chinese characters are entered in GB18030 encoding:
<html>
<!-- Hello-UTF-8-Error.html
Copyright (c) 2007 by Dr. Herong Yang, http://www.herongyang.com/
-->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<body>
<b>Chinese characters in UTF-8</b><br/>
Simplified characters: 简体中文网页<br/>
Traditional characters: 繁體中文網頁<br/>
<br/>
<b>Error: GB18030 characters included in a UTF-8 page</b><br/>
Simplified characters: ??????<br/>
</body>
</html>
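A file with mixed encodings is hard to produce in a normal text editor, because the editor saves everything in a single encoding. One possible way to build such a file is sketched below in Python. This is an assumption about how the test could be reproduced, not how the original file was created; the markup is simplified, and the helper names `build_mixed_page` and `first_invalid_utf8_offset` are hypothetical.

```python
# Sketch: build a page that declares charset=utf-8 but contains one line
# of raw GB18030 bytes. Simplified markup, not the original test file.
def build_mixed_page() -> bytes:
    head = (
        '<html>\n'
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>\n'
        '<body>\n'
        'Simplified characters: 简体中文网页<br/>\n'
    ).encode("utf-8")
    # Same characters, but encoded as GB18030 instead of UTF-8.
    bad = "Simplified characters: 简体中文网页<br/>\n".encode("gb18030")
    tail = "</body>\n</html>\n".encode("utf-8")
    return head + bad + tail


def first_invalid_utf8_offset(data: bytes) -> int:
    """Return the byte offset of the first invalid UTF-8 sequence, or -1."""
    try:
        data.decode("utf-8")
        return -1
    except UnicodeDecodeError as e:
        return e.start


page = build_mixed_page()

if __name__ == "__main__":
    # Write the bytes unchanged; binary mode avoids any re-encoding.
    with open("Hello-UTF-8-Error.html", "wb") as f:
        f.write(page)
    print("first invalid byte at offset", first_invalid_utf8_offset(page))
```

The second helper can also be used as a quick check on an existing page: a result other than -1 means the file contains bytes that are not valid UTF-8, which is a hint that another encoding was mixed in.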
As expected, this Web page, http://localhost/Hello-UTF-8-Error.html, does not
display those GB18030 characters correctly: