Building Chinese Web Sites using PHP - Version 2.13, by Dr. Herong Yang

UTF-8 Encoding Pages with Big5 Characters

This section describes an error case where a UTF-8 encoded page contains Big5 character strings.

The most common errors on Chinese Web pages generated from PHP scripts are caused by character strings that use an encoding different from the page's declared encoding. For example, a PHP script sets the output Web page to charset=utf-8, but some character strings are entered in Big5 encoding. In this case, the browser decodes the Big5 bytes as UTF-8, so those characters are not displayed correctly.
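To make the mismatch concrete, here is a minimal sketch that checks whether a byte string is well-formed UTF-8. The helper name is_valid_utf8() and the two-character sample "中文" are my own assumptions for illustration; they are not part of the test script in this section. The sketch relies on the PCRE /u modifier, which makes preg_match() return false when the subject is not valid UTF-8.

```php
<?php
// Hypothetical helper (not from the original script):
// returns true if $s is a well-formed UTF-8 byte sequence.
function is_valid_utf8(string $s): bool {
    // With the /u modifier, PCRE rejects subjects that are not valid UTF-8,
    // and preg_match() returns false instead of 1.
    return preg_match('//u', $s) === 1;
}

// Sample data (assumption): the two characters "中文" in each encoding.
$utf8 = "\xE4\xB8\xAD\xE6\x96\x87"; // "中文" encoded in UTF-8
$big5 = "\xA4\xA4\xA4\xE5";         // "中文" encoded in Big5

var_dump(is_valid_utf8($utf8)); // bool(true)
var_dump(is_valid_utf8($big5)); // bool(false)
```

The Big5 bytes fail the check because 0xA4 cannot start a UTF-8 sequence, which is why a browser told to use charset=utf-8 cannot decode them.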

To demonstrate this problem, I created the following test PHP script. The output Web page is set to charset=utf-8 and most Chinese characters are entered in UTF-8 encoding, but some Chinese characters are entered in Big5 encoding.

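<?php #- String-UTF-8-Error.php
  # UTF-8 encoded strings (simplified and traditional Chinese)
  $help_simplified = '这是一份非常简单的说明书…';
  $help_tradition = '這是一份非常簡單的說明書…';
  # Big5 encoded string (raw Big5 bytes, rendered here as '?')
  $help_big5 = '?????????????';
  # Declare the page encoding as UTF-8
  print('<meta http-equiv="Content-Type"'.
    ' content="text/html; charset=utf-8"/>');
  print('<b>Chinese string in UTF-8 in PHP</b><br/>');
  print($help_simplified.'<br/>');
  print($help_tradition.'<br/>');
  print('<b>Big5 string included in a UTF-8 page</b><br/>');
  print($help_big5.'<br/>');
?>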

As expected, this Web page, http://localhost/String-UTF-8-Error.html, does not display those Big5 characters correctly:
[Figure: Chinese Web Page using UTF-8 with Big5 Characters]
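One possible remedy is to convert any Big5 string to UTF-8 before printing it on a UTF-8 page. The sketch below is not part of the test script above: the helper name big5_to_utf8() and the sample bytes for "中文" are assumptions for illustration, and it assumes the iconv extension is enabled and the underlying iconv library supports the BIG5 name.

```php
<?php
// Hypothetical helper (not from the original script):
// converts a Big5 byte string to UTF-8 using the iconv extension.
function big5_to_utf8(string $bytes): string {
    // iconv() returns false (and raises a notice) on illegal input bytes.
    $converted = iconv('BIG5', 'UTF-8', $bytes);
    if ($converted === false) {
        throw new InvalidArgumentException('Input is not valid Big5');
    }
    return $converted;
}

// Sample data (assumption): "中文" entered as raw Big5 bytes.
$help_big5 = "\xA4\xA4\xA4\xE5";

print('<meta http-equiv="Content-Type"'.
  ' content="text/html; charset=utf-8"/>');
print('<b>Big5 string converted to UTF-8</b><br/>');
print(big5_to_utf8($help_big5).'<br/>');
```

With the conversion in place, every string sent to the browser matches the charset=utf-8 setting. A simpler alternative, where possible, is to save the whole script file in UTF-8 so that no conversion is needed.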

Last update: 2015.

