This section provides a tutorial example on how to detect the system default encoding with a PHP script that displays the same Chinese text in 4 different encodings.

If you are using a new system to process text files containing Chinese characters, the first thing you need to do is figure out the default encoding of its user interface. Knowing the system default encoding will help you detect the encoding used in a Chinese text file, convert it to another encoding, and fix encoding issues.

One way to figure this out is to use the following simple PHP script, Chinese-Encoding-Test.php.

  print("<h4>Chinese text in different encodings</h4>\n");
  print("Unicode: ".$unicode."$\n");
  print("UTF-8: ".$utf8."$\n");
  print("GB18030: ".$gb18030."$\n");
  print("Big5: ".$big5."$\n");
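The excerpt above does not show how the four variables are defined. As a minimal, self-contained sketch (not the original script), the variables could be filled in as below. The sample text (the two characters U+4E2D U+6587), the reading of "Unicode" as UTF-16 big-endian, and the use of the mbstring extension are all assumptions made for illustration.

```php
<?php
// Hypothetical sample text: U+4E2D U+6587, written as UTF-8 byte escapes so
// the result does not depend on how this source file itself is saved.
$utf8 = "\xE4\xB8\xAD\xE6\x96\x87";

// Derive the other byte sequences with mbstring (assumed to be enabled).
// "Unicode" is assumed here to mean UTF-16 big-endian.
$unicode = mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8');
$gb18030 = mb_convert_encoding($utf8, 'GB18030', 'UTF-8');
$big5    = mb_convert_encoding($utf8, 'BIG-5', 'UTF-8');

// Print the raw bytes; the terminal or browser decides how to render them.
print("Unicode: ".$unicode."\n");
print("UTF-8: ".$utf8."\n");
print("GB18030: ".$gb18030."\n");
print("Big5: ".$big5."\n");

// Hex dumps make the four byte sequences comparable however they display.
foreach (['Unicode' => $unicode, 'UTF-8' => $utf8,
          'GB18030' => $gb18030, 'Big5' => $big5] as $name => $bytes) {
    print($name." bytes: ".bin2hex($bytes)."\n");
}
```

The hex dump is useful when every line shows junk characters, because it confirms that the script produced four different byte sequences and that the problem lies in how they are displayed.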

Notes on this test script:

When you run this test PHP script in your system's terminal (or command window), one of the encodings should display correct Chinese text, provided the console supports Chinese characters. The other encodings will display junk characters. The encoding that displays correctly is the default encoding of your system.
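To see why only one line comes out readable, it can help to check the byte sequences against a single encoding in code. The sketch below assumes a UTF-8 terminal, the mbstring extension, and the same hypothetical sample text (U+4E2D U+6587) hard-coded as bytes. It asks which of the four sequences are well-formed UTF-8.

```php
<?php
// Hypothetical sample: the characters U+4E2D U+6587 in four encodings.
$samples = [
    'Unicode' => "\x4E\x2D\x65\x87",          // assumed to be UTF-16BE
    'UTF-8'   => "\xE4\xB8\xAD\xE6\x96\x87",
    'GB18030' => "\xD6\xD0\xCE\xC4",
    'Big5'    => "\xA4\xA4\xA4\xE5",
];

// Check each byte sequence against the encoding the terminal is assumed
// to use. mb_check_encoding() returns true only for well-formed input.
$terminalEncoding = 'UTF-8';
$results = [];
foreach ($samples as $name => $bytes) {
    $results[$name] = mb_check_encoding($bytes, $terminalEncoding);
    print($name.": ".($results[$name] ? "valid" : "invalid")
        ." as ".$terminalEncoding."\n");
}
```

Under these assumptions only the UTF-8 line is reported as valid, which matches what a UTF-8 terminal shows. The check is less decisive for GB18030 or Big5 terminals, since many byte sequences are well-formed in those encodings even when they decode to the wrong characters, so looking at the displayed text remains the more reliable test.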

For example, the default encoding of my system is UTF-8. So I see the following output on my terminal:

If you see all encodings displaying junk characters, then your system does not support Chinese characters.

You can also run this test PHP script on a Web server to test Web browser encodings. Here is what I see on my Web browser with GBK (GB18030) encoding:
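When testing through a Web server, one way to control which encoding the browser applies is to declare a charset in the Content-Type response header, rather than switching the encoding by hand in the browser. The sketch below is an assumption about how such a test page could be set up, not part of the original script; the helper name content_type_for() and the sample bytes are hypothetical.

```php
<?php
// Hypothetical helper: builds a Content-Type header line for a charset.
function content_type_for(string $charset): string {
    return "Content-Type: text/html; charset=".$charset;
}

$charset = 'GB18030';  // assumed choice, matching the browser example

// Headers must be sent before any output, so guard the call.
if (!headers_sent()) {
    header(content_type_for($charset));
}

// With this declaration, only bytes in the declared charset should
// render as Chinese text; other byte sequences would show as junk.
$gb18030 = "\xD6\xD0\xCE\xC4";  // hypothetical sample, U+4E2D U+6587
print("GB18030: ".$gb18030."\n");
```

Changing $charset to UTF-8 or Big5 and reloading the page would be expected to move the readable line to the matching encoding, provided the page prints the corresponding byte sequences.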

