Generate 8-Bit Encoding Tables

This section provides a tutorial example on how to generate 8-bit encoding tables with a PHP script.

Since 8-bit encodings play an important role in generating corrupted Chinese text, I decided to write a PHP script that generates the encoding table for a given encoding name.

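<?php
#- 8-Bit-Encoding-Table.php
#- Copyright (c) 2005 HerongYang.com. All Rights Reserved.

# The encoding name, for example "CP437", is required
if ($argc < 2) {
  print("Usage: php 8-Bit-Encoding-Table.php encoding_name\n");
  exit(1);
}
$encoding = $argv[1];

# Header row: the low hex digit of each byte
$table = "   0123456789abcdef\n";
for ($i=0; $i<16; $i++) {
  $line = "";
  for ($j=0; $j<16; $j++) {
    $code = dechex($i).dechex($j);
    # Rows 0 and 1 hold control characters: use 0x00 instead
    if ($i==0 || $i==1) $code = "00";
    $line .= $code;
  }
  # Row label is the high hex digit, followed by the 16 raw bytes
  $table .= dechex($i)."x ".hex2bin($line)."\n";
}

# Read the table as $encoding and convert it to UTF-8 for display
$encoded = iconv($encoding, "UTF-8//IGNORE", $table);
print($encoded);
?>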

This script, 8-Bit-Encoding-Table.php, uses a nested loop to build an 8-bit byte table of 16 rows by 16 columns. The first 2 rows (bytes 0x00 to 0x1F) are filled with 0x00 bytes, so that control characters such as line feed and escape do not disturb the output. The iconv() function is then used to generate the final encoding table, with "UTF-8" as the presentation encoding for my macOS computer.
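As a variation on the nested loop described above, the same byte table can be built with chr() instead of hex2bin(). This is only a sketch: encoding_table() is a hypothetical helper name, and showing a space for control bytes (rather than 0x00) is my assumption, not what the original script does.

```php
<?php
// Sketch only: encoding_table() is a hypothetical helper, not part of
// 8-Bit-Encoding-Table.php. It returns the table as a UTF-8 string,
// or false if iconv() cannot convert from $encoding.
function encoding_table($encoding) {
    $table = "   0123456789abcdef\n";
    for ($i = 0; $i < 16; $i++) {
        $row = "";
        for ($j = 0; $j < 16; $j++) {
            $byte = $i * 16 + $j;
            // Assumption: show a space for control bytes 0x00-0x1F and 0x7F.
            $row .= ($byte < 0x20 || $byte == 0x7f) ? " " : chr($byte);
        }
        $table .= dechex($i) . "x " . $row . "\n";
    }
    // Interpret the raw bytes as $encoding and re-encode them as UTF-8.
    // "@" hides the message iconv() raises for an unknown encoding name.
    return @iconv($encoding, "UTF-8//IGNORE", $table);
}

$table = encoding_table("ISO-8859-1");
print($table === false ? "Conversion failed\n" : $table);
```

Wrapping the logic in a function makes it easy to print several tables in one run and to handle a failed conversion explicitly.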

8-Bit-Encoding-Table.php produces the output table for any encoding supported by the "iconv" command. Run "iconv -l" to list the supported encoding names, as shown below:

herong$ iconv -l | more 
ANSI_X3.4-1968 ANSI_X3.4-1986 ASCII CP367 IBM367 ISO-IR-6 ISO646-US ...
UTF-8 UTF8
UTF-8-MAC UTF8-MAC
ISO-10646-UCS-2 UCS-2 CSUNICODE
UCS-2BE UNICODE-1-1 UNICODEBIG CSUNICODE11
UCS-2LE UNICODELITTLE
ISO-10646-UCS-4 UCS-4 CSUCS4
UCS-4BE
UCS-4LE
UTF-16
UTF-16BE
UTF-16LE
UTF-32
UNICODE-1-1-UTF-7 UTF-7 CSUNICODE11UTF7
UCS-2-INTERNAL
UCS-2-SWAPPED
UCS-4-INTERNAL
UCS-4-SWAPPED
C99
CP819 IBM819 ISO-8859-1 ISO-IR-100 ISO8859-1 ISO_8859-1 ISO_8859-1:1987 ...
ECMA-118 ELOT_928 GREEK GREEK8 ISO-8859-7 ISO-IR-126 ISO8859-7 ...
ISO-8859-15 ISO-IR-203 ISO8859-15 ISO_8859-15 ISO_8859-15:1998 LATIN-9
CP1250 MS-EE WINDOWS-1250
CP1251 MS-CYRL WINDOWS-1251
CP1252 MS-ANSI WINDOWS-1252
CP1253 MS-GREEK WINDOWS-1253
CP1254 MS-TURK WINDOWS-1254
CP1255 MS-HEBR WINDOWS-1255
CP1256 MS-ARAB WINDOWS-1256
CP1257 WINBALTRIM WINDOWS-1257
CP1258 WINDOWS-1258
850 CP850 IBM850 CSPC850MULTILINGUAL
862 CP862 IBM862 CSPC862LATINHEBREW
866 CP866 IBM866 CSIBM866
...
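Since the list of supported names differs from system to system, it may help to check a name before generating its table. Here is a minimal sketch, assuming the name refers to an ASCII-compatible 8-bit encoding; is_supported_encoding() is a hypothetical helper, not a built-in PHP function.

```php
<?php
// Hypothetical helper: reports whether iconv() accepts $name as a
// source encoding on this system.
// Assumption: $name is an ASCII-compatible 8-bit encoding, so the
// single byte "A" is valid input for the probe.
function is_supported_encoding($name) {
    // "@" suppresses the message iconv() emits for an unknown name;
    // in that case iconv() returns false.
    return @iconv($name, "UTF-8", "A") !== false;
}

foreach (["ISO-8859-1", "NO-SUCH-ENCODING"] as $name) {
    $status = is_supported_encoding($name) ? "supported" : "not supported";
    print($name . ": " . $status . "\n");
}
```

The probe converts a single character instead of parsing the "iconv -l" output, because the PHP iconv extension may be linked against a different library than the command line tool.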

The picture below shows the encoding table for Extended ASCII or CP437 (IBM437), produced by running "php 8-Bit-Encoding-Table.php CP437":

8-Bit Encoding Table - Extended ASCII or CP437

The picture below shows the encoding table for ISO-8859-1 or Latin-1, produced by running "php 8-Bit-Encoding-Table.php ISO-8859-1":

8-Bit Encoding Table - ISO-8859-1 or Latin-1
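The two tables differ mostly in the upper half (0x80 to 0xFF), which is why the same byte can show up as different characters. The sketch below decodes one byte value under both encodings; decode_byte() is a hypothetical helper, and I assume that the local iconv library knows the name "CP437", which may not be true on every system.

```php
<?php
// Hypothetical helper: decode a single byte value under a given 8-bit
// encoding and return it as a UTF-8 string, or false on failure.
function decode_byte($byte, $encoding) {
    return @iconv($encoding, "UTF-8", chr($byte));
}

$byte = 0xE9;  // the same byte value, read under two encodings
foreach (["ISO-8859-1", "CP437"] as $encoding) {
    $char = decode_byte($byte, $encoding);
    // Assumption: an encoding unknown to the local iconv yields false.
    print($encoding . ": " . ($char === false ? "(not available)" : $char) . "\n");
}
```

In ISO-8859-1, byte 0xE9 is the letter "é", while CP437 assigns the same byte to the Greek letter "Θ".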
