Data Encodings - Herong's Tutorial Examples
Dr. Herong Yang, Version 5.10

UUEncode Algorithm

This section describes the UUEncode algorithm with some simple encoding examples.

UUEncode (Unix-to-Unix Encoding) was designed to address the problem of sending binary data files by email. It converts any data file into a text file that contains only printable characters.

UUEncode was very useful for email users in the early days, before email attachments (the MIME protocol) were available. For example, if I wanted to send a text message in Chinese GB encoding to a friend, I could not include the GB codes directly in the email body. I had to run uuencode (the UUEncode encoding command) to convert the GB codes into printable characters, then copy those characters into the email body. When my friend received the email, he or she had to run uudecode (the UUEncode decoding command) to convert the printable characters back to the original GB codes and read the message in Chinese.

The encoding process is to:

  • Divide the input byte stream into blocks of 3 bytes.
  • Divide the 24 bits of a 3-byte block into 4 groups of 6 bits.
  • Treat each 6-bit group as a value from 0 to 63 and add 32 (\x20), so the resulting byte represents a printable ASCII character in the range 32 to 95.
  • If the last 3-byte block has only 1 byte of input data, pad it with 2 bytes of value 1 (\x01\x01).
  • If the last 3-byte block has only 2 bytes of input data, pad it with 1 byte of value 1 (\x01).
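The steps above can be sketched in a few lines of Python. This is only an illustration of the process as described in this section, not any particular uuencode implementation. The names encode_block and PAD are made up for this example, and the \x01 pad byte follows the rule stated above (other implementations may pad differently).

```python
# Hypothetical helper names; the pad value follows the rule in this section.
PAD = b"\x01"


def encode_block(block: bytes) -> str:
    """Encode a block of 1 to 3 input bytes as 4 printable characters."""
    if not 1 <= len(block) <= 3:
        raise ValueError("block must contain 1 to 3 bytes")
    # Pad a short final block up to 3 bytes.
    padded = block + PAD * (3 - len(block))
    # View the 3 bytes as one 24-bit integer.
    bits = int.from_bytes(padded, "big")
    # Split into 4 groups of 6 bits, most significant group first.
    groups = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # Add 32 to each group to move it into the printable ASCII range.
    return "".join(chr(group + 32) for group in groups)


print(encode_block(b"A"))
print(encode_block(b"AB"))
print(encode_block(b"ABC"))
```

Running the sketch on "A", "AB" and "ABC" gives a quick way to compare against the worked examples in this section.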

The printable characters used by UUEncode encoding are listed in the following table:

32     33 !   34 "   35 #   36 $   37 %   38 &   39 '
40 (   41 )   42 *   43 +   44 ,   45 -   46 .   47 /
48 0   49 1   50 2   51 3   52 4   53 5   54 6   55 7
56 8   57 9   58 :   59 ;   60 <   61 =   62 >   63 ?
64 @   65 A   66 B   67 C   68 D   69 E   70 F   71 G
72 H   73 I   74 J   75 K   76 L   77 M   78 N   79 O
80 P   81 Q   82 R   83 S   84 T   85 U   86 V   87 W
88 X   89 Y   90 Z   91 [   92 \   93 ]   94 ^   95 _
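The table can be regenerated from the "add 32" rule. The short Python sketch below assumes nothing beyond that rule; uu_char and alphabet are names chosen for this example only.

```python
def uu_char(value: int) -> str:
    """Map a 6-bit value (0 to 63) to its printable character."""
    if not 0 <= value <= 63:
        raise ValueError("value must fit in 6 bits")
    return chr(value + 32)


# All 64 output characters, in order of their 6-bit value.
alphabet = "".join(uu_char(v) for v in range(64))

# Print 8 entries per row as "ASCII-code character" pairs.
for row_start in range(0, 64, 8):
    row = range(row_start, row_start + 8)
    print("   ".join(f"{v + 32} {uu_char(v)}" for v in row))
```

Note that the 6-bit value 0 maps to ASCII 32, the space character, which is why the first table entry looks empty.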

Example 1: Input data, 1 byte, "A". Encoded output, 4 characters, "00$!"

Input Data          A
Input Bits   01000001
Padding      01000001 00000001 00000001
                   \      \      \
Bit Groups   010000 010000 000100 000001
Adding 32    100000 100000 100000 100000
             110000 110000 100100 100001
Output            0      0      $      !     

Example 2: Input data, 2 bytes, "AB". Encoded output, 4 characters, "04(!"

Input Data          A        B
Input Bits   01000001 01000010
Padding      01000001 01000010 00000001
                   \      \      \
Bit Groups   010000 010100 001000 000001
Adding 32    100000 100000 100000 100000
             110000 110100 101000 100001
Output            0      4      (      !

Example 3: Input data, 3 bytes, "ABC". Encoded output, 4 characters, "04)#"

Input Data          A        B        C
Input Bits   01000001 01000010 01000011
                   \      \      \
Bit Groups   010000 010100 001001 000011
Adding 32    100000 100000 100000 100000
             110000 110100 101001 100011
Output            0      4      )      #
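The three examples can also be checked in reverse. The sketch below is an assumed decoding counterpart, not taken from any uudecode source: it subtracts 32 from each character, rejoins the four 6-bit groups into 24 bits, and keeps only the number of bytes that were real input. The name decode_group and its count parameter are hypothetical.

```python
def decode_group(chars: str, count: int = 3) -> bytes:
    """Decode 4 printable characters back to `count` original bytes (1 to 3)."""
    if len(chars) != 4:
        raise ValueError("a group must contain exactly 4 characters")
    if not 1 <= count <= 3:
        raise ValueError("count must be 1, 2 or 3")
    bits = 0
    for ch in chars:
        value = ord(ch) - 32  # undo the "add 32" step
        if not 0 <= value <= 63:
            raise ValueError(f"character {ch!r} is outside the encoding range")
        bits = (bits << 6) | value  # append 6 bits
    # 24 bits become 3 bytes; drop any padding bytes at the end.
    return bits.to_bytes(3, "big")[:count]


print(decode_group("00$!", 1))
print(decode_group("04(!", 2))
print(decode_group("04)#"))
```

Because the caller states how many bytes are real, the decoder never needs to know what value was used for padding.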

Encoding output file formatting rules:

  • The first line must be "begin ooo filename", where "ooo" is the Unix file access mode in octal and "filename" is the name of the input data file.
  • The encoded output characters are grouped into lines of 60 characters each.
  • A counter byte is inserted at the beginning of each line. It records the number of input data bytes encoded on that line. A value of 32 (\x20) is added to this byte, so it becomes a printable character.
  • For a full line of 60 output characters, the leading counter byte is "M", because the line encodes 45 input bytes, and 45 plus 32 is 77, the ASCII value of "M". So every output line starts with "M" except the last one, which has a smaller counter value if it encodes fewer than 45 input bytes.
  • Two extra lines end the output file. The first contains a single \x20 byte (a space, which is a counter value of 0). The second contains "end".
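Putting the formatting rules together, a complete encoder might look like the Python sketch below. It is a minimal illustration under the rules listed in this section only. The function names, the default mode of 644 and the file name used in the usage line are assumptions made for the example.

```python
PAD = b"\x01"          # pad byte as described in this section
BYTES_PER_LINE = 45    # 45 input bytes produce 60 output characters


def encode_block(block: bytes) -> str:
    """Encode 1 to 3 input bytes as 4 printable characters."""
    padded = block + PAD * (3 - len(block))
    bits = int.from_bytes(padded, "big")
    return "".join(chr(((bits >> s) & 0x3F) + 32) for s in (18, 12, 6, 0))


def uuencode(data: bytes, filename: str, mode: int = 0o644) -> str:
    """Return the full encoded text for `data` (mode default is an assumption)."""
    lines = [f"begin {mode:o} {filename}"]
    for start in range(0, len(data), BYTES_PER_LINE):
        chunk = data[start:start + BYTES_PER_LINE]
        body = "".join(
            encode_block(chunk[i:i + 3]) for i in range(0, len(chunk), 3)
        )
        # Counter byte: number of input bytes on this line, plus 32.
        lines.append(chr(len(chunk) + 32) + body)
    lines.append(" ")    # counter value 0 + 32 = \x20
    lines.append("end")
    return "\n".join(lines) + "\n"


print(uuencode(b"ABC", "abc.txt"), end="")
```

For a 3-byte input the counter byte is 3 + 32 = 35, which is "#", so the single data line is "#" followed by the 4 encoded characters.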

Dr. Herong Yang, updated in 2010