RFC 1421 - Privacy Enhancement for Email

This section provides a summary of 'RFC 1421 - Privacy Enhancement for Internet Electronic Mail: Part I: Message Encryption and Authentication Procedures' and how the Base64 encoding algorithm was finalized in this RFC.

As background, let's take a look at the RFC (Request for Comments) document in which the Base64 encoding algorithm was finalized. In 1993, John Linn submitted "RFC 1421 - Privacy Enhancement for Internet Electronic Mail: Part I: Message Encryption and Authentication Procedures". In this RFC, John proposed message encryption and authentication procedures to provide privacy-enhanced mail (PEM) services for electronic mail transfer on the Internet.

The proposed procedure consists of 4 main steps:

Step 1: Local Form - This step takes the email message text in the system's native character set, with lines delimited in accordance with local convention.

Step 2: Canonical Form - This step converts the local email message text to a universal canonical form, similar to the inter-SMTP representation as defined in RFC 821 and RFC 822. This step assures that the message text is represented with the ASCII character set and "<CR><LF>" line delimiters, but does not perform the dot-stuffing transformation.

Step 3: Authentication and Encryption - This step processes the canonicalized email message text with the selected MIC (Message Integrity Check) and encryption algorithms as needed.

Step 4: Printable Encoding - This step encodes the encrypted email message text into characters which are universally representable (or printable) at all sites, though not necessarily with the same bit patterns (e.g., although the character "E" is represented in an ASCII-based system as hexadecimal 45 and as hexadecimal C5 in an EBCDIC-based system, the local significance of the two representations is equivalent).
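As an illustration of Step 2, here is a minimal Python sketch of canonicalization. It is not taken from the RFC: the function name canonicalize is hypothetical, and the sketch assumes that every line, including the last one, is terminated with "<CR><LF>" and that the message contains ASCII characters only.

```python
def canonicalize(local_text: str) -> bytes:
    """Convert locally delimited text to ASCII bytes with <CR><LF> line ends.

    Assumption for this sketch: every line, including the last one,
    is terminated with <CR><LF>.
    """
    # splitlines() recognizes "\n", "\r\n" and "\r" (plus a few other
    # Unicode line boundaries) and drops the delimiters themselves.
    lines = local_text.splitlines()
    # encode("ascii") raises UnicodeEncodeError for non-ASCII input,
    # which this sketch treats as an error rather than transliterating.
    return "".join(line + "\r\n" for line in lines).encode("ascii")


print(canonicalize("Hello\nWorld"))
```

A Unix-style input such as "Hello\nWorld" and a Windows-style input such as "Hello\r\nWorld\r\n" both produce the same canonical bytes, which is the point of this step.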

It was in Step 4 that John proposed the Base64 encoding algorithm as described below in this RFC:

A 64-character subset of International Alphabet IA5 is used, enabling
6 bits to be represented per printable character.  (The proposed
subset of characters is represented identically in IA5 and ASCII.)
The character "=" signifies a special processing function used for
padding within the printable encoding procedure.

To represent the encapsulated text of a PEM message, the encoding
function's output is delimited into text lines (using local
conventions), with each line except the last containing exactly 64
printable characters and the final line containing 64 or fewer
printable characters.  (This line length is easily printable and is
guaranteed to satisfy SMTP's 1000-character transmitted line length
limit.) This folding requirement does not apply when the encoding
procedure is used to represent PEM header field quantities; Section
4.6 discusses folding of PEM encapsulated header fields.

The encoding process represents 24-bit groups of input bits as output
strings of 4 encoded characters. Proceeding from left to right across
a 24-bit input group extracted from the output of step 3, each 6-bit
group is used as an index into an array of 64 printable characters.
The character referenced by the index is placed in the output string.
These characters, identified in Table 1, are selected so as to be
universally representable, and the set excludes characters with
particular significance to SMTP (e.g., ".", "<CR>", "<LF>").

Special processing is performed if fewer than 24 bits are available
in an input group at the end of a message.  A full encoding quantum
is always completed at the end of a message.  When fewer than 24
input bits are available in an input group, zero bits are added (on
the right) to form an integral number of 6-bit groups.  Output
character positions which are not required to represent actual input
data are set to the character "=".  Since all canonically encoded
output is an integral number of octets, only the following cases can
arise: (1) the final quantum of encoding input is an integral
multiple of 24 bits; here, the final unit of encoded output will be
an integral multiple of 4 characters with no "=" padding, (2) the
final quantum of encoding input is exactly 8 bits; here, the final
unit of encoded output will be two characters followed by two "="
padding characters, or (3) the final quantum of encoding input is
exactly 16 bits; here, the final unit of encoded output will be three
characters followed by one "=" padding character.

Value Encoding  Value Encoding  Value Encoding  Value Encoding
    0 A            17 R            34 i            51 z
    1 B            18 S            35 j            52 0
    2 C            19 T            36 k            53 1
    3 D            20 U            37 l            54 2
    4 E            21 V            38 m            55 3
    5 F            22 W            39 n            56 4
    6 G            23 X            40 o            57 5
    7 H            24 Y            41 p            58 6
    8 I            25 Z            42 q            59 7
    9 J            26 a            43 r            60 8
   10 K            27 b            44 s            61 9
   11 L            28 c            45 t            62 +
   12 M            29 d            46 u            63 /
   13 N            30 e            47 v
   14 O            31 f            48 w         (pad) =
   15 P            32 g            49 x
   16 Q            33 h            50 y

               Printable Encoding Characters
                          Table 1
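
The encoding procedure quoted above can be sketched in Python as follows. This is a minimal illustration rather than code from the RFC: the names ALPHABET and base64_encode are chosen here for the example, and the sketch covers only the 24-bit grouping and the "=" padding rules, not the folding into 64-character lines.

```python
# The 64 printable characters of Table 1, indexed by their 6-bit value.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def base64_encode(data: bytes) -> str:
    """Encode bytes as Base64 text using the Table 1 alphabet."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]  # 1, 2 or 3 input octets
        # Add zero bits on the right to complete the 24-bit group.
        group = int.from_bytes(chunk + b"\x00" * (3 - len(chunk)), "big")
        # Read four 6-bit indexes from left to right.
        chars = [ALPHABET[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # 1 octet needs 2 characters, 2 octets need 3, 3 octets need 4.
        keep = len(chunk) + 1
        # Positions that carry no input data are set to "=".
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)


print(base64_encode(b"Man"))  # 24 bits: TWFu
print(base64_encode(b"Ma"))   # 16 bits: TWE=
print(base64_encode(b"M"))    # 8 bits:  TQ==
```

The three calls at the end correspond to the three padding cases listed in the quoted text: no padding, one "=" character, and two "=" characters.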

Notes on John's RFC 1421:

1. On the sender's system, John's 4-step procedure to convert email messages from local form to transmit form can be summarized as:

Transmit_Form = Base64_Encode(Encrypt(Canonicalize(Local_Form)))

2. On the receiver's system, email messages need to be converted back in the reverse order:

Local_Form = DeCanonicalize(Decipher(Base64_Decode(Transmit_Form)))

3. John started with Base64 encoding output characters represented in any local character set, such as ASCII or EBCDIC, and later changed to ASCII only.

4. John proposed folding the Base64 encoding output into lines of 64 characters. This line length guarantees that the final email message satisfies SMTP's 1000-character transmitted line length limit and is easy to print.

5. John also proposed to apply the Base64 encoding algorithm to other MIC (Message Integrity Check) and Encryption related data elements to support PEM service.

6. John's RFC 1421 was a revised version of his RFC 1113 proposed in 1989, which was a revised version of his RFC 1040 proposed in 1988, which was a revised version of his RFC 989 proposed in 1987. The Base64 encoding algorithm was originally proposed in RFC 989 in 1987.

7. John's final version of the Base64 encoding algorithm proposed in RFC 1421 was identical to the versions in RFC 1113, RFC 1040, and RFC 989, with one change: the "*" mechanism for embedded clear text was removed.
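
The printable encoding and line folding described in the notes above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names fold, to_transmit_lines, and from_transmit_lines are hypothetical, Python's standard base64 module stands in for the RFC's encoder, and the encryption step is left out, so the input is simply treated as the output of Step 3.

```python
import base64
from typing import List


def fold(encoded: str, width: int = 64) -> List[str]:
    """Split Base64 text into lines of `width` characters.

    Every line except the last has exactly `width` characters;
    the last line has `width` or fewer.
    """
    return [encoded[i:i + width] for i in range(0, len(encoded), width)]


def to_transmit_lines(processed: bytes) -> List[str]:
    """Base64-encode the Step 3 output and fold it into 64-character lines."""
    return fold(base64.b64encode(processed).decode("ascii"))


def from_transmit_lines(lines: List[str]) -> bytes:
    """Reverse the folding and the Base64 encoding on the receiver's side."""
    return base64.b64decode("".join(lines))


payload = bytes(range(100))          # stand-in for the output of Step 3
lines = to_transmit_lines(payload)
print([len(line) for line in lines])  # [64, 64, 8]
print(from_transmit_lines(lines) == payload)  # True
```

The 100 input bytes become 136 Base64 characters, which fold into two full lines of 64 characters and a final line of 8 characters. Joining the lines and decoding them restores the original bytes.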

Here is an example of an email message in transmit form with multiple Base64 encoded data elements, taken from this RFC:

Proc-Type: 4,ENCRYPTED
Content-Domain: RFC822
DEK-Info: DES-CBC,BFF968AA74691AC1
Key-Info: RSA,
Key-Info: RSA,


For more information, visit the RFC 1421 website at https://tools.ietf.org/html/rfc1421.
