Data Encoding Tutorials - Herong's Tutorial Examples - v5.23, by Herong Yang
RFC 1421 - Privacy Enhancement for Email
This section provides a summary of 'RFC 1421 - Privacy Enhancement for Internet Electronic Mail: Part I: Message Encryption and Authentication Procedures' and how the Base64 encoding algorithm was finalized in this RFC.
As background information, let's take a look at the RFC (Request for Comments) document in which the Base64 encoding algorithm was finalized. In 1993, John Linn submitted "RFC 1421 - Privacy Enhancement for Internet Electronic Mail: Part I: Message Encryption and Authentication Procedures". In this RFC, John proposed a message encryption and authentication procedure to provide privacy-enhanced mail (PEM) services for electronic mail transfer on the Internet.
The proposed procedure consists of 4 main steps:
Step 1: Local Form - This step takes the email message text in the system's native character set, with lines delimited in accordance with local convention.
Step 2: Canonical Form - This step converts the local email message text to a universal canonical form, similar to the inter-SMTP representation as defined in RFC 821 and RFC 822. This step assures that the message text is represented with the ASCII character set and "<CR><LF>" line delimiters, but does not perform the dot-stuffing transformation.
Step 3: Authentication and Encryption - This step processes the canonicalized email message text with the selected MIC (Message Integrity Check) and encryption algorithms as needed.
Step 4: Printable Encoding - This step encodes the encrypted email message text into characters which are universally representable (or printable) at all sites, though not necessarily with the same bit patterns (e.g., although the character "E" is represented in an ASCII-based system as hexadecimal 45 and as hexadecimal C5 in an EBCDIC-based system, the local significance of the two representations is equivalent).
It was in Step 4 that John proposed the Base64 encoding algorithm as described below in this RFC:
A 64-character subset of International Alphabet IA5 is used, enabling 6 bits to be represented per printable character. (The proposed subset of characters is represented identically in IA5 and ASCII.) The character "=" signifies a special processing function used for padding within the printable encoding procedure.

To represent the encapsulated text of a PEM message, the encoding function's output is delimited into text lines (using local conventions), with each line except the last containing exactly 64 printable characters and the final line containing 64 or fewer printable characters. (This line length is easily printable and is guaranteed to satisfy SMTP's 1000-character transmitted line length limit.) This folding requirement does not apply when the encoding procedure is used to represent PEM header field quantities; Section 4.6 discusses folding of PEM encapsulated header fields.

The encoding process represents 24-bit groups of input bits as output strings of 4 encoded characters. Proceeding from left to right across a 24-bit input group extracted from the output of step 3, each 6-bit group is used as an index into an array of 64 printable characters. The character referenced by the index is placed in the output string. These characters, identified in Table 1, are selected so as to be universally representable, and the set excludes characters with particular significance to SMTP (e.g., ".", "<CR>", "<LF>").

Special processing is performed if fewer than 24 bits are available in an input group at the end of a message. A full encoding quantum is always completed at the end of a message. When fewer than 24 input bits are available in an input group, zero bits are added (on the right) to form an integral number of 6-bit groups. Output character positions which are not required to represent actual input data are set to the character "=". Since all canonically encoded output is an integral number of octets, only the following cases can arise: (1) the final quantum of encoding input is an integral multiple of 24 bits; here, the final unit of encoded output will be an integral multiple of 4 characters with no "=" padding, (2) the final quantum of encoding input is exactly 8 bits; here, the final unit of encoded output will be two characters followed by two "=" padding characters, or (3) the final quantum of encoding input is exactly 16 bits; here, the final unit of encoded output will be three characters followed by one "=" padding character.

   Value Encoding  Value Encoding  Value Encoding  Value Encoding
       0 A            17 R            34 i            51 z
       1 B            18 S            35 j            52 0
       2 C            19 T            36 k            53 1
       3 D            20 U            37 l            54 2
       4 E            21 V            38 m            55 3
       5 F            22 W            39 n            56 4
       6 G            23 X            40 o            57 5
       7 H            24 Y            41 p            58 6
       8 I            25 Z            42 q            59 7
       9 J            26 a            43 r            60 8
      10 K            27 b            44 s            61 9
      11 L            28 c            45 t            62 +
      12 M            29 d            46 u            63 /
      13 N            30 e            47 v
      14 O            31 f            48 w         (pad) =
      15 P            32 g            49 x
      16 Q            33 h            50 y

                  Printable Encoding Characters
                             Table 1
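The encoding procedure quoted above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the RFC: the names ALPHABET, encode_quantum, base64_encode and fold are made up for this example. It pads a short final group to a full 24 bits and then keeps only the characters that carry data, which should give the same result as adding zero bits up to the next 6-bit boundary as the RFC describes.

```python
# Table 1 alphabet: 0-25 -> A-Z, 26-51 -> a-z, 52-61 -> 0-9, 62 -> "+", 63 -> "/"
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_quantum(chunk):
    """Encode 1 to 3 input bytes as one 4-character output group."""
    # Right-pad with zero bits so the group is always 24 bits wide.
    group = int.from_bytes(chunk + b"\x00" * (3 - len(chunk)), "big")
    # Read the group left to right as four 6-bit indexes into the alphabet.
    chars = [ALPHABET[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
    # 3 bytes need 4 characters, 2 bytes need 3, 1 byte needs 2;
    # the remaining positions are filled with "=".
    used = len(chunk) + 1
    return "".join(chars[:used]) + "=" * (4 - used)


def base64_encode(data):
    """Encode a byte string, 3 input bytes (24 bits) at a time."""
    return "".join(encode_quantum(data[i:i + 3]) for i in range(0, len(data), 3))


def fold(encoded, width=64):
    """Split the encoded text into lines of at most `width` characters."""
    return [encoded[i:i + width] for i in range(0, len(encoded), width)]


if __name__ == "__main__":
    # One sample for each of the three cases: 24, 16 and 8 bits in the final group.
    for sample in (b"Man", b"Ma", b"M"):
        print(sample, "->", base64_encode(sample))
```

The three samples cover the three padding cases listed in the RFC: no padding, one "=" and two "=" characters.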
Notes on John's RFC 1421:
1. On the sender's system, John's 4-step procedure to convert email messages from local form to transmit form can be summarized as:
Transmit_Form = Base64_Encode(Encrypt(Canonicalize(Local_Form)))
2. On the receiver's system, email messages need to be converted back in the reverse order:
Local_Form = DeCanonicalize(Decipher(Base64_Decode(Transmit_Form)))
3. John first described the Base64 encoding output as characters that are universally representable, though not necessarily with the same local bit patterns (for example, ASCII or EBCDIC). He then specified the output characters as a 64-character subset of IA5, which is represented identically in ASCII.
4. John proposed to fold the Base64 encoding output into 64 characters per line. This line length guarantees that the final email message satisfies SMTP's 1000-character transmitted line length limit and is easily printable.
5. John also proposed to apply the Base64 encoding algorithm to other MIC (Message Integrity Check) and Encryption related data elements to support PEM service.
6. John's RFC 1421 was a revised version of his RFC 1113 proposed in 1989, which was a revised version of his RFC 1040 proposed in 1988, which was a revised version of his RFC 989 proposed in 1987. The Base64 encoding algorithm was originally proposed in RFC 989 in 1987.
7. John's final version of the Base64 encoding algorithm proposed in RFC 1421 was identical to the versions in RFC 1113, RFC 1040, and RFC 989, except for one change: the removal of the "*" mechanism for embedded clear text.
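The two formulas in notes 1 and 2 can be tried out with a small Python round trip. This is a rough sketch under several assumptions: the local line delimiter is taken to be "\n", the MIC calculation is left out, and toy_cipher is a made-up XOR stand-in for the real encryption algorithm (it is not secure and is not what RFC 1421 uses). All function names here are hypothetical; only base64.b64encode and base64.b64decode come from the Python standard library.

```python
import base64


def canonicalize(local_text):
    """Step 2: ASCII text with <CR><LF> delimiters (local delimiter assumed to be "\\n")."""
    return "\r\n".join(local_text.split("\n")).encode("ascii")


def decanonicalize(canonical):
    """Reverse of canonicalize: back to "\\n" delimited local text."""
    return canonical.decode("ascii").replace("\r\n", "\n")


def toy_cipher(data, key=0x5A):
    """Hypothetical stand-in for Step 3. XOR is its own inverse. NOT secure."""
    return bytes(b ^ key for b in data)


def to_transmit_form(local_text):
    # Transmit_Form = Base64_Encode(Encrypt(Canonicalize(Local_Form)))
    return base64.b64encode(toy_cipher(canonicalize(local_text))).decode("ascii")


def to_local_form(transmit_text):
    # Local_Form = DeCanonicalize(Decipher(Base64_Decode(Transmit_Form)))
    return decanonicalize(toy_cipher(base64.b64decode(transmit_text)))


if __name__ == "__main__":
    message = "Hello,\nthis is a test."
    transmit = to_transmit_form(message)
    print(transmit)
    print(to_local_form(transmit) == message)
```

The point of the sketch is the order of the calls: the receiver undoes the sender's steps from the outside in.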
Here is an example of email message in transmit form with multiple Base64 encoded data elements taken from this RFC:
-----BEGIN PRIVACY-ENHANCED MESSAGE-----
Proc-Type: 4,ENCRYPTED
Content-Domain: RFC822
DEK-Info: DES-CBC,BFF968AA74691AC1
Originator-Certificate:
 MIIBlTCCAScCAWUwDQYJKoZIhvcNAQECBQAwUTELMAkGA1UEBhMCVVMxIDAeBgNV
 BAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMQ8wDQYDVQQLEwZCZXRhIDExDzAN
 BgNVBAsTBk5PVEFSWTAeFw05MTA5MDQxODM4MTdaFw05MzA5MDMxODM4MTZaMEUx
 CzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5jLjEU
 MBIGA1UEAxMLVGVzdCBVc2VyIDEwWTAKBgRVCAEBAgICAANLADBIAkEAwHZHl7i+
 yJcqDtjJCowzTdBJrdAiLAnSC+CnnjOJELyuQiBgkGrgIh3j8/x0fM+YrsyF1u3F
 LZPVtzlndhYFJQIDAQABMA0GCSqGSIb3DQEBAgUAA1kACKr0PqphJYw1j+YPtcIq
 iWlFPuN5jJ79Khfg7ASFxskYkEMjRNZV/HZDZQEhtVaU7Jxfzs2wfX5byMp2X3U/
 5XUXGx7qusDgHQGs7Jk9W8CW1fuSWUgN4w==
Key-Info: RSA,
 I3rRIGXUGWAF8js5wCzRTkdhO34PTHdRZY9Tuvm03M+NM7fx6qc5udixps2Lng0+
 wGrtiUm/ovtKdinz6ZQ/aQ==
Issuer-Certificate:
 MIIB3DCCAUgCAQowDQYJKoZIhvcNAQECBQAwTzELMAkGA1UEBhMCVVMxIDAeBgNV
 BAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMQ8wDQYDVQQLEwZCZXRhIDExDTAL
 BgNVBAsTBFRMQ0EwHhcNOTEwOTAxMDgwMDAwWhcNOTIwOTAxMDc1OTU5WjBRMQsw
 CQYDVQQGEwJVUzEgMB4GA1UEChMXUlNBIERhdGEgU2VjdXJpdHksIEluYy4xDzAN
 BgNVBAsTBkJldGEgMTEPMA0GA1UECxMGTk9UQVJZMHAwCgYEVQgBAQICArwDYgAw
 XwJYCsnp6lQCxYykNlODwutF/jMJ3kL+3PjYyHOwk+/9rLg6X65B/LD4bJHtO5XW
 cqAz/7R7XhjYCm0PcqbdzoACZtIlETrKrcJiDYoP+DkZ8k1gCk7hQHpbIwIDAQAB
 MA0GCSqGSIb3DQEBAgUAA38AAICPv4f9Gx/tY4+p+4DB7MV+tKZnvBoy8zgoMGOx
 dD2jMZ/3HsyWKWgSF0eH/AJB3qr9zosG47pyMnTf3aSy2nBO7CMxpUWRBcXUpE+x
 EREZd9++32ofGBIXaialnOgVUn0OzSYgugiQ077nJLDUj0hQehCizEs5wUJ35a5h
MIC-Info: RSA-MD5,RSA,
 UdFJR8u/TIGhfH65ieewe2lOW4tooa3vZCvVNGBZirf/7nrgzWDABz8w9NsXSexv
 AjRFbHoNPzBuxwmOAFeA0HJszL4yBvhG
Recipient-ID-Asymmetric:
 MFExCzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5j
 LjEPMA0GA1UECxMGQmV0YSAxMQ8wDQYDVQQLEwZOT1RBUlk=,
 66
Key-Info: RSA,
 O6BS1ww9CTyHPtS3bMLD+L0hejdvX6Qv1HK2ds2sQPEaXhX8EhvVphHYTjwekdWv
 7x0Z3Jx2vTAhOYHMcqqCjA==

qeWlj/YJ2Uf5ng9yznPbtD0mYloSwIuV9FRYx+gzY+8iXd/NQrXHfi6/MhPfPF3d
jIqCJAxvld2xgqQimUzoS1a4r7kQQ5c/Iua4LqKeq3ciFzEv/MbZhA==
-----END PRIVACY-ENHANCED MESSAGE-----
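To see what one of these Base64 encoded data elements contains, the value of the "Recipient-ID-Asymmetric" field from the example above can be decoded with Python's standard base64 module. This is a small sketch: the names RECIPIENT_ID_LINES and decode_folded are made up for this example, and the decoded bytes are only printed, not parsed. The result is binary data, but a few readable strings such as "RSA Data Security, Inc." should be visible in it.

```python
import base64

# The two folded lines of the Recipient-ID-Asymmetric value, copied from the
# example message (without the trailing "," and the "66" that follow it).
RECIPIENT_ID_LINES = [
    "MFExCzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5j",
    "LjEPMA0GA1UECxMGQmV0YSAxMQ8wDQYDVQQLEwZOT1RBUlk=",
]


def decode_folded(lines):
    """Join folded lines back together, then Base64-decode the result."""
    return base64.b64decode("".join(line.strip() for line in lines))


decoded = decode_folded(RECIPIENT_ID_LINES)

if __name__ == "__main__":
    print(len(decoded), "bytes")
    print(decoded)
```

Note that the folded lines must be joined before decoding, because the 64-character line breaks are not part of the encoded data.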
For more information, visit the RFC 1421 page at https://tools.ietf.org/html/rfc1421.