MD5 Message Digest Algorithm Overview

This section describes the MD5 algorithm, a 5-step process: padding the message with '1000...', appending the message length, initializing 4 buffer words, hashing each 512-bit block in 4 rounds, and outputting the digest.

The MD5 algorithm is fully specified in "RFC 1321 - The MD5 Message-Digest Algorithm" at http://www.ietf.org/rfc/rfc1321.txt. Below is a quick overview of the algorithm.

The MD5 algorithm consists of 5 steps:

Step 1. Appending Padding Bits. The original message is "padded" (extended) so that its length (in bits) is congruent to 448, modulo 512. The padding rules are:

• A single bit "1" is always appended to the original message first, even if its length is already congruent to 448, modulo 512.
• Then zero or more bits "0" are appended to bring the length of the message up to 64 bits short of a multiple of 512.
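
The padding rules above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `pad_bits` is hypothetical, and it assumes the message is a whole number of bytes, so the "1" bit plus seven "0" bits is written as the single byte 0x80.

```python
def pad_bits(message: bytes) -> bytes:
    """Append a '1' bit, then '0' bits, until the length is 448 mod 512 bits."""
    # Assumption: byte-aligned input, so '1' + seven '0' bits is one 0x80 byte.
    padded = message + b"\x80"
    # 448 bits = 56 bytes and 512 bits = 64 bytes.
    while len(padded) % 64 != 56:
        padded += b"\x00"
    return padded


# Padded lengths for 0-, 55-, 56- and 64-byte messages: 56, 56, 120, 120 bytes
for n in (0, 55, 56, 64):
    print(n, len(pad_bits(b"a" * n)))
```

Note that a 56-byte message already has the target length, yet it still grows by a full 64 bytes, because the "1" bit is always added.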

Step 2. Appending Length. 64 bits are appended to the end of the padded message to indicate the length of the original message in bits (RFC 1321 counts bits, not bytes). The rules of appending length are:

• The length of the original message in bits is converted to its binary format of 64 bits. If overflow happens, only the low-order 64 bits are used.
• Break the 64-bit length into 2 words (32 bits each).
• The low-order word is appended first and followed by the high-order word.
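
A small Python sketch of the length-appending rules, under the assumption that the padded message is already available as bytes. The name `append_length` and its parameters are hypothetical. RFC 1321 expresses the length in bits, so the byte count is multiplied by 8 before it is truncated to 64 bits.

```python
import struct


def append_length(padded: bytes, original_len_bytes: int) -> bytes:
    """Append the original message length as two 32-bit words, low word first."""
    bit_len = (original_len_bytes * 8) & 0xFFFFFFFFFFFFFFFF  # keep low-order 64 bits
    low_word = bit_len & 0xFFFFFFFF
    high_word = bit_len >> 32
    # "<II" packs each word low-order byte first, low-order word first.
    return padded + struct.pack("<II", low_word, high_word)


# A 3-byte message ("abc") padded to 56 bytes, then completed to one 64-byte block.
padded = b"abc" + b"\x80" + b"\x00" * 52
block = append_length(padded, 3)
print(len(block))            # 64
print(block[-8:].hex())      # 1800000000000000 (24 bits = 0x18)
```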

Step 3. Initializing MD Buffer. The MD5 algorithm requires a 128-bit buffer with a specific initial value. The rules of initializing the buffer are:

• The buffer is divided into 4 words (32 bits each), named A, B, C, and D.
• Word A is initialized to: 0x67452301.
• Word B is initialized to: 0xEFCDAB89.
• Word C is initialized to: 0x98BADCFE.
• Word D is initialized to: 0x10325476.
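
RFC 1321 lists these initial values as byte sequences with the low-order byte first (01 23 45 67, and so on), which can be confusing next to the word values above. The following sketch, with hypothetical variable names, shows that both notations describe the same numbers.

```python
# Byte sequences as listed in RFC 1321, low-order byte first.
rfc_bytes = {
    "A": "01234567",
    "B": "89abcdef",
    "C": "fedcba98",
    "D": "76543210",
}

# Interpreting each sequence as a little-endian 32-bit integer gives the word value.
words = {name: int.from_bytes(bytes.fromhex(h), "little") for name, h in rfc_bytes.items()}

for name, value in words.items():
    print(f"{name} = 0x{value:08X}")
# A = 0x67452301
# B = 0xEFCDAB89
# C = 0x98BADCFE
# D = 0x10325476
```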

Step 4. Processing Message in 512-bit Blocks. This is the main step of the MD5 algorithm, which loops through the padded and appended message in blocks of 512 bits each. For each input block, 4 rounds of operations are performed with 16 operations in each round. In the pseudocode below, slightly modified from the version in RFC 1321, "+" denotes addition modulo 2**32 and "<<< s" denotes a left rotation (circular shift) by s bits:

```
Input and predefined functions:
A, B, C, D: initialized buffer words

F(X,Y,Z) = (X AND Y ) OR (NOT X AND Z)
G(X,Y,Z) = (X AND Z ) OR (Y AND NOT Z)
H(X,Y,Z) = X XOR Y XOR Z
I(X,Y,Z) = Y XOR (X OR NOT Z)

T[1, 2, ..., 64]: Array of special constants (32-bit integers) as:
T[i] = int(abs(sin(i)) * 2**32) /* i is in radians */

M[1, 2, ..., N]: Blocks of the padded and appended message

R1(a,b,c,d,X,s,i): Round 1 operation defined as:
a = b + ((a + F(b,c,d) + X + T[i]) <<< s)

R2(a,b,c,d,X,s,i): Round 2 operation defined as:
a = b + ((a + G(b,c,d) + X + T[i]) <<< s)

R3(a,b,c,d,X,s,i): Round 3 operation defined as:
a = b + ((a + H(b,c,d) + X + T[i]) <<< s)

R4(a,b,c,d,X,s,i): Round 4 operation defined as:
a = b + ((a + I(b,c,d) + X + T[i]) <<< s)

Algorithm:
For k = 1 to N do the following

AA = A
BB = B
CC = C
DD = D
(X[0], X[1], ..., X[15]) = M[k] /* Divide M[k] into 16 words */

/* Round 1. Do 16 operations. */
R1(A,B,C,D,X[ 0], 7, 1)
R1(D,A,B,C,X[ 1],12, 2)
R1(C,D,A,B,X[ 2],17, 3)
R1(B,C,D,A,X[ 3],22, 4)
R1(A,B,C,D,X[ 4], 7, 5)
R1(D,A,B,C,X[ 5],12, 6)
R1(C,D,A,B,X[ 6],17, 7)
R1(B,C,D,A,X[ 7],22, 8)
R1(A,B,C,D,X[ 8], 7, 9)
R1(D,A,B,C,X[ 9],12,10)
R1(C,D,A,B,X[10],17,11)
R1(B,C,D,A,X[11],22,12)
R1(A,B,C,D,X[12], 7,13)
R1(D,A,B,C,X[13],12,14)
R1(C,D,A,B,X[14],17,15)
R1(B,C,D,A,X[15],22,16)

/* Round 2. Do 16 operations. */
R2(A,B,C,D,X[ 1], 5,17)
R2(D,A,B,C,X[ 6], 9,18)
R2(C,D,A,B,X[11],14,19)
R2(B,C,D,A,X[ 0],20,20)
R2(A,B,C,D,X[ 5], 5,21)
R2(D,A,B,C,X[10], 9,22)
R2(C,D,A,B,X[15],14,23)
R2(B,C,D,A,X[ 4],20,24)
R2(A,B,C,D,X[ 9], 5,25)
R2(D,A,B,C,X[14], 9,26)
R2(C,D,A,B,X[ 3],14,27)
R2(B,C,D,A,X[ 8],20,28)
R2(A,B,C,D,X[13], 5,29)
R2(D,A,B,C,X[ 2], 9,30)
R2(C,D,A,B,X[ 7],14,31)
R2(B,C,D,A,X[12],20,32)

/* Round 3. Do 16 operations. */
R3(A,B,C,D,X[ 5], 4,33)
R3(D,A,B,C,X[ 8],11,34)
R3(C,D,A,B,X[11],16,35)
R3(B,C,D,A,X[14],23,36)
R3(A,B,C,D,X[ 1], 4,37)
R3(D,A,B,C,X[ 4],11,38)
R3(C,D,A,B,X[ 7],16,39)
R3(B,C,D,A,X[10],23,40)
R3(A,B,C,D,X[13], 4,41)
R3(D,A,B,C,X[ 0],11,42)
R3(C,D,A,B,X[ 3],16,43)
R3(B,C,D,A,X[ 6],23,44)
R3(A,B,C,D,X[ 9], 4,45)
R3(D,A,B,C,X[12],11,46)
R3(C,D,A,B,X[15],16,47)
R3(B,C,D,A,X[ 2],23,48)

/* Round 4. Do 16 operations. */
R4(A,B,C,D,X[ 0], 6,49)
R4(D,A,B,C,X[ 7],10,50)
R4(C,D,A,B,X[14],15,51)
R4(B,C,D,A,X[ 5],21,52)
R4(A,B,C,D,X[12], 6,53)
R4(D,A,B,C,X[ 3],10,54)
R4(C,D,A,B,X[10],15,55)
R4(B,C,D,A,X[ 1],21,56)
R4(A,B,C,D,X[ 8], 6,57)
R4(D,A,B,C,X[15],10,58)
R4(C,D,A,B,X[ 6],15,59)
R4(B,C,D,A,X[13],21,60)
R4(A,B,C,D,X[ 4], 6,61)
R4(D,A,B,C,X[11],10,62)
R4(C,D,A,B,X[ 2],15,63)
R4(B,C,D,A,X[ 9],21,64)

A = A + AA
B = B + BB
C = C + CC
D = D + DD
End of for loop

Output:
A, B, C, D: Message digest
```
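
To see the five steps working together, here is a compact Python sketch of the whole algorithm. It is for study only and is not the RFC 1321 reference code. The names `md5_sketch`, `rotl`, `T` and `S` are chosen for this example, and the input is assumed to be a byte string. Instead of writing out 64 operations, it loops 64 times, rotates the roles of the four words after each operation, and computes the message word index with a formula that reproduces the index sequence of each round.

```python
import math
import struct

MASK = 0xFFFFFFFF  # keeps every word within 32 bits

# T[i] = int(abs(sin(i)) * 2**32), i = 1..64 in radians (stored 0-based here)
T = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]

# Shift amounts: each round cycles through its own 4 values
S = ([7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 +
     [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4)


def rotl(x: int, s: int) -> int:
    """Rotate the 32-bit value x left by s bits (the <<< operator)."""
    return ((x << s) | (x >> (32 - s))) & MASK


def md5_sketch(message: bytes) -> str:
    # Steps 1 and 2: pad to 448 mod 512 bits, then append the 64-bit bit length
    data = message + b"\x80"
    data += b"\x00" * ((56 - len(data)) % 64)
    data += struct.pack("<Q", (len(message) * 8) & 0xFFFFFFFFFFFFFFFF)

    # Step 3: initial buffer words
    A, B, C, D = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476

    # Step 4: process each 512-bit (64-byte) block
    for offset in range(0, len(data), 64):
        X = struct.unpack("<16I", data[offset:offset + 64])
        a, b, c, d = A, B, C, D
        for i in range(64):
            if i < 16:    # Round 1 uses F
                f, k = (b & c) | (~b & d), i
            elif i < 32:  # Round 2 uses G
                f, k = (b & d) | (c & ~d), (5 * i + 1) % 16
            elif i < 48:  # Round 3 uses H
                f, k = b ^ c ^ d, (3 * i + 5) % 16
            else:         # Round 4 uses I
                f, k = c ^ (b | ~d), (7 * i) % 16
            total = (a + (f & MASK) + X[k] + T[i]) & MASK
            # Rotate the roles of the four words instead of renaming arguments
            a, b, c, d = d, (b + rotl(total, S[i])) & MASK, b, c
        A, B, C, D = (A + a) & MASK, (B + b) & MASK, (C + c) & MASK, (D + d) & MASK

    # Step 5: words A, B, C, D in order, low-order byte first
    return struct.pack("<4I", A, B, C, D).hex()


print(md5_sketch(b""))     # d41d8cd98f00b204e9800998ecf8427e (RFC 1321 test suite)
print(md5_sketch(b"abc"))  # 900150983cd24fb0d6963f7d28e17f72 (RFC 1321 test suite)
```

For real use, a maintained library implementation is the better choice; this sketch only mirrors the pseudocode for learning purposes.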

Step 5. Output. The message digest is the contents of buffer words A, B, C, D, returned in that order with the low-order byte of each word first, for a total of 128 bits (16 bytes).
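
The byte ordering of the output step can be illustrated with a short Python sketch. The helper name `words_to_digest` is hypothetical, and the four word values are simply the well-known MD5 digest of the empty message regrouped into 32-bit words, not values taken from the text above.

```python
def words_to_digest(a: int, b: int, c: int, d: int) -> bytes:
    """Concatenate A, B, C, D in order, each written low-order byte first."""
    return b"".join(word.to_bytes(4, "little") for word in (a, b, c, d))


# Final buffer words that correspond to the digest of the empty message.
digest = words_to_digest(0xD98C1DD4, 0x04B2008F, 0x980980E9, 0x7E42F8EC)
print(digest.hex())  # d41d8cd98f00b204e9800998ecf8427e
```

Word A holds 0xD98C1DD4, yet the digest starts with d4, because the low-order byte of each word is written first.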