Data Encodings - Herong's Tutorial Examples - Version 5.13, by Dr. Herong Yang
What Is URL/URI Encoding?
This section describes what URL/URI encoding is - an encoding scheme used in URLs/URIs that encodes data into sequences of characters prefixed with a percent sign (%).
What Is URL Encoding? URL Encoding is an encoding scheme used in URLs (Uniform Resource Locators) that encodes data into sequences of characters prefixed with a percent sign (%).
URL Encoding is also called Percent Encoding, because it uses the percent sign to mark each encoded sequence of characters.
URL Encoding is also called URI Encoding, because it now applies to both subsets of the URI (Uniform Resource Identifier) set: URL (Uniform Resource Locator) and URN (Uniform Resource Name).
According to RFC 3986 - "Uniform Resource Identifier (URI): Generic Syntax", http://tools.ietf.org/html/rfc3986, Percent Encoding follows these rules:
1. The URI character set consists of two groups of characters: reserved characters and unreserved characters.
2. There are 18 reserved characters in the URI character set:
! * ' ( ) ; : @ & = + $ , / ? # [ ]
3. There are 66 unreserved characters in the URI character set:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 - _ . ~
4. When a reserved character is used in a URI for something other than its reserved purpose, it needs to be encoded as %xx, where xx is a pair of hexadecimal digits representing the ASCII code of that character.
5. When a character outside the URI character set is used in a URI, it needs to be converted to a byte sequence using the UTF-8 encoding. Then each byte of the sequence is encoded as %xx, where xx is a pair of hexadecimal digits representing the value of that byte.
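The rules above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the names UNRESERVED and percent_encode are made up for this example, and the function assumes that every reserved character in the input is being used as plain data, so it encodes all of them.

```python
# The 66 unreserved characters from rule 3.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)

def percent_encode(text: str) -> str:
    """Hypothetical helper: encode every character that is not unreserved."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            # Unreserved characters are copied through unchanged.
            out.append(ch)
        else:
            # Rules 4 and 5: take the UTF-8 bytes of the character
            # and write each byte as %XX in hexadecimal.
            for byte in ch.encode("utf-8"):
                out.append(f"%{byte:02X}")
    return "".join(out)

print(percent_encode("a&b"))  # a%26b
print(percent_encode("é"))    # %C3%A9
```

The reserved character "&" is a single ASCII byte and becomes one %xx group, while "é" is outside the URI character set and becomes two groups, one for each of its UTF-8 bytes.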
Example: If I want to pass the string "how are you?" as the search keywords to Google's search engine, I need to use Percent Encoding on the URL as:
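The encoded form of this string can be computed with Python's standard urllib.parse module. This is a minimal sketch that encodes only the search string itself; the variable names are chosen for illustration, and safe="" is passed so that no reserved character is left unencoded.

```python
from urllib.parse import quote

query = "how are you?"

# safe="" tells quote() not to leave any reserved character as-is.
encoded = quote(query, safe="")
print(encoded)  # how%20are%20you%3F
```

Each space becomes %20 and the question mark becomes %3F. Note that HTML form submissions often write a space as "+" instead, which urllib.parse.quote_plus produces.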