Datatypes, Values and Representations

XSD Tutorials - Herong's Tutorial Examples - Version 5.10, by Dr. Herong Yang

This section describes how built-in datatypes can be used to add type declarations to XML elements and attributes, helping the XML receiver parse the intended values out of XML documents.

There are 2 main purposes for providing XSD built-in datatypes:

A built-in datatype can be used to declare the type of an XML element or an XML attribute, so that the XML document receiver can interpret its content accurately.
Built-in datatypes can be used as the basis for new custom datatypes defined with <xs:simpleType> and <xs:complexType>.

In this section, we will concentrate on declaring an XML element or an XML attribute with a built-in datatype. Building new custom datatypes based on built-in datatypes will be discussed in detail in later sections.

First let's see an XML example without any datatype declaration:

<data>7065616365</data>

When the receiver processes this XML element, he/she will face a problem, because the content could be evaluated in multiple ways: as an integer value of 7,065,616,365; as a phone number of (706) 561-6365; or as the ASCII string "peace", if the digits are read as hexadecimal character codes (70 65 61 63 65).

The above problem can be resolved if the "data" element is declared with a built-in datatype, because each built-in datatype provides precise rules on:

What values are supported by the datatype.
How each value can be represented in the XML document.
How each representation should be evaluated into a supported value.
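
As a rough sketch of how a declaration removes the ambiguity, the schema below picks one of the three readings of the "data" element. The choice of xs:integer is only an assumption made for illustration. The commented-out alternatives show how the other readings could be declared instead, and only one declaration of "data" would be active in a real schema.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Reading 1: the content is the whole number 7065616365 -->
  <xs:element name="data" type="xs:integer"/>

  <!-- Reading 2: the content is plain text, such as a phone number
  <xs:element name="data" type="xs:string"/>
  -->

  <!-- Reading 3: the content is hex-encoded binary data, that is the
       bytes 70 65 61 63 65, which spell "peace" in ASCII
  <xs:element name="data" type="xs:hexBinary"/>
  -->

</xs:schema>
```

With xs:integer, the receiver evaluates 7065616365 as a number. With xs:hexBinary, the same ten characters are evaluated as five bytes.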

To understand how built-in datatypes and their associated rules work, we need to introduce some terminology:

Value Space - The value space of a datatype is the set of values that are supported by that datatype. We can also call it the value set of a datatype.
Lexical Space - The lexical space of a datatype is the prescribed set of strings which can be mapped to values of that datatype. We can also call it the representation set of a datatype. Each item in the set is a literal representation which can be parsed into a value in the value space.
Lexical Representation - A lexical representation is a single item in the lexical space.
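
To make the distinction concrete, the fragment below lists several lexical representations that map to the same value. It assumes two hypothetical elements, "count" declared as xs:integer and "flag" declared as xs:boolean, inside a hypothetical "examples" wrapper. None of these names come from the tutorial.

```xml
<!-- Assumed declarations (hypothetical element names):
     <xs:element name="count" type="xs:integer"/>
     <xs:element name="flag" type="xs:boolean"/>
-->
<examples>
  <!-- Three lexical representations of one value, the integer 7 -->
  <count>7</count>
  <count>+7</count>
  <count>007</count>

  <!-- Two lexical representations of one value, the boolean true -->
  <flag>true</flag>
  <flag>1</flag>
</examples>
```

The value space of xs:boolean has only two values, while its lexical space has four items: true, false, 1 and 0.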

With this terminology, we can describe how an XSD datatype is used in generating and processing XML documents:

1. A user, the XML generator, has a value V that he/she wants to communicate in an XML element E.

2. The XML generator finds that value V is in the value space of built-in datatype T.

3. The XML generator declares element E to be of datatype T in the XSD document:

<xs:element name="E" type="xs:T"/>

4. The XML generator represents value V with a lexical representation R based on the lexical space rules associated with datatype T and puts it in element E in an XML document:

<E>R</E>

5. Another user, the XML receiver, receives the XSD document and the XML document.

6. The XML receiver parses the representation R based on the lexical space rules associated with datatype T and recovers the value V.
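
The six steps can be traced with a concrete case. This is only a sketch: the element name "shipDate" and the value, the calendar date July 4, 2024, are made up for illustration, and xs:date stands in for datatype T.

```xml
<?xml version="1.0"?>
<!-- XSD document (step 3): element "shipDate" plays the role of E,
     and xs:date plays the role of T -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shipDate" type="xs:date"/>
</xs:schema>

<!-- XML document (step 4), sent alongside the schema. The lexical
     representation R of the value is the string 2024-07-04:

     <shipDate>2024-07-04</shipDate>
-->
```

In step 6, the receiver applies the xs:date rules, which prescribe the form YYYY-MM-DD with an optional time zone, and recovers the date July 4, 2024 from the string 2024-07-04.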

The following diagram illustrates how element E, datatype T, value V and representation R are related to each other:
[Figure: Datatype, Element, Value, Representation]