This section describes the most commonly used built-in datatype, 'string'. Whitespace characters are preserved in 'string' values, while XML entity references in 'string' lexical representations are expanded by the parser.
The "string" datatype and its derived datatypes are the most commonly used built-in datatypes
in XML documents. Let's take a closer look at the "string" datatype first.
"string" is a primitive datatype whose value space is the set of all finite-length sequences of characters allowed in XML.
A "string" value is written in an XML document as a sequence of characters, called its lexical representation.
Entity references and character references are allowed in "string" lexical representations,
but the XML parser expands them to obtain the final "string" values.
For example, the first 3 <String> elements in the sample XML document later in this section are all valid and represent the same "string" value.
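To see how references are expanded, here is a minimal Java sketch. It is not part of this tutorial's source code. It assumes the JDK's built-in DOM parser (javax.xml.parsers), and the class name StringEntityDemo and the helper textOf() are hypothetical names used only for illustration. It parses 3 different lexical representations and prints the value obtained from each:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the tutorial's source code.
public class StringEntityDemo {

    // Parses a small XML document held in a Java string and returns
    // the text content of its root element after the parser has
    // expanded references and CDATA sections.
    static String textOf(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<String>PI &gt; 3.14159</String>",          // entity reference
            "<String>PI &#62; 3.14159</String>",         // character reference
            "<String><![CDATA[PI > 3.14159]]></String>"  // CDATA section
        };
        for (String xml : samples) {
            // Brackets make the start and end of each value visible.
            System.out.println("[" + textOf(xml) + "]");
        }
    }
}
```

Each of the 3 lines printed should read [PI > 3.14159], which suggests that the 3 lexical representations map to the same "string" value.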
Note also that whitespace characters, tab ('\t'), carriage return ('\r'), line feed ('\n') and space (' '),
are preserved in "string" values.
For example, the next 3 <String> elements in the sample XML document are all valid, but represent 3 different "string" values.
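The whitespace rule can be checked with a similar minimal Java sketch. It again assumes the JDK's built-in DOM parser, and the class name StringWhitespaceDemo and the helpers textOf() and visible() are hypothetical names introduced for this illustration. The 3 samples differ only in the whitespace character between the two words:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the tutorial's source code.
public class StringWhitespaceDemo {

    // Returns the text content of the root element of a small XML document.
    static String textOf(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    // Replaces tab and line feed characters with visible escape sequences.
    static String visible(String s) {
        return s.replace("\t", "\\t").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String[] samples = {
            "<String>Herong Yang</String>",   // separated by a space
            "<String>Herong\nYang</String>",  // separated by a line feed
            "<String>Herong\tYang</String>"   // separated by a tab
        };
        for (String xml : samples) {
            System.out.println("[" + visible(textOf(xml)) + "]");
        }
    }
}
```

The 3 values come back unchanged and are not equal to each other. In XSD terms, this matches the whiteSpace facet of "string", which is set to "preserve".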
Here is a sample XML document, string_datatype_test.xml, that contains <String> elements to test the XSD document, string_datatype_test.xsd:
<?xml version="1.1"?>
<!-- string_datatype_test.xml
- Copyright (c) 2013, HerongYang.com, All Rights Reserved.
-->
<String_Datatype_Test>
<!-- 3 valid "string" elements represent the same value -->
<String>PI &gt; 3.14159</String>
<String>PI > 3.14159</String>
<String><![CDATA[PI > 3.14159]]></String>
<!-- 3 valid "string" elements represent different values -->
<String>Herong Yang</String>
<String>Herong
Yang</String>
<String>Herong
    Yang</String>
<!-- 1 invalid "string" element -->
<String> Hello <b>Herong</b>! </String>
</String_Datatype_Test>
Compile and run XsdSchemaValidator.java with the XSD document and this XML document.
You should see 1 error, reported on the invalid "string" element:
c:\Progra~1\Java\jdk1.7.0_07\bin\java.exe XsdSchemaValidator
string_datatype_test.xsd string_datatype_test.xml
Error:
Line number: 19
Column number: 42
Message: cvc-type.3.1.2: Element 'String' is a simple type, so it
must have no element information item [children].
Failed with errors: 1
You can modify this example to try other "string" lexical representations and values.
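If you do not have XsdSchemaValidator.java and string_datatype_test.xsd at hand, here is a minimal self-contained Java sketch you can experiment with instead. It is not the validator used above. The class name StringSchemaSketch, the helper validate() and the inline schema are assumptions made for illustration: the schema simply declares <String> elements of type xs:string inside <String_Datatype_Test>, which may differ from the real XSD document. It uses the standard javax.xml.validation API:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical sketch, not the tutorial's XsdSchemaValidator.java.
public class StringSchemaSketch {

    // Assumed schema: a root element holding any number of
    // <String> elements of the built-in type xs:string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='String_Datatype_Test'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='String' type='xs:string' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Validates the XML text against the assumed schema and
    // returns the messages of all validation errors found.
    static List<String> validate(String xml) {
        final List<String> errors = new ArrayList<String>();
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { errors.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Validation could not be completed", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String valid = "<String_Datatype_Test>"
            + "<String>PI &gt; 3.14159</String>"
            + "<String>Herong\nYang</String>"
            + "</String_Datatype_Test>";
        String invalid = "<String_Datatype_Test>"
            + "<String> Hello <b>Herong</b>! </String>"
            + "</String_Datatype_Test>";
        System.out.println("Valid document errors: " + validate(valid).size());
        for (String message : validate(invalid)) {
            System.out.println("Invalid document error: " + message);
        }
    }
}
```

With the JDK's built-in validator, the error reported for the second document is expected to be the same cvc-type.3.1.2 message shown above, because a <b> child element is not allowed inside an element of a simple type.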