Herong's Tutorial Notes On XML Technologies
Dr. Herong Yang, Version 3.04

XML Syntax


This tutorial describes:

  • Types of Information in XML Files
  • XML File Syntax Rules
  • The "xml" Processing Instruction
  • dictionary.xml - A Simple XML File

Types of Information in XML Files

There are 6 types of information in an XML file:

1. Processing Instruction: Used to pass an instruction to the applications that process this file. Processing instructions are written in the following syntax:

<?target instruction?>

where "target" is the name of a target group of applications expected to use this instruction, and "instruction" is the actual instruction to be passed to those applications.
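As a minimal sketch of how an application might read a processing instruction, the Python snippet below parses a small document with the standard xml.dom.minidom module and splits the instruction into its two parts. The "xml-stylesheet" target and its data are illustrative assumptions, not taken from this tutorial.

```python
from xml.dom import minidom, Node

# Hypothetical sample: the target name and instruction text are for illustration only.
source = '<?xml-stylesheet type="text/xsl" href="style.xsl"?><root/>'
doc = minidom.parseString(source)

# The processing instruction is the first child of the document node.
pi = doc.firstChild
is_pi = pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE

target = pi.target      # the "target" part of <?target instruction?>
instruction = pi.data   # the "instruction" part, passed through unparsed

print(is_pi)        # True
print(target)       # xml-stylesheet
print(instruction)  # type="text/xsl" href="style.xsl"
```

The parser does not interpret the instruction text; it is up to the targeted application to decide what it means.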

2. Comments: Used only to annotate the XML file. Comments are ignored by the applications that process this file. Comments are written in the following syntax:

<!--comment-->

where "comment" is the text of the comment.
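One way to see that comments are ignored is to parse a document with Python's xml.etree.ElementTree, whose default parser drops them. This is a small sketch; the element names in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a comment followed by one child element.
source = '<note><!--reviewed by an editor--><to>Ann</to></note>'
root = ET.fromstring(source)

# The default parser skips the comment, so only the <to> element remains.
child_tags = [child.tag for child in root]
print(child_tags)  # ['to']
```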

3. XML Elements: Used to present a unit of information, with a name, an optional body, and optional attributes. Elements are written in one of the following forms:

<tag/>
<tag attributes/>
<tag>content</tag>
<tag attributes>content</tag>

where "tag" is the name of the element, "content" is a string of text or text mixed with XML elements, and "attributes" is either a single name-value pair or a list of name-value pairs separated by whitespace, written in the following syntax:

name="value"
name_1="value 1" name_2="value 2" ... name_n="value n"
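The four element forms can be checked with a short Python sketch that parses one example of each and lists the name, attributes, and content that the parser reports. The tag and attribute names used here are hypothetical, and the wrapping "root" element is added only because an XML document needs a single top-level element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one element for each of the four forms.
source = (
    '<root>'
    '<a/>'
    '<b id="1"/>'
    '<c>text</c>'
    '<d id="2" lang="en">text</d>'
    '</root>'
)
root = ET.fromstring(source)

# Collect (name, attributes, content) for each child element.
summary = [(child.tag, dict(child.attrib), child.text) for child in root]
for row in summary:
    print(row)
# ('a', {}, None)
# ('b', {'id': '1'}, None)
# ('c', {}, 'text')
# ('d', {'id': '2', 'lang': 'en'}, 'text')
```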

4. Mixed Text: A string of text, or text mixed with XML elements, used as the content of an element. Examples of mixed text:

This mixed text only contains characters.
This mixed text <br/>contains characters and <b s="1">elements</b>.
This mixed text contains entities, &amp;, &lt;.
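To illustrate how a parser splits mixed text, the sketch below wraps the second example in a hypothetical "p" element and parses it with Python's xml.etree.ElementTree. In that library, the text before the first child element is stored on the parent, and the text following each child is stored as the child's "tail".

```python
import xml.etree.ElementTree as ET

# The <p> wrapper is an assumption added so the mixed text has a parent element.
source = '<p>This mixed text <br/>contains characters and <b s="1">elements</b>.</p>'
p = ET.fromstring(source)

leading_text = p.text   # text before the first child element
br, b = p[0], p[1]      # the two child elements, in document order

print(repr(leading_text))  # 'This mixed text '
print(repr(br.tail))       # 'contains characters and '
print(repr(b.text))        # 'elements'
print(repr(b.tail))        # '.'

# itertext() walks the pieces in document order, giving the text without markup.
plain = ''.join(p.itertext())
print(plain)  # This mixed text contains characters and elements.
```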

5. XML Entities: Special escape sequences that represent XML reserved characters. XML entities can be used in element content and attribute values. These are the XML pre-defined entities and the reserved characters they represent:

Entity   Character
&amp;    &
&apos;   '
&gt;     >
&lt;     <
&quot;   "
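As a rough sketch of the table in use, the Python snippet below escapes a string containing all five reserved characters and then confirms that an XML parser turns the entities back into the original characters. The sample string and the "t" element are hypothetical. Note that escape() from xml.sax.saxutils handles only &, < and > by default, so the two quote characters are supplied through its optional mapping argument.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical sample text containing all five reserved characters.
raw = 'Tom & "Jerry" <cat> \'mouse\''

# Escape &, <, > (default) plus the two quote characters (extra mapping).
escaped = escape(raw, {'"': '&quot;', "'": '&apos;'})
print(escaped)
# Tom &amp; &quot;Jerry&quot; &lt;cat&gt; &apos;mouse&apos;

# Parsing the escaped text as element content restores the original characters.
parsed = ET.fromstring('<t>' + escaped + '</t>').text
print(parsed == raw)  # True
```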

6. CDATA Section: A section of text in which XML reserved characters are treated as normal characters. CDATA sections are written in the following syntax:

<![CDATA[
text line
...
]]>
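The effect of a CDATA section can be sketched in Python as follows: the reserved characters inside the section are written without entities, and the parser returns them unchanged as ordinary text. The "code" element and its content are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: reserved characters <, > and & appear unescaped inside CDATA.
source = '<code><![CDATA[if (a < b && b > c) { run(); }]]></code>'
elem = ET.fromstring(source)

# The parser delivers the CDATA content as plain character data.
content = elem.text
print(content)  # if (a < b && b > c) { run(); }
```

The same content written outside a CDATA section would need &lt;, &gt; and &amp; in place of the reserved characters.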

(Continued on next part...)


Dr. Herong Yang, updated in 2006