XML Syntax
(Continued from previous part...)
XML File Syntax Rules
Syntax rules of an XML file:
- Two XML elements can be nested by including one element as part of the content
of the other element.
- There must be exactly one element, called the root element, that is not nested
inside any other element.
- There should be one "xml" processing instruction before the root element. XML 1.0
recommends it, but does not strictly require it.
- Values of element attributes must be enclosed in quotes: either two double quotes
or two single quotes.
- Element tag names and attribute names must be made of alphanumeric characters
and 4 additional characters: "-", "_", ":", and ".". A name cannot start with a
digit, "-", or ".".
- The "instructions" in a processing instruction such as "xml" are written in the
same syntax as the attributes of an element.
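The rules above can be tried out with any XML parser. Here is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree module. The helper name is_well_formed and the sample strings are made up for this illustration; they are not part of the original notes.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, each exercising one of the rules.
samples = {
    "nested elements": '<?xml version="1.0"?><a><b>text</b></a>',
    "two root elements": '<?xml version="1.0"?><a/><b/>',
    "overlapping tags": '<?xml version="1.0"?><a><b></a></b>',
    "unquoted attribute": '<?xml version="1.0"?><a x=1/>',
    "quoted attribute": '<?xml version="1.0"?><a x="1"/>',
    "name with - _ .": '<?xml version="1.0"?><my-tag_1.0/>',
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

The samples with two root elements, overlapping tags, or an unquoted attribute value should be rejected, while the other three should be accepted.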
That's it. As I said earlier, XML syntax is very simple.
The "xml" Processing Instruction
Every XML file should contain one "xml" processing instruction at the very beginning
of the file to declare that this file is an XML file. XML 1.0 parsers also accept
files without it, but including it is recommended.
There is one required attribute for the "xml" processing instruction: version="1.0",
indicating the version number of XML. In this note, we are learning XML version 1.0.
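To see how a parser treats the "xml" processing instruction, here is a small sketch, again assuming Python's standard xml.etree.ElementTree module. The helper name parses and the three sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if text is accepted by the parser, False on a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The declaration with its required version attribute, at the very beginning.
good = '<?xml version="1.0"?><note/>'

# The declaration without the required version attribute.
no_version = '<?xml encoding="UTF-8"?><note/>'

# The declaration pushed off the beginning of the file by a blank line.
not_first = '\n<?xml version="1.0"?><note/>'

for label, text in [("good", good), ("no_version", no_version),
                    ("not_first", not_first)]:
    print(label, "->", parses(text))
```

Only the first string should be accepted: the declaration needs its version attribute, and nothing, not even white space, may come before it.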
dictionary.xml - A Simple XML File
Here is a simple XML file that represents a glossary with only two words defined:
<?xml version="1.0"?>
<!-- dictionary.xml
 Copyright (c) 2002 by Dr. Herong Yang
-->
<dictionary>
 <word acronym="true">
  <name>XML</name>
  <definition reference="Herong&apos;s Notes">eXtensible Markup
Language.</definition>
  <update date="2002-12-23"/>
 </word>
 <word symbol="true">
  <name>&lt;</name>
  <definition>Mathematical symbol representing the "less than" logical
operation, like: 1&lt;2.</definition>
  <definition>Reserved symbol in XML to represent the beginning of
tags, like: <![CDATA[<p>Hello world!</p>]]>
  </definition>
 </word>
</dictionary>
Note that:
- A multiple-line comment is used to show the copyright information.
- "dictionary" is the root element.
- Attributes are used in elements: "word", "definition" and "update".
- "update" is an empty element with no content.
- "word" is a nested element, and it appears twice.
- Entity "&apos;" is used in attribute "reference".
- Entity "&lt;" is used in contents of elements "name" and "definition".
- A CDATA section is used in the second "definition" of the second "word", in
which "<p>" and "</p>" are not treated as XML tags.
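The notes above can be verified by loading the file into a parser. The following is a sketch only, assuming Python's standard xml.etree.ElementTree module; the variable names are my own, and the file content is embedded as a string, with the entities written out, so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

# The text of dictionary.xml, embedded as a string for this illustration.
DICTIONARY_XML = """<?xml version="1.0"?>
<!-- dictionary.xml
 Copyright (c) 2002 by Dr. Herong Yang
-->
<dictionary>
 <word acronym="true">
  <name>XML</name>
  <definition reference="Herong&apos;s Notes">eXtensible Markup
Language.</definition>
  <update date="2002-12-23"/>
 </word>
 <word symbol="true">
  <name>&lt;</name>
  <definition>Mathematical symbol representing the "less than" logical
operation, like: 1&lt;2.</definition>
  <definition>Reserved symbol in XML to represent the beginning of
tags, like: <![CDATA[<p>Hello world!</p>]]>
  </definition>
 </word>
</dictionary>"""

root = ET.fromstring(DICTIONARY_XML)
words = root.findall("word")
first, second = words

print(root.tag)                                   # dictionary
print(len(words))                                 # 2
print(first.find("definition").get("reference"))  # Herong's Notes
print(first.find("update").text)                  # None, the element is empty
print(second.find("name").text)                   # <

# Inside the CDATA section, <p> and </p> are plain text, not child elements.
cdata_definition = second.findall("definition")[1]
print("<p>Hello world!</p>" in cdata_definition.text)  # True
print(len(cdata_definition))                           # 0 child elements
```

Note that the parser hands back the entities already decoded: the program sees "'" and "<", not "&apos;" and "&lt;".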
Since XML doesn't care about how the tags are named or what information
they carry, we can reorganize the same information in many ways.
For example:
- We could move the information from the "date" attribute of the "update" element
into its content, and rewrite the "update" element like:
<update>2002-12-23</update>
- We could also move content into an attribute, like:
<name is_acronym="true" value="XML"/>
The choice is totally up to the providers and consumers of the information, who need
to define an agreed structure.
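One consequence of this freedom is that a consumer has to know which layout the provider picked. As a rough sketch, assuming Python's standard xml.etree.ElementTree module, a consumer that wants to accept both forms of "update" could look like this. The function name update_date and the two small "word" fragments are hypothetical.

```python
import xml.etree.ElementTree as ET


def update_date(word):
    """Return the update date of a "word" element, whichever layout was used."""
    update = word.find("update")
    if update is None:
        return None
    # Layout 1: <update date="2002-12-23"/>
    # Layout 2: <update>2002-12-23</update>
    return update.get("date") or (update.text or "").strip() or None


as_attribute = ET.fromstring('<word><update date="2002-12-23"/></word>')
as_content = ET.fromstring('<word><update>2002-12-23</update></word>')

print(update_date(as_attribute))  # 2002-12-23
print(update_date(as_content))    # 2002-12-23
```

Supporting both layouts is only a convenience here. In practice, it is simpler if both sides agree on one of them.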