Herong's Tutorial Notes On XML Technologies
Dr. Herong Yang, Version 3.04

Document Object Model (DOM)

This tutorial describes:

  • What is Document Object Model (DOM)
  • Parsing XML Files with DOM
  • The DOM Tree Structure
  • Building a New DOM Document
  • Converting DOM Documents to XML Files

What is Document Object Model (DOM)

DOM: An Application Programming Interface (API) that represents an XML file as a document object, which allows application programs to manage the information contained in the document object.

J2SDK 1.4.1_01, which is already installed on my system, includes a Java implementation of DOM. So I am ready to play with XML files through DOM in Java.

Parsing XML Files with DOM

Here is a program to show how different packages are used together to parse an XML file into a DOM document object:

/**
 * DOMParser.java
 * Copyright (c) 2002 by Dr. Herong Yang
 */
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.*;
class DOMParser {
   public static void main(String[] args) {
      // The XML file name is expected as the first argument
      if (args.length < 1) {
         System.out.println("Usage: java DOMParser xml_file");
         return;
      }
      try {
         File x = new File(args[0]);
         // Get the factory implementation configured for this JVM
         DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
         System.out.println(f.toString());
         // Ask the factory for a builder (the DOM parser)
         DocumentBuilder b = f.newDocumentBuilder();
         System.out.println(b.toString());
         // Parse the file into a DOM document object
         Document d = b.parse(x);
         System.out.println(d.toString());
         // Look up the DOM implementation behind the document
         DOMImplementation i = d.getImplementation();
         System.out.println(i.toString());
      } catch (ParserConfigurationException e) {
         System.out.println(e.toString());
      } catch (SAXException e) {
         System.out.println(e.toString());
      } catch (IOException e) {
         System.out.println(e.toString());
      }
   }
}

Output:

org.apache.crimson.jaxp.DocumentBuilderFactoryImpl@1c78e57
org.apache.crimson.jaxp.DocumentBuilderImpl@13e8d89
org.apache.crimson.tree.XmlDocument@1cfb549
org.apache.crimson.tree.DOMImplementationImpl@1820dda

Note that:

  • javax.xml.parsers.DocumentBuilderFactory.newInstance() method is used to create a new factory instance using a factory implementation from the org.apache.crimson.jaxp.* package.
  • javax.xml.parsers.DocumentBuilderFactory.newDocumentBuilder() method is used to create a new builder instance using a builder implementation from the org.apache.crimson.jaxp.* package.
  • javax.xml.parsers.DocumentBuilder.parse() method is used to parse the XML file into an org.w3c.dom.Document object implemented by the org.apache.crimson.tree.XmlDocument class.
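
As a minimal sketch of the same factory, builder and parse sequence, the following program parses XML text held in a string instead of a file. The class name DOMParseString, the helper method rootName() and the sample XML are made up for illustration, and StandardCharsets and the multi-catch clause assume a JDK newer than the J2SDK 1.4 used above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class DOMParseString {
   // Hypothetical helper: parses XML text and returns the root element name.
   static String rootName(String xml) {
      try {
         DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
         DocumentBuilder b = f.newDocumentBuilder();
         // parse() also accepts an InputStream, so no file is needed
         byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
         Document d = b.parse(new ByteArrayInputStream(bytes));
         return d.getDocumentElement().getNodeName();
      } catch (ParserConfigurationException | SAXException | IOException e) {
         // Wrapped only to keep the sketch short
         throw new IllegalArgumentException(e);
      }
   }

   public static void main(String[] args) {
      System.out.println(rootName("<note><to>Reader</to></note>"));
   }
}
```

The implementation classes behind the factory and the builder depend on the JDK in use, so they may differ from the Crimson classes printed above.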

The DOM Tree Structure

In DOM, an XML file is represented with a tree structure, called "document". Every piece of information in an XML file is abstracted as a Node object, and represented by a node in the tree.

Node is actually an interface. It is extended by more specific DOM interfaces, such as Document, Element, Attr and Text, to represent different types of information in an XML file. Features that are common to all of them are defined as methods in the Node interface. Major get methods of Node include:

  • getNodeType(): Returns the node type.
  • getNodeName(): Returns the node name.
  • getNodeValue(): Returns the value associated with this node, or null for node types that have no value, such as elements.
  • getChildNodes(): Returns a NodeList of the nodes nested directly inside this node.
  • getAttributes(): Returns a NamedNodeMap of the attributes of this node if it is an element, or null otherwise.
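
The get methods listed above are enough to walk a whole DOM tree. Here is a rough sketch that records the type, name and value of every node, with attributes marked by "@". The class name DOMWalker, its methods and the sample XML are invented for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DOMWalker {
   // Appends "<prefix><type> <name> = <value>" for one node.
   static void line(Node n, String prefix, StringBuilder out) {
      out.append(prefix).append(n.getNodeType()).append(' ')
         .append(n.getNodeName()).append(" = ")
         .append(n.getNodeValue()).append('\n');
   }

   // Visits a node, then its attributes (if any), then its children.
   static void walk(Node n, String indent, StringBuilder out) {
      line(n, indent, out);
      NamedNodeMap attrs = n.getAttributes(); // null unless n is an element
      if (attrs != null) {
         for (int i = 0; i < attrs.getLength(); i++) {
            line(attrs.item(i), indent + "  @ ", out);
         }
      }
      NodeList kids = n.getChildNodes();
      for (int i = 0; i < kids.getLength(); i++) {
         walk(kids.item(i), indent + "  ", out);
      }
   }

   // Hypothetical helper: parses XML text and returns the dump as a string.
   static String dump(String xml) {
      try {
         Document d = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
         StringBuilder out = new StringBuilder();
         walk(d, "", out);
         return out.toString();
      } catch (Exception e) {
         // Wrapped only to keep the sketch short
         throw new IllegalArgumentException(e);
      }
   }

   public static void main(String[] args) {
      System.out.print(dump("<a id=\"1\">hi</a>"));
   }
}
```

For the sample XML, the dump starts with the document node itself (type 9, name "#document"), followed by the element "a", its attribute "id" and the text node "#text" holding "hi".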

Here is a list of node types that are supported by DOM, each with the numeric value of its constant in the Node interface:

 2 ATTRIBUTE_NODE
 4 CDATA_SECTION_NODE
 8 COMMENT_NODE
11 DOCUMENT_FRAGMENT_NODE
 9 DOCUMENT_NODE
10 DOCUMENT_TYPE_NODE
 1 ELEMENT_NODE
 6 ENTITY_NODE
 5 ENTITY_REFERENCE_NODE
12 NOTATION_NODE
 7 PROCESSING_INSTRUCTION_NODE
 3 TEXT_NODE
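
Since getNodeType() returns only the numeric code, a small lookup can make debugging output easier to read. The sketch below maps each code back to its constant name. The class DOMNodeTypes and the method typeName() are hypothetical names, not part of the DOM API.

```java
import org.w3c.dom.Node;

class DOMNodeTypes {
   // Hypothetical helper: returns the constant name for a node type code.
   static String typeName(short type) {
      switch (type) {
         case Node.ELEMENT_NODE:                return "ELEMENT_NODE";
         case Node.ATTRIBUTE_NODE:              return "ATTRIBUTE_NODE";
         case Node.TEXT_NODE:                   return "TEXT_NODE";
         case Node.CDATA_SECTION_NODE:          return "CDATA_SECTION_NODE";
         case Node.ENTITY_REFERENCE_NODE:       return "ENTITY_REFERENCE_NODE";
         case Node.ENTITY_NODE:                 return "ENTITY_NODE";
         case Node.PROCESSING_INSTRUCTION_NODE: return "PROCESSING_INSTRUCTION_NODE";
         case Node.COMMENT_NODE:                return "COMMENT_NODE";
         case Node.DOCUMENT_NODE:               return "DOCUMENT_NODE";
         case Node.DOCUMENT_TYPE_NODE:          return "DOCUMENT_TYPE_NODE";
         case Node.DOCUMENT_FRAGMENT_NODE:      return "DOCUMENT_FRAGMENT_NODE";
         case Node.NOTATION_NODE:               return "NOTATION_NODE";
         default:                               return "UNKNOWN(" + type + ")";
      }
   }

   public static void main(String[] args) {
      // Print the codes 1 to 12 with their names
      for (short t = 1; t <= 12; t++) {
         System.out.println(t + " " + typeName(t));
      }
   }
}
```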


Dr. Herong Yang, updated in 2006