Herong's Tutorial Notes On XML Technologies
Dr. Herong Yang, Version 3.04

Document Object Model (DOM)

(Continued from previous part...)

The following program illustrates how an XML file can be parsed into a DOM document tree, and how the get methods of the Node interface can be used to browse the tree:

/**
 * DOMBrowser.java
 * Copyright (c) 2002 by Dr. Herong Yang
 */
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.*;
class DOMBrowser {
   public static void main(String[] args) {
      if (args.length < 1) {
         System.out.println("Usage: java DOMBrowser xml_file");
         return;
      }
      try {
         // Parse the XML file into a DOM document tree
         File x = new File(args[0]);
         DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
         DocumentBuilder b = f.newDocumentBuilder();
         Document d = b.parse(x);
         // Walk the tree, starting from the Document node itself
         printNode(d, "");
      } catch (ParserConfigurationException e) {
         System.out.println(e.toString());
      } catch (SAXException e) {
         System.out.println(e.toString());
      } catch (IOException e) {
         System.out.println(e.toString());
      }
   }
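   // Prints "name: type, child count, attribute count, value" for node n,
   // then recurses into its attributes and its child nodes.
   // p is the indentation prefix for the current level.
   static void printNode(Node n, String p) {
      NodeList l = n.getChildNodes();
      NamedNodeMap m = n.getAttributes();
      int ml = -1; // -1 means this kind of node cannot have attributes
      if (m!=null) ml = m.getLength();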
      System.out.println(p+n.getNodeName()+": "+n.getNodeType()+", "
         +l.getLength()+", "+ml+", "+n.getNodeValue());
      for (int i=0; i<ml; i++) {
         Node c = m.item(i);
         printNode(c,p+" |-");
      }
      for (int i=0; i<l.getLength(); i++) {
         Node c = l.item(i);
         printNode(c,p+" ");
      }
   }
}

Now let's use this program to browse my first XML file, hello.xml:

<?xml version="1.0"?>
<body>Hello world!</body>

Running DOMBrowser on this file gives the following output:

#document: 9, 1, -1, null
 body: 1, 1, 0, null
  #text: 3, 0, -1, Hello world!

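Each output line shows the node name, followed by the node type, the number of child nodes, the number of attributes (-1 if the node cannot have attributes), and the node value. Here is how to read the output: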

  • The Document object is also a Node object. It is represented by the first line in the output, with type 9 (Node.DOCUMENT_NODE).
  • The XML declaration, <?xml version="1.0"?>, does not appear as a node in the document tree.
  • The second line in the output says that the root element is named "body", is of type 1 (Node.ELEMENT_NODE), has 1 child node, has 0 attributes, and has no value.
  • The third line in the output says that there is a child node nested inside the "body" node. The child node is called "#text", is of type 3 (Node.TEXT_NODE), has 0 child nodes, cannot have any attributes, and has the string "Hello world!" as its value.
  • Note that the text enclosed by the "body" tags is parsed into a node separate from the "body" node. So how can we link that text with the tag name "body"?
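
One possible answer to the question above, as a minimal sketch: the text is reachable from the "body" element, either as the value of its first child node or through Node.getTextContent(), which is available in DOM Level 3 (Java 5 and later). The class name DOMElementText and the helpers parse() and firstTextOf() are made up for this illustration and are not part of DOMBrowser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DOMElementText {
   // Hypothetical helper: parses an XML string into a DOM document.
   static Document parse(String xml) {
      try {
         DocumentBuilder b
            = DocumentBuilderFactory.newInstance().newDocumentBuilder();
         return b.parse(new InputSource(new StringReader(xml)));
      } catch (Exception e) {
         throw new IllegalStateException(e);
      }
   }

   // Hypothetical helper: returns the value of the first child of e
   // if that child is a text node, or null otherwise.
   static String firstTextOf(Element e) {
      Node c = e.getFirstChild();
      if (c != null && c.getNodeType() == Node.TEXT_NODE) {
         return c.getNodeValue();
      }
      return null;
   }

   public static void main(String[] args) {
      Document d = parse("<?xml version=\"1.0\"?><body>Hello world!</body>");
      Element root = d.getDocumentElement();
      // Walk from the element down to its text node
      System.out.println(root.getNodeName() + " = " + firstTextOf(root));
      // Or let DOM concatenate all text below the element
      System.out.println(root.getNodeName() + " = " + root.getTextContent());
   }
}
```

The first approach only looks at one child node, so it suits simple elements like "body" here. getTextContent() collects the text of all descendants, which may be more than is wanted for elements that contain nested elements.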

Here is another XML file with more elements, user.xml:

<?xml version="1.0"?>
<user status="active">
 <!-- This is not a real user. -->
 <first_name>John</first_name>
 <last_name>Smith</last_name>
</user>

Running DOMBrowser on this XML file gives:

#document: 9, 1, -1, null
 user: 1, 7, 1, null
  |-status: 2, 0, -1, active
  #text: 3, 0, -1,

  #comment: 8, 0, -1,  This is not a real user.
  #text: 3, 0, -1,

  first_name: 1, 1, 0, null
   #text: 3, 0, -1, John
  #text: 3, 0, -1,

  last_name: 1, 1, 0, null
   #text: 3, 0, -1, Smith
  #text: 3, 0, -1,

The output is more interesting:

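  • Line breaks and indentation between tags are also parsed into "#text" nodes. This is why node "user" has 7 child nodes: 4 whitespace text nodes, 1 comment, and 2 elements, "first_name" and "last_name".
  • For a node that represents an attribute of an element, the node type is 2 (Node.ATTRIBUTE_NODE) and the node value is the attribute value. See node "status" under "user".
  • A comment is parsed into a "#comment" node of type 8 (Node.COMMENT_NODE), with the comment text as its value.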
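
Since whitespace-only "#text" nodes are usually not interesting, a program that browses the tree may want to skip them. The following is a rough sketch of one way to do that, using the same user.xml content as an in-memory string. The class name DOMSkipWhitespace, the constant USER_XML, and the helpers parse() and countChildren() are assumptions made for this example only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DOMSkipWhitespace {
   // Same content as user.xml, kept in a string for this sketch
   static final String USER_XML = "<?xml version=\"1.0\"?>\n"
      + "<user status=\"active\">\n"
      + " <!-- This is not a real user. -->\n"
      + " <first_name>John</first_name>\n"
      + " <last_name>Smith</last_name>\n"
      + "</user>\n";

   // Hypothetical helper: parses an XML string into a DOM document.
   static Document parse(String xml) {
      try {
         DocumentBuilder b
            = DocumentBuilderFactory.newInstance().newDocumentBuilder();
         return b.parse(new InputSource(new StringReader(xml)));
      } catch (Exception e) {
         throw new IllegalStateException(e);
      }
   }

   // Counts the child nodes of n. If skipBlankText is true, text nodes
   // that contain only whitespace are not counted.
   static int countChildren(Node n, boolean skipBlankText) {
      NodeList l = n.getChildNodes();
      int count = 0;
      for (int i = 0; i < l.getLength(); i++) {
         Node c = l.item(i);
         boolean blankText = c.getNodeType() == Node.TEXT_NODE
            && c.getNodeValue().trim().isEmpty();
         if (!(skipBlankText && blankText)) count++;
      }
      return count;
   }

   public static void main(String[] args) {
      Element user = parse(USER_XML).getDocumentElement();
      System.out.println("All child nodes: " + countChildren(user, false));
      System.out.println("Without blank text: " + countChildren(user, true));
      // An attribute value can also be read directly from the element
      System.out.println("status = " + user.getAttribute("status"));
   }
}
```

With the file shown above, the two counts should be 7 and 3, matching the explanation of the DOMBrowser output.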

Building a New DOM Document

DOM documents can also be created from scratch instead of being parsed from an XML file. This can be done with the following methods:

  • DocumentBuilder.newDocument(): Creates an empty DOM document.
  • Document.createElement(): Creates a new Element node with the given tag name.
  • Document.createTextNode(): Creates a new Text node with the given string.
  • Node.appendChild(): Adds a node to the end of the current node's list of children.
  • Element.setAttribute(): Adds a new attribute to an element, or changes its value if the attribute already exists.

The following program shows how to create a simple DOM document with these methods:

/**
 * DOMNewDoc.java
 * Copyright (c) 2002 by Dr. Herong Yang
 */
import javax.xml.parsers.*;
import org.w3c.dom.*;
class DOMNewDoc {
   public static void main(String[] args) {
      try {
         DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
         DocumentBuilder b = f.newDocumentBuilder();
         // Start with an empty document and add the root element
         Document d = b.newDocument();
         Element r = d.createElement("dictionary");
         d.appendChild(r);
         Element w = d.createElement("word");
         r.appendChild(w);
         // <update date="2002-12-24"/>
         Element e = d.createElement("update");
         w.appendChild(e);
         e.setAttribute("date","2002-12-24");
         // <name is_acronym="true">DTD</name>
         e = d.createElement("name");
         w.appendChild(e);
         e.setAttribute("is_acronym","true");
         e.appendChild(d.createTextNode("DTD"));
         // <definition>Document Type Definition</definition>
         e = d.createElement("definition");
         w.appendChild(e);
         e.appendChild(d.createTextNode("Document Type Definition"));
         // Browse the new document in the same way as a parsed one
         printNode(d, "");
      } catch (ParserConfigurationException e) {
         System.out.println(e.toString());
      }
   }
   static void printNode(Node n, String p) {
      NodeList l = n.getChildNodes();
      NamedNodeMap m = n.getAttributes();
      int ml = -1;
      if (m!=null) ml = m.getLength(); 
      System.out.println(p+n.getNodeName()+": "+n.getNodeType()+", "
         +l.getLength()+", "+ml+", "+n.getNodeValue());
      for (int i=0; i<ml; i++) {
         Node c = m.item(i);
         printNode(c,p+" |-");
      }
      for (int i=0; i<l.getLength(); i++) {
         Node c = l.item(i);
         printNode(c,p+" ");
      }
   }
}
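
To look at a newly built document as XML text rather than as a node listing, one option is to serialize it with the standard javax.xml.transform API. The sketch below is not part of DOMNewDoc; the class name DOMToXml and the helpers build() and toXml() are made up for this illustration, and the exact formatting of the output may vary between Java implementations.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DOMToXml {
   // Hypothetical helper: builds a small part of the dictionary document.
   static Document build() {
      try {
         Document d = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();
         Element r = d.createElement("dictionary");
         d.appendChild(r);
         Element w = d.createElement("word");
         r.appendChild(w);
         Element e = d.createElement("name");
         w.appendChild(e);
         e.setAttribute("is_acronym", "true");
         e.appendChild(d.createTextNode("DTD"));
         return d;
      } catch (Exception ex) {
         throw new IllegalStateException(ex);
      }
   }

   // Hypothetical helper: serializes a DOM document into an XML string
   // by running it through an identity transformation.
   static String toXml(Document d) {
      try {
         Transformer t = TransformerFactory.newInstance().newTransformer();
         t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
         StringWriter w = new StringWriter();
         t.transform(new DOMSource(d), new StreamResult(w));
         return w.toString();
      } catch (Exception ex) {
         throw new IllegalStateException(ex);
      }
   }

   public static void main(String[] args) {
      System.out.println(toXml(build()));
   }
}
```

A Transformer created without a stylesheet simply copies its source to its result, which is why it can be used here to turn the DOM tree back into text.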

(Continued on next part...)

Dr. Herong Yang, updated in 2006