JDK (Java Development Kit) Tutorials
Dr. Herong Yang, Version 5.00

DOMBrowser.java - Browsing DOM Tree Structure

This section provides a tutorial example on how to write a DOM object browser, DOMBrowser.java, to browse through the DOM object tree structure and print the content at each tree node.

In DOM, an XML file is represented with a tree structure, called "document". Every piece of information in an XML file is abstracted as an org.w3c.dom.Node object, and represented by a node in the tree.

"Node" is actually an interface. It is extended by more specific DOM interfaces, such as Element, Attr, Text and Comment, which represent different types of information in an XML file. Features that are common to all node types are defined as methods in the Node interface. Major get methods of Node include:

  • getNodeType(): Returns the node type.
  • getNodeName(): Returns the node name.
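  • getNodeValue(): Returns the value associated with this node, or null if the node type has no value (an element, for example).
  • getChildNodes(): Returns a NodeList of the nodes nested directly inside this node.
  • getAttributes(): Returns a NamedNodeMap of the attribute nodes of this node, or null if this node is not an element.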
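
As a quick illustration of these get methods, here is a minimal sketch that parses a small XML string and calls each method on the root element. The class name NodeGetters, the helper parse() and the sample XML string are assumptions made for this example only; getDocumentElement() is a standard method of Document that returns the root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeGetters {
   // Hypothetical helper: parses an XML string into a DOM document.
   // Checked exceptions are wrapped to keep the demo short.
   static Document parse(String xml) {
      try {
         DocumentBuilder b
            = DocumentBuilderFactory.newInstance().newDocumentBuilder();
         return b.parse(new InputSource(new StringReader(xml)));
      } catch (Exception e) {
         throw new IllegalStateException(e);
      }
   }
   public static void main(String[] args) {
      Document d = parse("<note id=\"n1\">Hi</note>");
      Node root = d.getDocumentElement();
      System.out.println(root.getNodeType());                // 1
      System.out.println(root.getNodeName());                // note
      System.out.println(root.getNodeValue());               // null
      System.out.println(root.getChildNodes().getLength());  // 1
      System.out.println(root.getAttributes().getLength());  // 1
   }
}
```

The element itself has no value; its text "Hi" is held by the single child node, and its attribute "id" is reached through getAttributes(), not through getChildNodes().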

Here is a list of the node types supported by DOM, with their numeric codes:

 2 ATTRIBUTE_NODE
 4 CDATA_SECTION_NODE
 8 COMMENT_NODE
11 DOCUMENT_FRAGMENT_NODE
 9 DOCUMENT_NODE
10 DOCUMENT_TYPE_NODE
 1 ELEMENT_NODE
 6 ENTITY_NODE
 5 ENTITY_REFERENCE_NODE
12 NOTATION_NODE
 7 PROCESSING_INSTRUCTION_NODE
 3 TEXT_NODE
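
The numeric codes above are defined as short constants in the Node interface, so a program can compare getNodeType() against the constant names instead of hard-coded numbers. The following is a minimal sketch of such a mapping; the class name NodeTypeNames and the method typeName() are assumptions made for this example.

```java
import org.w3c.dom.Node;

class NodeTypeNames {
   // Maps a node type code to the name of the matching Node constant.
   static String typeName(short type) {
      switch (type) {
         case Node.ELEMENT_NODE: return "ELEMENT_NODE";
         case Node.ATTRIBUTE_NODE: return "ATTRIBUTE_NODE";
         case Node.TEXT_NODE: return "TEXT_NODE";
         case Node.CDATA_SECTION_NODE: return "CDATA_SECTION_NODE";
         case Node.ENTITY_REFERENCE_NODE: return "ENTITY_REFERENCE_NODE";
         case Node.ENTITY_NODE: return "ENTITY_NODE";
         case Node.PROCESSING_INSTRUCTION_NODE:
            return "PROCESSING_INSTRUCTION_NODE";
         case Node.COMMENT_NODE: return "COMMENT_NODE";
         case Node.DOCUMENT_NODE: return "DOCUMENT_NODE";
         case Node.DOCUMENT_TYPE_NODE: return "DOCUMENT_TYPE_NODE";
         case Node.DOCUMENT_FRAGMENT_NODE: return "DOCUMENT_FRAGMENT_NODE";
         case Node.NOTATION_NODE: return "NOTATION_NODE";
         default: return "UNKNOWN";
      }
   }
   public static void main(String[] args) {
      System.out.println(typeName(Node.DOCUMENT_NODE)); // DOCUMENT_NODE
      System.out.println(typeName((short) 3));          // TEXT_NODE
   }
}
```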

The following program illustrates how an XML file can be parsed into a DOM document tree, and how the get methods of Node can be used to browse the tree:

/**
 * DOMBrowser.java
 * Copyright (c) 2002 by Dr. Herong Yang
 */
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.*;
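class DOMBrowser {
   public static void main(String[] args) {
      // The XML file name must be given as the first argument
      if (args.length < 1) {
         System.out.println("Usage: java DOMBrowser xml_file");
         return;
      }
      try {
         File x = new File(args[0]);
         DocumentBuilderFactory f
            = DocumentBuilderFactory.newInstance();
         DocumentBuilder b = f.newDocumentBuilder();
         // Parse the XML file into a DOM document tree
         Document d = b.parse(x);
         // The Document object is the top node of the tree
         printNode(d, "");
      } catch (ParserConfigurationException e) {
         System.out.println(e.toString());
      } catch (SAXException e) {
         System.out.println(e.toString());
      } catch (IOException e) {
         System.out.println(e.toString());
      }
   }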
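   // Prints one line for node n, then recurses into its attributes
   // and its child nodes. p is the indentation prefix for this level.
   static void printNode(Node n, String p) {
      NodeList l = n.getChildNodes();
      NamedNodeMap m = n.getAttributes();
      // getAttributes() returns null for non-element nodes;
      // that case is reported as -1
      int ml = -1;
      if (m!=null) ml = m.getLength();
      // Output: name: type, child count, attribute count, value
      System.out.println(p+n.getNodeName()+": "+n.getNodeType()+", "
         +l.getLength()+", "+ml+", "+n.getNodeValue());
      // Attributes are not child nodes, so they are listed separately
      for (int i=0; i<ml; i++) {
         Node c = m.item(i);
         printNode(c,p+" |-");
      }
      for (int i=0; i<l.getLength(); i++) {
         Node c = l.item(i);
         printNode(c,p+" ");
      }
   }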
}

Now let's use this program to browse my first XML file, hello.xml:

<?xml version="1.0"?>
<body>Hello world!</body>

You will get the following output:

#document: 9, 1, -1, null
 body: 1, 1, 0, null
  #text: 3, 0, -1, Hello world!

Here is how to read the output:

  • The Document object is also a Node object, which is represented by the first line in the output.
  • The XML declaration, <?xml version="1.0"?>, is not represented as a node in the document tree.
  • The second line in the output says that the root element is named "body", is of type 1, has 1 child node, has 0 attributes, and has no value.
  • The third line in the output says that there is a child node nested inside the "body" node. The child node is called "#text", is of type 3, has 0 child nodes, cannot have attributes (shown as -1), and has the string "Hello world!" as its value.
  • Note that the text enclosed by the "body" tags is parsed into a node separate from the "body" node. The text is linked to the tag name "body" only through the parent-child relation between the two nodes.
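
The relation between an element and its text can be followed in both directions. Here is a minimal sketch, assuming the same content as hello.xml is supplied as a string; the class name TextOfElement and the helpers parse() and firstText() are hypothetical names used only for this example. getFirstChild(), getParentNode() and getTextContent() are standard Node methods; getTextContent() belongs to DOM Level 3, so it needs JDK 1.5 or later.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TextOfElement {
   // Hypothetical helper: parses an XML string into a DOM document.
   static Document parse(String xml) {
      try {
         DocumentBuilder b
            = DocumentBuilderFactory.newInstance().newDocumentBuilder();
         return b.parse(new InputSource(new StringReader(xml)));
      } catch (Exception e) {
         throw new IllegalStateException(e);
      }
   }
   // Returns the value of the first child if it is a text node,
   // or null otherwise.
   static String firstText(Node element) {
      Node c = element.getFirstChild();
      if (c != null && c.getNodeType() == Node.TEXT_NODE) {
         return c.getNodeValue();
      }
      return null;
   }
   public static void main(String[] args) {
      Node body = parse("<body>Hello world!</body>").getDocumentElement();
      Node text = body.getFirstChild();
      // From the text node up to its element
      System.out.println(text.getParentNode().getNodeName()); // body
      // From the element down to its text
      System.out.println(firstText(body));                    // Hello world!
      System.out.println(body.getTextContent());              // Hello world!
   }
}
```

Note that getTextContent() concatenates the text of all descendants, while firstText() only looks at the first child. For a simple element like "body" the two give the same result.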

Here is another XML file with more elements, user.xml:

<?xml version="1.0"?>
<user status="active">
 <!-- This is not a real user. -->
 <first_name>John</first_name>
 <last_name>Smith</last_name>
</user>

If you run DOMBrowser with this XML file, you will get:

#document: 9, 1, -1, null
 user: 1, 7, 1, null
  |-status: 2, 0, -1, active
  #text: 3, 0, -1,

  #comment: 8, 0, -1,  This is not a real user.
  #text: 3, 0, -1,

  first_name: 1, 1, 0, null
   #text: 3, 0, -1, John
  #text: 3, 0, -1,

  last_name: 1, 1, 0, null
   #text: 3, 0, -1, Smith
  #text: 3, 0, -1,

The output is more interesting:

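  • Whitespace between tags is also parsed into "#text" nodes. This is why node "user" has 7 child nodes: 4 whitespace text nodes (the line breaks and indentation between tags), 1 comment, and 2 elements, "first_name" and "last_name". Each of these "#text" values contains a line break, which is why an empty line follows it in the output.
  • For a node that represents an attribute of an element, the node value is the attribute value. See node "status" under "user".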
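
Because whitespace shows up as extra "#text" nodes, code that walks the child list usually needs to tell them apart from real content. The following is a minimal sketch of one way to do this, by checking the node type and trimming the node value. The class name ElementChildren and its helper methods are assumptions made for this example, and the content of user.xml is supplied as a string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ElementChildren {
   // Hypothetical helper: parses an XML string into a DOM document.
   static Document parse(String xml) {
      try {
         DocumentBuilder b
            = DocumentBuilderFactory.newInstance().newDocumentBuilder();
         return b.parse(new InputSource(new StringReader(xml)));
      } catch (Exception e) {
         throw new IllegalStateException(e);
      }
   }
   // True if n is a text node that contains only whitespace.
   static boolean isBlankText(Node n) {
      return n.getNodeType() == Node.TEXT_NODE
         && n.getNodeValue().trim().length() == 0;
   }
   // Counts the direct children of n that are whitespace-only text.
   static int countBlankText(Node n) {
      NodeList l = n.getChildNodes();
      int count = 0;
      for (int i=0; i<l.getLength(); i++) {
         if (isBlankText(l.item(i))) count++;
      }
      return count;
   }
   // Counts the direct children of n that are elements.
   static int countElements(Node n) {
      NodeList l = n.getChildNodes();
      int count = 0;
      for (int i=0; i<l.getLength(); i++) {
         if (l.item(i).getNodeType() == Node.ELEMENT_NODE) count++;
      }
      return count;
   }
   public static void main(String[] args) {
      String xml = "<user status=\"active\">\n"
         + " <!-- This is not a real user. -->\n"
         + " <first_name>John</first_name>\n"
         + " <last_name>Smith</last_name>\n"
         + "</user>";
      Node user = parse(xml).getDocumentElement();
      System.out.println(user.getChildNodes().getLength()); // 7
      System.out.println(countBlankText(user));             // 4
      System.out.println(countElements(user));              // 2
   }
}
```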

Last update: 2006.

Dr. Herong Yang, updated in 2008