This section provides a tutorial example on how to write an XML file browser, SAXBrowser.java. This browser implements some event handler methods provided by the SAX interface.
Let's build a simple SAX-based XML browser by handling the events defined in the
ContentHandler interface:
/**
 * SAXBrowser.java
 * Copyright (c) 2002 by Dr. Herong Yang
 */
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
class SAXBrowser {
   public static void main(String[] args) {
      try {
         // The XML file to browse is given as the first argument
         File x = new File(args[0]);
         SAXParserFactory f = SAXParserFactory.newInstance();
         SAXParser p = f.newSAXParser();
         DefaultHandler h = new MyContentHandler();
         p.parse(x,h);
      } catch (ParserConfigurationException e) {
         System.out.println(e.toString());
      } catch (SAXException e) {
         System.out.println(e.toString());
      } catch (IOException e) {
         System.out.println(e.toString());
      }
   }
   private static class MyContentHandler extends DefaultHandler {
      // Indentation prefix: one extra "_" per open element
      static String p = "_";
      public void startDocument() throws SAXException {
         System.out.println("Starting document...");
      }
      public void endDocument() throws SAXException {
         System.out.println("Ending document...");
      }
      public void startElement(String ns, String sName, String qName,
         Attributes attrs) throws SAXException {
         // Fall back to the qualified name if the local name is empty
         String eName = sName;
         if (sName.equals("")) eName = qName;
         System.out.println("e"+p+eName);
         if (attrs!=null) {
            for (int i=0; i<attrs.getLength(); i++) {
               String aName = attrs.getLocalName(i);
               if (aName.equals("")) aName = attrs.getQName(i);
               System.out.println("a"+p+" "+aName+"="
                  +attrs.getValue(i));
            }
         }
         // Go one level deeper for the content of this element
         p = p + "_";
      }
      public void endElement(String ns, String sName, String qName)
         throws SAXException {
         // Drop one "_" to return to the parent's level
         p = p.replaceFirst("__", "_");
      }
      public void characters(char[] buf, int offset, int len)
         throws SAXException {
         String s = new String(buf, offset, len);
         System.out.println("c"+p+s);
      }
      public void ignorableWhitespace(char[] buf, int offset, int len)
         throws SAXException {
         String s = new String(buf, offset, len);
         System.out.println("i"+p+s);
      }
   }
}
Note that:
- I cheated a little bit. Instead of implementing the ContentHandler interface
  directly, I extended the DefaultHandler class, which provides do-nothing
  implementations of the handling methods for all events. This way, I only need
  to override the handling methods that I am interested in.
- The "_" character is used to indent sub-elements in nested elements: the prefix
  grows by one "_" when an element starts and shrinks by one "_" when it ends.
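The indentation bookkeeping can be looked at in isolation, without a parser. The sketch below is only an illustration: the class IndentDemo, its methods open(), close() and render(), and the element names a, b and c are all hypothetical. It copies the prefix logic of startElement() and endElement() from MyContentHandler and simulates the events for an element a with two children, b and c.

```java
class IndentDemo {
    // Same starting prefix as MyContentHandler: a single "_".
    private String prefix = "_";
    private final StringBuilder out = new StringBuilder();

    // Mirrors startElement(): print at the current depth, then go one level deeper.
    void open(String name) {
        out.append("e").append(prefix).append(name).append("\n");
        prefix = prefix + "_";
    }

    // Mirrors endElement(): drop one "_" to return to the parent's depth.
    void close() {
        prefix = prefix.replaceFirst("__", "_");
    }

    // Simulates the events for <a><b/><c/></a> (hypothetical element names).
    static String render() {
        IndentDemo d = new IndentDemo();
        d.open("a");
        d.open("b");
        d.close();
        d.open("c");
        d.close();
        d.close();
        return d.out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render());
    }
}
```

The two children are printed with the same prefix, one "_" longer than their parent's, because each close() undoes exactly one open().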
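The program still works. But why did the parser fire so many "characters()" events?
It looks like the parser didn't group the space characters, line feeds, and carriage
returns into a single char[] and fire one "characters()" event. It fired multiple
events, one per character. The SAX API allows this: a parser may deliver contiguous
character data in a single chunk or split it across several "characters()" calls.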
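One way to cope with multiple "characters()" events is to collect the chunks in a buffer and report them only when the next element event arrives. The following is a minimal sketch of that idea, not part of SAXBrowser.java: the class name CoalescingHandler, the helper methods flush() and parse(), and the sample XML string are assumptions made for illustration. It relies on the default, non-namespace-aware parser, so the element name is taken from qName.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class CoalescingHandler extends DefaultHandler {
    // Text chunks received since the last element event.
    private final StringBuilder text = new StringBuilder();
    // Collected report lines: "e <name>" for elements, "c <text>" for text.
    final List<String> lines = new ArrayList<>();

    // Report the buffered text as one line, then empty the buffer.
    private void flush() {
        if (text.length() > 0) {
            lines.add("c " + text);
            text.setLength(0);
        }
    }

    @Override
    public void startElement(String ns, String sName, String qName,
            Attributes attrs) {
        flush();
        lines.add("e " + qName);
    }

    @Override
    public void endElement(String ns, String sName, String qName) {
        flush();
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        // Only append here; the parser may call this several times in a row.
        text.append(buf, offset, len);
    }

    // Hypothetical helper: parse an XML string and return the report lines.
    static List<String> parse(String xml) {
        try {
            CoalescingHandler h = new CoalescingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), h);
            return h.lines;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : parse("<a>one two<b>three</b></a>")) {
            System.out.println(line);
        }
    }
}
```

With this approach the number of reported text lines no longer depends on how the parser happens to split the character data, because the text is only emitted at element boundaries.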