xml.sax for Parsing XML Document

This section provides a tutorial example on how to parse an existing XML document using SAX event handlers from the Python xml.sax package.

The xml.sax sub-package parses an XML document as a stream and reports its elements, attributes, and text to event handler functions that you supply.
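To make the event handler idea concrete, here is a minimal sketch that counts how often each element name occurs. The names TagCounter and count_tags are hypothetical and used only for illustration. It relies on xml.sax.parseString(), so no input file is needed.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Hypothetical handler that counts how often each element name occurs."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per opening tag; attrs behaves like a read-only mapping.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(xml_text):
    """Parse an XML string and return a dict of element name -> count."""
    handler = TagCounter()
    # parseString() creates a parser, attaches the handler and feeds the data.
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.counts


if __name__ == "__main__":
    print(count_tags("<a><b/><b x='1'/></a>"))
```

The handler never sees the whole document at once. It only reacts to one event at a time, which is why any state it needs (here, the counts dictionary) has to live on the handler object.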

Here is an example Python script that parses an XML document with a custom SAX event handler class using the xml.sax sub-package:

#- xml_sax_Parser.py
#- Copyright (c) 2018 HerongYang.com. All Rights Reserved.
#
import xml.sax

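#- Custom SAX event handler class
class myHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()   # let ContentHandler set up its own state
        self.pad = "_"       # indentation prefix, one "_" per nesting level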

    def startDocument(self):
        print("startDocument() called")
    def endDocument(self):
        print("endDocument() called")

    def startElement(self, name, attrs):
        #- Collect the attributes as "key=value, " pairs
        att_str = ""
        for (key, value) in attrs.items():
            att_str += key+"="+value+", "
        print("e"+self.pad+name+": "+att_str)
        self.pad += "_"

    def endElement(self, name):
        self.pad = self.pad[:-1]

    def characters(self, content):
        #- Skip whitespace-only text and print the stripped text
        txt_str = content.strip()
        if len(txt_str) > 0:
            print("c"+self.pad+txt_str)

#- Create a parser, attach the handler, and turn off namespace processing
sax = xml.sax.make_parser()
sax.setContentHandler(myHandler())
sax.setFeature(xml.sax.handler.feature_namespaces, False)

sax.parse("dictionary.xml")

Use minidom_Build_XML.py, the program from an earlier tutorial, to create an XML file. Then parse it with xml_sax_Parser.py:

herong> python minidom_Build_XML.py > dictionary.xml 

herong> python xml_sax_Parser.py

startDocument() called
e_dictionary: 
e__word: 
e___update: date=2100-01-01, 
e___name: is_acronym=true, 
c____DTD
e___definition: 
c____Document Type Definition
endDocument() called
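If minidom_Build_XML.py is not at hand, the same trace can be reproduced from an inline document. The sketch below makes two assumptions: the XML text in SAMPLE_XML is inferred from the output above, not copied from the earlier tutorial, and the names TraceHandler and trace are hypothetical. It uses the same indentation idea as myHandler, but collects the lines in a list so they can be inspected.

```python
import xml.sax

# Assumed stand-in for dictionary.xml, inferred from the output shown above.
SAMPLE_XML = """<?xml version="1.0"?>
<dictionary>
  <word>
    <update date="2100-01-01"/>
    <name is_acronym="true">DTD</name>
    <definition>Document Type Definition</definition>
  </word>
</dictionary>
"""


class TraceHandler(xml.sax.ContentHandler):
    """Same indentation idea as myHandler, but collects lines in a list."""

    def __init__(self):
        super().__init__()
        self.depth = 1     # number of "_" to print, one per nesting level
        self.lines = []

    def startElement(self, name, attrs):
        att_str = "".join(f"{key}={value}, " for key, value in attrs.items())
        self.lines.append("e" + "_" * self.depth + name + ": " + att_str)
        self.depth += 1

    def endElement(self, name):
        self.depth -= 1

    def characters(self, content):
        text = content.strip()
        if text:  # ignore whitespace-only text between elements
            self.lines.append("c" + "_" * self.depth + text)


def trace(xml_text):
    """Parse an XML string and return the list of trace lines."""
    handler = TraceHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.lines


if __name__ == "__main__":
    for line in trace(SAMPLE_XML):
        print(line)
```

Run as a script, this should print the same "e" and "c" lines as xml_sax_Parser.py, without the startDocument() and endDocument() messages.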

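Notes on the Python script and output:

The handler keeps a padding string, self.pad, that grows by one "_" in startElement() and shrinks by one in endElement(), so the number of underscores on each output line shows how deeply the element or text is nested. Lines that start with "e" come from startElement() and list the element's attributes. Lines that start with "c" come from characters(). The parser may also call characters() for whitespace-only text between elements, and the script ignores those calls by checking the stripped text. Because the feature_namespaces feature is turned off, the parser calls startElement() and endElement() rather than their namespace-aware counterparts, startElementNS() and endElementNS().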
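One point worth keeping in mind about characters(): the SAX interface allows a parser to deliver the text of a single element in several calls, for example around an entity reference such as &amp;. A handler that needs the complete text can buffer the chunks and join them when the element ends. The sketch below shows one way to do that. The names TextCollector and collect_text are hypothetical, and the approach assumes elements that contain either text or child elements, not both.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that joins text chunks per element."""

    def __init__(self):
        super().__init__()
        self.buffer = []   # text chunks seen since the last tag
        self.texts = []    # (element name, full text) pairs

    def startElement(self, name, attrs):
        # A new element starts: discard text that belonged to the parent.
        self.buffer = []

    def characters(self, content):
        # May be called more than once for one run of text.
        self.buffer.append(content)

    def endElement(self, name):
        text = "".join(self.buffer).strip()
        if text:
            self.texts.append((name, text))
        self.buffer = []


def collect_text(xml_text):
    """Parse an XML string and return (element name, text) pairs."""
    handler = TextCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.texts


if __name__ == "__main__":
    print(collect_text("<d><n>A &amp; B</n><e/></d>"))
```

Joining the chunks in endElement() gives the same result whether the parser delivers the text in one call or in many.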
