XML Tutorials - Herong's Tutorial Examples - v5.25, by Herong Yang
End-of-Line Characters Supported in XML 1.1
This section provides a tutorial example showing that two more end-of-line characters, #x85 and #x2028, are supported in XML 1.1.
Next, let's prove that end-of-line characters are handled differently in XML 1.1 than in XML 1.0. Since end-of-line characters are hard to show in a text file, I wrote this test program, EndOfLineXml.java, which generates the XML file itself and then parses it:
/* EndOfLineXml.java
 * Copyright (c) 2002-2018 HerongYang.com. All Rights Reserved.
 */
import java.io.*;
import java.math.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
class EndOfLineXml {
   public static void main(String[] args) {
      try {
         String ver = args[0];
         char u000D = 0x000D;
         char u000A = 0x000A;
         char u0085 = 0x0085;
         char u2028 = 0x2028;
         String xmlString = "<?xml version=\""+ver+"\" encoding=\"UTF-16BE\"?><ul>"
            +"<li>1111"+u000D+u000A+"1111</li>"
            +"<li>2222"+u000D+u0085+"2222</li>"
            +"<li>3333"+u0085+"03333</li>"
            +"<li>4444"+u2028+"04444</li>"
            +"<li>5555"+u000D+"05555</li>"
            +"<li>6666"+u000A+"06666</li>"
            +"</ul>";

         File xmlFile = new File(args[1]);
         FileOutputStream fos = new FileOutputStream(xmlFile);
         OutputStreamWriter osw = new OutputStreamWriter(fos,"UTF-16BE");
         osw.write(xmlString);
         osw.close();

         DocumentBuilderFactory fct = DocumentBuilderFactory.newInstance();
         DocumentBuilder bld = fct.newDocumentBuilder();
         Document doc = bld.parse(xmlFile);
         dumpNode(doc, "");
      } catch (Exception e) {
         System.out.println(e.toString());
      }
   }
   static void dumpNode(Node n, String p) throws Exception {
      NodeList l = n.getChildNodes();
      NamedNodeMap m = n.getAttributes();
      int ml = -1;
      if (m!=null) ml = m.getLength();
      System.out.println(p+n.getNodeName()+": "+n.getNodeType()+", "
         +l.getLength()+", "+ml+", "+str10ToHex(n.getNodeValue()));
      for (int i=0; i<ml; i++) {
         Node c = m.item(i);
         dumpNode(c,p+" |-");
      }
      for (int i=0; i<l.getLength(); i++) {
         Node c = l.item(i);
         dumpNode(c,p+" ");
      }
   }
   static String str10ToHex(String str) throws Exception {
      if (str!=null) {
         return String.format("%040X",
            new BigInteger(1,str.getBytes("UTF-16BE")));
      } else {
         return "NULL";
      }
   }
}
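Some notes on EndOfLineXml.java:
The first command line argument is the XML version to put in the XML declaration. The second is the name of the XML file to generate.
The XML document is written in UTF-16BE, so every character, including #x85 and #x2028, takes exactly 2 bytes in the file.
Each "li" element contains a different end-of-line sequence, padded where needed with a "0" so that the text is 10 characters long before parsing.
dumpNode() prints the node name, followed by the node type, the number of child nodes, the number of attributes (-1 if the node cannot have attributes), and the node value in hex.
str10ToHex() formats the node value as at least 40 hex digits, or 10 UTF-16 characters. A value shortened to 9 characters by the parser shows up with "0000" at the front.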
Let's try XML 1.0 first with JDK 10 and 1.8:
herong> java EndOfLineXml 1.0 end-of-line-1-0.xml

#document: 9, 1, -1, NULL
 ul: 1, 6, 0, NULL
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 00000031003100310031000A0031003100310031
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 0032003200320032000A00850032003200320032
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 0033003300330033008500300033003300330033
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 0034003400340034202800300034003400340034
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 0035003500350035000A00300035003500350035
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 0036003600360036000A00300036003600360036
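The output confirms that XML 1.0 handles end-of-line characters as follows:
The 2-character sequence "#xD #xA" is replaced with a single #xA.
A standalone #xD is replaced with #xA.
#xA is kept unchanged.
#x85 and #x2028 are not treated as end-of-line characters and are kept unchanged. In "#xD #x85", only the #xD is replaced with #xA.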
Here is the output of XML 1.1:
herong> java EndOfLineXml 1.1 end-of-line-1-1.xml

#document: 9, 1, -1, NULL
 ul: 1, 6, 0, NULL
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 00000031003100310031000A0031003100310031
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 00000032003200320032000A0032003200320032
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 0033003300330033000A00300033003300330033
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 0034003400340034000A00300034003400340034
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 0035003500350035000A00300035003500350035
  li: 1, 1, 0, NULL
   #text: 3, 0, -1, 0036003600360036000A00300036003600360036
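The output confirms that XML 1.1 handles end-of-line characters as follows:
The 2-character sequences "#xD #xA" and "#xD #x85" are each replaced with a single #xA.
A standalone #xD, #x85, or #x2028 is replaced with #xA.
#xA is kept unchanged.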
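To make the difference between the two versions easier to see, here is a rough sketch of the replacement rules as a plain Java method. This is not how the JDK parser is implemented; EolNormalizer, normalize() and toHex() are hypothetical names I made up for illustration, and the rules are my reading of the end-of-line handling sections of the XML 1.0 and XML 1.1 specifications.

```java
/* EolNormalizer.java
 * Illustration only: a hypothetical helper, not part of any XML API.
 */
class EolNormalizer {

   // Applies the XML 1.0 rules if xml11 is false, the XML 1.1 rules if true.
   static String normalize(String in, boolean xml11) {
      StringBuilder out = new StringBuilder(in.length());
      for (int i = 0; i < in.length(); i++) {
         char c = in.charAt(i);
         if (c == '\r') {
            // A carriage return always becomes a line feed. A line feed
            // right after it (or #x85 in XML 1.1) is swallowed as well.
            if (i + 1 < in.length()) {
               char next = in.charAt(i + 1);
               if (next == '\n' || (xml11 && next == '\u0085')) i++;
            }
            out.append('\n');
         } else if (xml11 && (c == '\u0085' || c == '\u2028')) {
            // Only XML 1.1 treats #x85 and #x2028 as end-of-line characters.
            out.append('\n');
         } else {
            out.append(c);
         }
      }
      return out.toString();
   }

   // Prints each UTF-16 code unit as 4 hex digits, without padding.
   static String toHex(String s) {
      StringBuilder sb = new StringBuilder();
      for (char c : s.toCharArray()) {
         sb.append(String.format("%04X", (int) c));
      }
      return sb.toString();
   }

   public static void main(String[] args) {
      String sample = "2222\r\u00852222";   // same text as the second "li"
      System.out.println("1.0: " + toHex(normalize(sample, false)));
      System.out.println("1.1: " + toHex(normalize(sample, true)));
   }
}
```

Running it on the text of the second "li" element gives a 10-character result that still contains 0085 under the XML 1.0 rules, and a 9-character result with a single 000A under the XML 1.1 rules, which agrees with the parser output shown earlier.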
Note that because end-of-line characters are replaced automatically, you may run into trouble if you really want to keep them as part of a node value.
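One common way around this, as far as I understand the specification, is to write the character as a character reference such as &#xD;. The replacement is applied to the raw input before character references are expanded, so the referenced character should survive. Here is a minimal sketch to check this with the JDK parser; KeepCarriageReturn and parseText() are hypothetical names used only for this example:

```java
/* KeepCarriageReturn.java
 * Illustration only: compares a literal carriage return with &#xD;.
 */
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class KeepCarriageReturn {

   // Parses a small XML string and returns the text of its root element.
   static String parseText(String xml) throws Exception {
      DocumentBuilderFactory fct = DocumentBuilderFactory.newInstance();
      DocumentBuilder bld = fct.newDocumentBuilder();
      Document doc = bld.parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement().getTextContent();
   }

   public static void main(String[] args) throws Exception {
      // A literal carriage return is replaced with a line feed by the parser.
      String literal = parseText("<li>5555\r05555</li>");
      // The same character written as a character reference is kept.
      String escaped = parseText("<li>5555&#xD;05555</li>");

      System.out.println("literal has CR: " + (literal.indexOf('\r') >= 0));
      System.out.println("escaped has CR: " + (escaped.indexOf('\r') >= 0));
   }
}
```

The test strings have no XML declaration, so they are parsed as XML 1.0. The first line should report false and the second true.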