1

I need to output XML from python in a very minimalist way:

  • I can't use any external libraries beyond what's already in Python 2.6.5
  • I need to output XML tags, and text contents, with no attributes

At this point I'm using print statements to explicitly print out angle-bracket tags, and the only thing really stopping me is escaping text within a tag, which I do not know how to do.

Any suggestions?


update: Is there anything like Java's StAX XMLStreamWriter for Python? I may have a large XML document to produce, and I don't need (or want) to hold the entire document in memory.

update #2: I also need to escape random unicode or non-ASCII characters in the text besides <, > and &.

1
  • Why do you think that you need to escape "random" unicode or non-ASCII characters? Can't you just encode it in UTF-8? Commented Dec 22, 2010 at 21:03

4 Answers 4

4

Well, it looks like SAX isn't that hard to use after all. Here's an example.

xmltest.py:

import xml.sax.xmlreader
import xml.sax.saxutils

def testJunk(file, e2content):
  attr0 = xml.sax.xmlreader.AttributesImpl({})
  x =  xml.sax.saxutils.XMLGenerator(file)
  x.startDocument()
  x.startElement("document", attr0)

  x.startElement("element1", attr0)
  x.characters("bingo")
  x.endElement("element1")

  x.startElement("element2", attr0)
  x.characters(e2content)
  x.endElement("element2")

  x.endElement("document")
  x.endDocument()

tested:

>>> import xmltest
>>> xmltest.testJunk(open("test.xml","w"), "wham < 3!")

produces:

<?xml version="1.0" encoding="iso-8859-1"?>
<document><element1>bingo</element1><element2>wham &lt; 3!</element2></document>
Sign up to request clarification or add additional context in comments.

2 Comments

You need to use x = xml.sax.saxutils.XMLGenerator(file, "UTF-8") and you need to feed it unicode objects (or str objects that contain only ASCII) in a considered non-random fashion. If you have text in some legacy encoding e.g. cp1252, then decode it before stuffing it into x.characters()
Also for portability you should use "wb" in your file open() call.
2

If task is as simple, minidom may suffice. Here goes short example:

from xml.dom.minidom import Document

# create xml document
document = Document()

# create root element
root = document.createElement("root")
document.appendChild(root)

# create child element
child = document.createElement("child")
child.setAttribute("tag", "test")
root.appendChild(child)

# insert some text
atext = document.createTextNode("Foo bar")
child.appendChild(atext)

# print created xml
print(document.toprettyxml(indent="    "))

1 Comment

+1 for the quick example. I can't use it for my particular application, as I have a potentially large XML file to produce + don't need to hold/manipulate a DOM-style XML object in memory.
2

ElementTree comes with Python 2.6:

from xml.etree import ElementTree as ET
root = ET.Element('root')
sub = ET.SubElement(root,'sub')
sub.text = 'Hello & Goodbye'
tree = ET.ElementTree(root)
tree.write('out.xml')
# OR
ET.dump(root)

Output

<root><sub>Hello &amp; Goodbye</sub></root>

Comments

1

xml.sax.saxutils.escape(data[, entities]).

1 Comment

That only escapes <, >, & plus specific entities. If I have any random unicode or ASCII characters, I need to escape them too.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.