Parsing CDATA in xml with python

Question

I need to parse an XML file with a number of blocks of CDATA that I need to retain for later plotting:

<process id="process1">
 <log name="name1" device="device1"><![CDATA[timestamp value]]></log>
 <log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]></log>
</process>

I will need to do this repeatedly and quickly, so I am looking for the best way to do it. I've read that ElementTree is one of the faster options, but I am open to other suggestions.

xtree is another alternative for your problem, better than ElementTree.
– Rajendra, Commented Dec 4, 2012 at 4:17

Joe · Accepted Answer · 2013-01-21 03:22:55Z

16

Here are two examples of how to do it:

from lxml import etree  # third-party package (pip install lxml)
import xml.etree.ElementTree as ElementTree  # standard library

CONTENT = """
<process id="process1">
 <log name="name1" device="device1"><![CDATA[timestamp value]]></log>
 <log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]></log>
</process>
"""

def parse_with_lxml():
    # lxml exposes CDATA content as ordinary element text
    root = etree.fromstring(CONTENT)
    for log in root.xpath("//log"):
        print(log.text)

def parse_with_stdlib():
    # the standard-library parser does the same
    root = ElementTree.fromstring(CONTENT)
    for log in root.iter('log'):
        print(log.text)

if __name__ == '__main__':
    parse_with_lxml()
    parse_with_stdlib()

Output:

timestamp value
timestamp value, timestamp value, timestamp
timestamp value
timestamp value, timestamp value, timestamp

The text attribute returns the CDATA content in both cases.
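Since the question mentions keeping the data for later plotting, here is a minimal sketch of one way to turn each log's text into number pairs. It rests on assumptions the question does not confirm: entries are comma-separated, each entry is a whitespace-separated "timestamp value" pair, and both fields are numeric. The function name collect_series and the SAMPLE data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the real timestamp/value format is not given
# in the question, so plain numbers are assumed here.
SAMPLE = """
<process id="process1">
 <log name="name1" device="device1"><![CDATA[100 1.5]]></log>
 <log name="name2" device="device2"><![CDATA[100 2.0, 200 2.5, 300 3.0]]></log>
</process>
"""

def collect_series(xml_text):
    """Map each log's name attribute to a list of (timestamp, value) floats."""
    root = ET.fromstring(xml_text)
    series = {}
    for log in root.iter("log"):
        points = []
        # log.text holds the CDATA content; it is None for an empty element
        for entry in (log.text or "").split(","):
            fields = entry.split()
            if len(fields) == 2:  # skip entries that are not a complete pair
                points.append((float(fields[0]), float(fields[1])))
        series[log.get("name")] = points
    return series
```

The resulting dictionary can be kept in memory and handed to whatever plotting code is used later. If the real timestamps are not plain numbers, the float() conversion would need to be replaced with a suitable parser.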

jfs Over a year ago

For performance, cElementTree could be used (note the leading c).
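A sketch of how that suggestion might look in practice. Note that xml.etree.cElementTree was deprecated in Python 3.3, where xml.etree.ElementTree began using the C accelerator automatically when it is available, and the module was removed in Python 3.9. The fallback import below is therefore only useful for code that must also run on older interpreters. The helper name log_texts is hypothetical.

```python
try:
    # C implementation on Python 2 and early Python 3 releases
    import xml.etree.cElementTree as ET
except ImportError:
    # Python 3.9+ no longer ships cElementTree; the regular module
    # already uses the C accelerator when it is available
    import xml.etree.ElementTree as ET

CONTENT = """
<process id="process1">
 <log name="name1" device="device1"><![CDATA[timestamp value]]></log>
 <log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]></log>
</process>
"""

def log_texts(xml_text):
    """Return the CDATA text of every log element, in document order."""
    root = ET.fromstring(xml_text)
    return [log.text for log in root.iter("log")]
```

Either import exposes the same API, so the rest of the parsing code does not change.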
