Trying to parse XML from string into Python

Question

First, the string:

'<?xml version="1.0" encoding="UTF-8"?><metalink version="3.0" xmlns="http://www.metalinker.org/" xmlns:lcgdm="LCGDM:" generator="lcgdm-dav" pubdate="Fri, 11 Oct 2013 12:46:10 GMT"><files><file name="/lhcb/L"><size>173272912</size><resources><url type="https">https://test-kit.test.de:2880/pnfs/test.file</url><url type="https">https://test.grid.sara.nl:2882/pnfs/test.file</url></resources></file></files></metalink>'

What I want to extract is the text of each url element. The following code works, but it is fragile because the element positions are hard-coded:

import xml.etree.ElementTree as ET

root = ET.fromstring(xml_string)
# Hard-coded path: metalink -> files -> file -> resources
for entry in root[0][0][1].iter():
    print(entry.text)

This only works as long as the XML structure stays exactly the same. I tried using XPath and searching by tag name instead, but I never got any results.

Is it a problem with the format of the xml string or am I doing something wrong?

Anand S Kumar · Accepted Answer · 2015-06-30 15:19:42Z

3

You can use XPath (through the findall method of an Element) to get the URLs, but since the root element declares the default namespace xmlns="http://www.metalinker.org/", every child element belongs to that namespace, and you need to include it in the XPath as well.

Example:

>>> from xml.etree.ElementTree import fromstring
>>> root = fromstring(xml_string)
>>> urls = root.findall('.//{http://www.metalinker.org/}url')
>>> for url in urls:
...     print(url.text)
...
https://test-kit.test.de:2880/pnfs/test.file
https://test.grid.sara.nl:2882/pnfs/test.file

The XPath above finds every url element in the document, at any depth.
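If writing the full namespace URI inside every path feels noisy, findall also accepts a prefix-to-URI mapping as its second argument. The following is a minimal sketch of that variant, not part of the original answer. The prefix "ml" is an arbitrary choice made here (only the URI has to match the document), and the sample string is a shortened copy of the one in the question. It also shows that an unqualified path returns nothing, which is probably what the question ran into.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's document (XML declaration and a few
# attributes left out for brevity).
xml_string = (
    '<metalink version="3.0" xmlns="http://www.metalinker.org/">'
    '<files><file name="/lhcb/L"><size>173272912</size><resources>'
    '<url type="https">https://test-kit.test.de:2880/pnfs/test.file</url>'
    '<url type="https">https://test.grid.sara.nl:2882/pnfs/test.file</url>'
    '</resources></file></files></metalink>'
)

# "ml" is a prefix made up for this example; it does not need to appear
# in the XML itself.
NS = {"ml": "http://www.metalinker.org/"}

root = ET.fromstring(xml_string)

# Namespace-qualified search using the prefix map.
urls = [el.text for el in root.findall(".//ml:url", NS)]

# The same search without a namespace matches no elements.
unqualified = root.findall(".//url")

print(urls)
print(len(unqualified))
```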


2 Comments

Tupteq Over a year ago

Damn you, you were 30 seconds faster :)

Philipp Over a year ago

Thanks to both of you!

Tupteq · Accepted Answer · 2015-06-30 15:20:49Z

3

Your document declares a namespace, so you need to include it in the XPath:

for entry in root.findall('.//{http://www.metalinker.org/}url'):
    print(entry.text)
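When the namespace URI is not known in advance, one option is to compare only the local part of each tag. The sketch below is an illustration rather than part of the original answer. The helper names local_name and url_texts are hypothetical, and the sample string is a shortened copy of the one in the question. On Python 3.8 and later, the path './/{*}url' should give the same result through findall.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag without its '{namespace}' prefix, if it has one."""
    return tag.rpartition("}")[2]


def url_texts(xml_text):
    """Collect the text of every element whose local name is 'url'."""
    root = ET.fromstring(xml_text)
    return [
        el.text
        for el in root.iter()
        # Skip nodes whose tag is not a string (e.g. comments, if kept).
        if isinstance(el.tag, str) and local_name(el.tag) == "url"
    ]


# Shortened copy of the question's document (assumption: declaration and
# some attributes omitted for brevity).
sample = (
    '<metalink version="3.0" xmlns="http://www.metalinker.org/">'
    '<files><file name="/lhcb/L"><resources>'
    '<url type="https">https://test-kit.test.de:2880/pnfs/test.file</url>'
    '<url type="https">https://test.grid.sara.nl:2882/pnfs/test.file</url>'
    '</resources></file></files></metalink>'
)

print(url_texts(sample))
```

The trade-off is that this matches url elements from any namespace, so it is less strict than the fully qualified path.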

