I have a defaultdict(list), i.e. a data structure of the following format:

1:[1,2,3,4,5]
2:[2,3,4]

I want to generate the following XML:

<html>
<page>
<src>1</src>
<links>
   <link>1</link>
   <link>2</link>
    ...
    <link>5</link>
</links>
</page>

<page>
<src>2</src>
<links>
   <link>2</link>
   <link>3</link>
    <link>4</link>
</links>
</page>
</html>

I then want to write the indented XML to a file.

  • Are the XML tags fixed? This looks more like HTML. Commented Apr 7, 2014 at 19:13
  • Check out this answer: http://stackoverflow.com/a/4470210/25097. lxml.builder.E is super-easy to use for this kind of thing. Commented Apr 7, 2014 at 19:15
  • @unixer: yepp they are fixed Commented Apr 7, 2014 at 19:15
  • Speaking as someone who does not know how to use Python's XML libraries, your question seems straightforward to implement with two nested for loops, if you are reasonably sure the scope of this functionality won't grow. Commented Apr 7, 2014 at 19:18
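
The nested-loop idea from the last comment can be sketched with the standard library alone. This is a minimal sketch, assuming the tag names are fixed as confirmed above; the helper name `build_xml` and the two-space indent are illustrative choices, not something from the question.

```python
from collections import defaultdict
from xml.sax.saxutils import escape


def build_xml(data, indent="  "):
    """Build the <html>/<page>/<src>/<links>/<link> document as a string.

    build_xml is a hypothetical helper; only the tag names come from the question.
    """
    lines = ["<html>"]
    for src, links in data.items():  # outer loop: one <page> per key
        lines.append(f"{indent}<page>")
        lines.append(f"{indent * 2}<src>{escape(str(src))}</src>")
        lines.append(f"{indent * 2}<links>")
        for link in links:  # inner loop: one <link> per value
            lines.append(f"{indent * 3}<link>{escape(str(link))}</link>")
        lines.append(f"{indent * 2}</links>")
        lines.append(f"{indent}</page>")
    lines.append("</html>")
    return "\n".join(lines)


# Same data as in the question
data = defaultdict(list)
data[1].extend([1, 2, 3, 4, 5])
data[2].extend([2, 3, 4])

xml_text = build_xml(data)
print(xml_text)
```

Building strings by hand is only reasonable while the structure stays this small; `escape` guards against `<`, `>` and `&` in the values, but a real XML library is the safer choice if the format is likely to grow.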

2 Answers

You can use BeautifulSoup:

from bs4 import Tag


d = {1: [1, 2, 3, 4, 5], 2: [2, 3, 4]}

root = Tag(name='html')
for key, values in d.items():  # Python 3: items() replaces iteritems()
    # One <page> per key, with the key stored in <src>
    page = Tag(name='page')
    src = Tag(name='src')
    src.string = str(key)
    page.append(src)

    # One <link> per value, grouped under <links>
    links = Tag(name='links')
    for value in values:
        link = Tag(name='link')
        link.string = str(value)
        links.append(link)

    page.append(links)
    root.append(page)

print(root.prettify())

prints:

<html>
 <page>
  <src>
   1
  </src>
  <links>
   <link>
    1
   </link>
   <link>
    2
   </link>
   <link>
    3
   </link>
   <link>
    4
   </link>
   <link>
    5
   </link>
  </links>
 </page>
 <page>
  <src>
   2
  </src>
  <links>
   <link>
    2
   </link>
   <link>
    3
   </link>
   <link>
    4
   </link>
  </links>
 </page>
</html>

2 Comments

Is there any way to indent the output so that an element like <src>1</src> stays on one line?
@Fraz it's just a beautifulsoup prettifier, you can customize the output.
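
For the one-line <src>1</src> layout asked about in the comment above, one option is to swap BeautifulSoup for the standard library's xml.etree.ElementTree. This is a rough sketch, assuming Python 3.9 or later (for `ET.indent`); the file name `pages.xml` is a placeholder, not something from the question.

```python
import xml.etree.ElementTree as ET

d = {1: [1, 2, 3, 4, 5], 2: [2, 3, 4]}

root = ET.Element("html")
for key, values in d.items():
    page = ET.SubElement(root, "page")
    ET.SubElement(page, "src").text = str(key)
    links = ET.SubElement(page, "links")
    for value in values:
        ET.SubElement(links, "link").text = str(value)

# ET.indent (Python 3.9+) inserts whitespace only between elements,
# so text-only elements such as <src>1</src> stay on a single line.
ET.indent(root, space="  ")

pretty = ET.tostring(root, encoding="unicode")
print(pretty)

# "pages.xml" is a placeholder file name.
ET.ElementTree(root).write("pages.xml", encoding="utf-8")
```

This also covers the "write to file" part of the question, since `ElementTree.write` serializes the already indented tree.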

You can also define a jinja2 template and render it:

from jinja2 import Template


data = {1: [1, 2, 3, 4, 5], 2: [2, 3, 4]}

# Template with one <page> per key and one <link> per value
html = """<html>
    {% for key, values in data.items() %}
        <page>
        <src>{{ key }}</src>
        <links>
            {% for value in values %}
               <link>{{ value }}</link>
            {% endfor %}
        </links>
        </page>
    {% endfor %}
</html>"""

template = Template(html)
print(template.render(data=data))

prints:

<html>
        <page>
        <src>1</src>
        <links>
               <link>1</link>
               <link>2</link>
               <link>3</link>
               <link>4</link>
               <link>5</link>
        </links>
        </page>

        <page>
        <src>2</src>
        <links>
               <link>2</link>
               <link>3</link>
               <link>4</link>
        </links>
        </page>
</html>
