<html> 
<table border="1px"> 
<tr>
<td>yes</td>
<td>no</td>
</tr>
</table>
</html>

Is there any way to get the contents of the table (yes, no) without using BeautifulSoup?

I'm a Python beginner, so any help or direction would be much appreciated.

Thank you

  • Yes, there is. Should you do it without a parser? Probably not. Commented Jul 14, 2011 at 8:22
  • Okay, how do I parse it? Are there any tutorial sites you would suggest? Googling it didn't give useful results. Commented Jul 14, 2011 at 8:30
  • If the structure of your markup is relatively stable and you can guarantee it's well-formed, you can try using regexes (for example, one for enumerating table rows and another for getting the cells within a row). Commented Jul 14, 2011 at 8:31
  • @PHP: The reason people like BeautifulSoup is that it is very flexible in the HTML it accepts, which is useful since a lot of what you find on the internet is broken. Things like lxml and HTMLParser are rather stricter about which mistakes they tolerate. Commented Jul 14, 2011 at 8:35
  • @Xion: I will check out regexes. @katrielalex: I have been using BeautifulSoup. Commented Jul 14, 2011 at 8:51
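
The two-regex idea from the comments (one pattern for rows, one for the cells within a row) could look roughly like the sketch below. It assumes the markup is well-formed, has no nested tables, and always closes its tr and td tags. The names ROW_RE, CELL_RE and extract_rows are made up for illustration. For messy real-world HTML a proper parser is the safer choice.

```python
import re

html = """
<html>
<table border="1px">
<tr>
<td>yes</td>
<td>no</td>
</tr>
</table>
</html>
"""

# One pattern for rows, one for the cells inside a row.
# DOTALL lets "." span newlines; the non-greedy ".*?" stops at the
# first matching closing tag. Both names are hypothetical helpers.
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.DOTALL | re.IGNORECASE)
CELL_RE = re.compile(r"<td\b[^>]*>(.*?)</td>", re.DOTALL | re.IGNORECASE)


def extract_rows(markup):
    """Return a list of rows, each a list of stripped cell strings."""
    return [
        [cell.strip() for cell in CELL_RE.findall(row)]
        for row in ROW_RE.findall(markup)
    ]


print(extract_rows(html))  # [['yes', 'no']]
```

Note that any tags nested inside a cell would be returned as raw text by this approach.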

1 Answer


You can use the HTMLParser class from the html.parser module, which comes with the Python standard library. (In Python 2 the module itself was named HTMLParser.)

>>> from html.parser import HTMLParser
>>> data = '''
... <html> 
... <table border="1px"> 
... <tr>
... <td>yes</td>
... <td>no</td>
... </tr>
... </table>
... </html>
... '''
>>> class TableParser(HTMLParser):
...     def __init__(self):
...         super().__init__()
...         self.in_td = False  # True while inside a <td> element
...     
...     def handle_starttag(self, tag, attrs):
...         if tag == 'td':
...             self.in_td = True
...     
...     def handle_data(self, data):
...         # Print text only when it appears inside a <td>
...         if self.in_td:
...             print(data)
...     
...     def handle_endtag(self, tag):
...         # Leave cell mode only when the <td> itself closes
...         if tag == 'td':
...             self.in_td = False
... 
>>> p = TableParser()
>>> p.feed(data)
yes
no
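
The parser above prints each cell as it is found. If you would rather get the values back as a list, one possible variation is sketched below. It is written for Python 3, where the module is html.parser. The names TableCollector and table_rows are made up for this example, and it assumes there are no nested tables.

```python
from html.parser import HTMLParser

data = '''
<html>
<table border="1px">
<tr>
<td>yes</td>
<td>no</td>
</tr>
</table>
</html>
'''


class TableCollector(HTMLParser):
    """Hypothetical helper: collects <td> text into a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []     # one inner list per <tr>
        self._cell = None  # text pieces while inside a <td>, else None

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self.rows.append([])
        elif tag == 'td':
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == 'td' and self._cell is not None:
            if not self.rows:
                self.rows.append([])  # tolerate a <td> with no enclosing <tr>
            self.rows[-1].append(''.join(self._cell).strip())
            self._cell = None


def table_rows(markup):
    """Parse markup and return the collected rows."""
    parser = TableCollector()
    parser.feed(markup)
    parser.close()
    return parser.rows


print(table_rows(data))  # [['yes', 'no']]
```

Keeping the text pieces in a list until the closing td means a cell that contains inline tags (for example bold text) still comes back as a single string.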