With the regex <title>(.+?)</title> my code works, but when I change the tag to <table>(.+?)</table> it prints '[]' (an empty list) as output. My code is:

import urllib
import re

urls = ["http://physics.iitd.ac.in/content/list-faculty-members", "http://www.iitkgp.ac.in/commdir3/list.php?division=3&deptcode=ME","http://www.iitkgp.ac.in/commdir3/list.php?division=3&deptcode=CE"]
i = 0
regex = '<table>(.+?)</table>'
pattern = re.compile(regex)

while i< len(urls):
    htmlfile = urllib.urlopen(urls[i])
    htmltext = htmlfile.read()
    tables  = re.findall(pattern,htmltext)

    print tables
    i+=1

1 Answer

Use BeautifulSoup:

import urllib

# find_all() belongs to BeautifulSoup 4, so import from the bs4 package
from bs4 import BeautifulSoup

urls = ["http://physics.iitd.ac.in/content/list-faculty-members",
        "http://www.iitkgp.ac.in/commdir3/list.php?division=3&deptcode=ME",
        "http://www.iitkgp.ac.in/commdir3/list.php?division=3&deptcode=CE"]

for url in urls:
    htmlfile = urllib.urlopen(url)
    htmltext = htmlfile.read()
    # Parse the page with the parser that ships with Python
    soup = BeautifulSoup(htmltext, "html.parser")
    # Collect every <table> element, whatever its attributes or line breaks
    tables = soup.find_all('table')

    print tables
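
As for why the regex in the question prints [], two things are likely at play: a literal <table> does not match a tag that carries attributes, and "." does not match newlines unless re.DOTALL is set. The sketch below illustrates both on a made-up HTML fragment (the real pages' markup is an assumption here and was not checked). It is written for Python 3, unlike the Python 2 code above.

```python
import re

# Hypothetical fragment standing in for a real page: the <table> tag has an
# attribute and its contents span several lines.
html = ('<title>Faculty</title>\n'
        '<table class="list">\n'
        '<tr><td>Name One</td></tr>\n'
        '</table>')

# The title sits on one line inside a bare tag, so this pattern matches.
title = re.findall(r'<title>(.+?)</title>', html)

# A literal "<table>" never matches '<table class="list">'.
strict = re.findall(r'<table>(.+?)</table>', html)

# Allowing attributes is still not enough: "." stops at newlines by default.
with_attrs = re.findall(r'<table[^>]*>(.+?)</table>', html)

# re.DOTALL lets "." cross newlines, so the multi-line body is captured.
loose = re.findall(r'<table[^>]*>(.+?)</table>', html, re.DOTALL)

print(title)       # ['Faculty']
print(strict)      # []
print(with_attrs)  # []
print(loose)       # ['\n<tr><td>Name One</td></tr>\n']
```

Even the last pattern breaks on nested tables, which is one reason an HTML parser such as BeautifulSoup is the safer choice.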
5 Comments

Thank you, it is working perfectly. :) One more question: what if I want the data downloaded from the table to be displayed in a table structure as well? Is that possible?
Can you elaborate on what you need?
Yes. The link above contains the professors' data in table form. Running this code gives me all their details, but the output cannot be understood by someone who doesn't know the code. So I would like to know whether it can be printed in the same table format on cmd.
I have already posted this as another question, with the code I have tried and the code you helped with. May I send you the link to that question? @bernie
Sure you may. I will answer that one.
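
For the follow-up raised in these comments (printing the scraped data in a table layout on the command line), here is a minimal sketch, not the answer given in the linked question. The helper names table_to_rows and format_rows are hypothetical. table_to_rows assumes it receives a BeautifulSoup 4 table tag and uses only the find_all and get_text methods; format_rows is plain Python 3 and works on any list of rows.

```python
def table_to_rows(table):
    """Collect the text of every cell, row by row, from a bs4 <table> tag."""
    rows = []
    for tr in table.find_all('tr'):
        cells = tr.find_all(['td', 'th'])
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


def format_rows(rows):
    """Render rows as left-aligned, pipe-separated columns of plain text."""
    rows = [row for row in rows if row]  # skip rows without cells
    if not rows:
        return ''
    ncols = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of columns.
    padded = [row + [''] * (ncols - len(row)) for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in padded) for i in range(ncols)]
    lines = [' | '.join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
             for row in padded]
    return '\n'.join(lines)


# Made-up sample rows, standing in for the output of table_to_rows().
sample = [['Name', 'Room'], ['Name One', '101'], ['Name Two']]
print(format_rows(sample))
```

With the tables list from the answer, print(format_rows(table_to_rows(tables[0]))) would print the first table. Note that find_all('tr') also descends into nested tables, so pages that nest tables may need extra filtering.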
