2

I'm working on a python program which would scrape data (public data) from web pages. The problem is when I want to get source code of a web page which is accessable using button and it's based on ASP.NET. I can't just parse a href from the page as usual.

So my question is: Is there a simple way how to get the source code of the ASP.NET page?

To explain it clearly I'm attaching one web page based on ASP.NET: In this case I want to get soure code of page which is displayed when I click on "Radiátor topení (1)" in the middle of the page. You can see the parent page where is the button on which I want to simulate click here!

I was trying to check source code of this (parent) page and look for some url near the "Radiátor topení (1)" text but I've found only this:

<td class="CatalogCell"><a onclick=" return PathClick(&#39;3761801;176564;356239;922141;922488;922507;922508&#39;)"><H2 class="CatalogH">Radiátor topení (1)</H2></a></td> and I'm afraid, this wont help me.

I'm looking for a simpliest way because I'm not expert in ASP.NET nor Javascript. Thanks for advices!

1 Answer 1

1

The program is in python, which gives the html source of the link.

import urllib2
from bs4 import BeautifulSoup

link="http://www.example.com"
hdr = {'User-Agent': 'Mozilla/5.0'}
req = urllib2.Request(link,headers=hdr)

page = urllib2.urlopen(link)
soup = BeautifulSoup(page,'html.parser')

print soup
Sign up to request clarification or add additional context in comments.

2 Comments

for a in soup.find_all('a', href=True): print "Found the URL:", a['href']
Thanks, but as I wrote, there is no href in the source code of the parent page... so I can't just look on the source code and find a path to page (in this case it is "Radiátor topení (1)")

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.