How do I scrape data generated with javascript using BeautifulSoup?

Question

I'm trying to migrate some comments from a blog using web scraping with Python and BeautifulSoup. The content I'm looking for isn't in the HTML itself and seems to be generated by a script (which I can't find). I've seen some answers about this, but most of them are specific to a particular problem and I can't figure out how to apply them to my site. I'm just trying to scrape comments from pages like this one:

http://www.themasterpiececards.com/famous-paintings-reviewed/bid/92327/famous-paintings-duccio-s-maesta

I've also tried Selenium, but I'm using a Cloud9-based IDE currently and it doesn't seem to support web drivers.

I apologize if I botched any of the lingo, I'm pretty new to programming. If anyone has any tips, that would be helpful. Thanks!

Using Selenium is your best bet. I don't know about your IDE, but I recommend switching to PyCharm or something else where the drivers are supported.
– Keyur Potdar, Commented Jan 23, 2018 at 4:47

Gaur93 · Accepted Answer · 2018-01-23 11:04:15Z

1

There are several ways to scrape such content. One is to find out how the comments are loaded on this website. A quick look in the Chromium developer tools shows that the comments for the page mentioned are loaded via this API call.

This may not be a suitable approach for you, as you may not be able to generate this URL for every page.
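If the request URL does turn out to follow a pattern, the approach could look roughly like the sketch below. Everything site-specific here is an assumption: `COMMENTS_API` and the `postId` parameter are hypothetical placeholders for whatever the Network tab actually shows, and treating the number after `/bid/` in the page URL as the post identifier is a guess that needs checking against the real request.

```python
import json
import re
from urllib.parse import urlencode, urlparse
from urllib.request import urlopen

# Hypothetical endpoint: replace it with the request URL seen in the
# browser's Network tab while the comments load.
COMMENTS_API = "https://example.com/api/comments"


def post_id_from_url(page_url):
    """Return the number that follows '/bid/' in the page path, or None."""
    match = re.search(r"/bid/(\d+)/", urlparse(page_url).path)
    return match.group(1) if match else None


def comments_api_url(page_url):
    """Build the (assumed) comments API URL for one blog page."""
    post_id = post_id_from_url(page_url)
    if post_id is None:
        raise ValueError(f"no post ID found in {page_url!r}")
    # 'postId' is a made-up parameter name used for illustration only.
    return f"{COMMENTS_API}?{urlencode({'postId': post_id})}"


def fetch_comments(page_url):
    """Call the API and decode its JSON body (assumes the API returns JSON)."""
    with urlopen(comments_api_url(page_url), timeout=10) as response:
        return json.load(response)
```

If the API returns HTML fragments instead of JSON, the response body can be passed to BeautifulSoup as usual.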

Another, more reliable way is to render the JavaScript content with a headless browser. For ease of implementation, I would suggest using Scrapy with Splash. Splash is a JavaScript rendering service with an HTTP API: it loads the page, runs its scripts, and returns the rendered HTML for your requests.
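As a rough illustration of the rendering route without a full Scrapy project, Splash's `render.html` endpoint can be called directly over HTTP. This sketch assumes a Splash instance is already running on `localhost:8050` (its default port); the helper names and the two-second `wait` are illustrative choices, not values from the site in question.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a Splash instance is reachable here (8050 is its default port).
SPLASH_BASE = "http://localhost:8050"


def splash_render_url(page_url, wait=2.0, splash_base=SPLASH_BASE):
    """Build a render.html URL asking Splash to load page_url and wait
    `wait` seconds for scripts to run before returning the HTML."""
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash_base}/render.html?{query}"


def fetch_rendered_html(page_url, wait=2.0):
    """Return the page's HTML after Splash has executed its JavaScript."""
    with urlopen(splash_render_url(page_url, wait), timeout=60) as response:
        return response.read().decode("utf-8", errors="replace")
```

The string returned by `fetch_rendered_html` can then be handed to `BeautifulSoup(html, "html.parser")` and searched like any static page. If the comments still do not appear, a longer `wait` may be needed.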

