I have Python code that can parse data from a string variable containing HTML code.
I want code that gets the HTML from a URL and then parses this data.
The working code (parsing HTML):
from bs4 import BeautifulSoup
data = '''\
<html>
<head>
<meta name="generator"
content="HTML Tidy for HTML5 (experimental) for Windows https://github.com/w3c/tidy-html5/tree/c63cc39" />
<title></title>
</head>
<body>
<div class="Eqh F6l Jea k1A zI7 iyn Hsu">
<div class="Shl zI7 iyn Hsu">
<a data-test-id="search-guide" href="" title="Search for &quot;living room colors&quot;">
<div class="Jea Lfz XiG fZz gjz qDf zI7 iyn Hsu" style="white-space: nowrap; background-color: rgb(162, 152, 139);">
<div class="tBJ dyH iFc MF7 erh tg7 IZT mWe">Living</div>
</div>
</a>
</div>
</div>
</body>
</html>
'''
soup = BeautifulSoup(data, 'html.parser')
a = soup.select('div.Eqh.F6l.Jea.k1A.zI7.iyn.Hsu a')[0]
print(a['title'])
Here is what I have tried that does not work (getting HTML from URL and then parsing):
import requests
from bs4 import BeautifulSoup
vgm_url = 'https://www.pinterest.com/search/pins/?q=skin%20care'
html_text = requests.get(vgm_url).text
soup = BeautifulSoup(html_text, 'html.parser')
a = soup.select('div.Eqh.F6l.Jea.k1A.zI7.iyn.Hsu a')
for a in soup.select('div.Eqh.F6l.Jea.k1A.zI7.iyn.Hsu a'):
    print(a['title'])
I'm not getting any error, but it does not print anything. I appreciate your help.
Does html_text have the text that you want? That is, does it contain the content you want instead of, say, a login page?
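One way to check this is to count how many tags in the fetched text carry the classes the selector relies on. The sketch below is only an illustration: the names `ClassCounter` and `count_tags_with_classes` are hypothetical, and it uses the standard library's `html.parser` so it runs on any string without fetching anything.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains every wanted class."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = set((dict(attrs).get("class") or "").split())
        if self.wanted <= classes:
            self.count += 1


def count_tags_with_classes(html_text, wanted):
    """Return the number of tags in html_text that have all classes in wanted."""
    counter = ClassCounter(wanted)
    counter.feed(html_text)
    return counter.count


# Hypothetical sample standing in for the downloaded page text
sample = '<div class="Eqh F6l Jea k1A zI7 iyn Hsu"><a title="x">y</a></div>'
print(count_tags_with_classes(sample, ["Eqh", "F6l", "Jea"]))
```

Calling `count_tags_with_classes(html_text, ["Eqh", "F6l", "Jea", "k1A", "zI7", "iyn", "Hsu"])` on the text returned by `requests.get(vgm_url).text` would show whether the target `div` is present at all. If the count is zero, the markup is probably not in the server response (for example, because the page builds it with JavaScript or returns a login page), and the selector has nothing to match.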