I'm currently scraping a real estate website that uses JavaScript. My process starts by scraping a results page for the href links of the individual listings, appending those links to a list, and then clicking the next button. I repeat this until the next button is no longer clickable.
My problem is that after collecting all the listings (~13000 links), the scraper doesn't move on to the second part, where it should open each link and get the info I need. Selenium never even opens the first link in the list.
Here's my code:
import bs4 as bs
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# driver and houselinklist are set up earlier in the script
wait = WebDriverWait(driver, 10)
while True:
    try:
        # wait for the 'next' button before scraping the current page
        element = wait.until(EC.element_to_be_clickable((By.LINK_TEXT, 'next')))
        html = driver.page_source
        soup = bs.BeautifulSoup(html, 'html.parser')
        table = soup.find(id='search_main_div')
        classtitle = table.find_all('p', class_='title')
        for aaa in classtitle:
            hrefsyo = aaa.find('a', href=True)
            linkstoclick = hrefsyo.get('href')
            houselinklist.append(linkstoclick)
        element.click()
    except:
        pass
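What I expect is that once the next button is gone, the wait raises a TimeoutException and the loop ends, so the script can move on. A minimal, Selenium-free sketch of that intended control flow (FakePager and NoNextButton are stand-ins I made up here for the real driver/wait and for selenium's TimeoutException; note the `break` where my code has `pass`):

```python
class NoNextButton(Exception):
    """Stand-in for selenium's TimeoutException."""

class FakePager:
    """Serves a fixed number of pages, then raises, the way
    wait.until() raises once the 'next' button disappears."""
    def __init__(self, pages):
        self.pages = pages          # list of lists of listing links
        self.current = 0

    def wait_for_next(self):
        if self.current >= len(self.pages):
            raise NoNextButton      # no more pages: like a wait timeout
        return self.pages[self.current]

    def click_next(self):
        self.current += 1

def collect_links(pager):
    houselinklist = []
    while True:
        try:
            links = pager.wait_for_next()   # raises on the last page
            houselinklist.extend(links)
            pager.click_next()
        except NoNextButton:
            break                           # exit the loop instead of `pass`
    return houselinklist

print(collect_links(FakePager([['a1', 'a2'], ['b1']])))
# ['a1', 'a2', 'b1']
```

With `break` the loop terminates after the last page; with a bare `except: pass` it would spin forever.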
After this I have another simple scraper that goes through the list of links, opens each one in Selenium, and collects data on that listing:
for links in houselinklist:
    print(links)
    newwebpage = links
    driver.get(newwebpage)
    html = driver.page_source
    soup = bs.BeautifulSoup(html, 'html.parser')
    # ... more code here