How to extract location link using selenium webdriver python

Question

I am using the code snippet below to extract the "Locations" link with Selenium WebDriver in Python, but I can only extract the text ("Locations"), not the link itself. Can anyone help me with this?

Link to extract from: https://www.thomasnet.com/company/siemens-corporation-10035100/profile?cov=NA&which=comp&what=Siemens+Corporation&cid=10035100&searchpos=1

Code Snippet used:

lnk_content = driver.find_element(By.XPATH,"//*[@id='__next']/div/div[2]/div/div[1]/div/div/button/span")
lnk = lnk_content.get_attribute("href")
print(lnk)

There is no href attribute on the targeted span element, so there is nothing to extract.
– Shawn, Commented Dec 5, 2023 at 11:40
But there is a link in the "Locations" button. Is there a way to extract that?
– Goutam, Commented Dec 5, 2023 at 12:48
Clicking that link works because there is a click event listener attached to the <button> element that contains the Locations link. There is no link to extract in the HTML itself. This is probably designed explicitly to prevent web scraping. Using Selenium, you could perhaps click the link and then read the URL of the page that gets loaded.
– larsks, Commented Dec 5, 2023 at 13:16

Shawn · Accepted Answer · 2023-12-05 13:59:14Z

I agree with larsks's comment. The code below clicks the Locations element and reads the URL of the page that gets loaded.

Code:

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://www.thomasnet.com/company/siemens-corporation-10035100/profile?cov=NA&which=comp&what=Siemens+Corporation&cid=10035100&searchpos=1")
driver.maximize_window()
wait = WebDriverWait(driver, 10)

wait.until(EC.element_to_be_clickable((By.XPATH, "(//span[text()='Locations'])[2]"))).click()
time.sleep(5)
location_url = driver.current_url
print(location_url)

Console:

https://www.thomasnet.com/company/siemens-corporation-10035100/branches?pg=1

Process finished with exit code 0

larsks · Accepted Answer · 2023-12-05 13:51:27Z

Clicking on that link makes new content visible in the existing document. You can click on the link with code like this:

lnk = driver.find_element(By.XPATH,'//button[span[text() = "Locations"]]')

# See https://stackoverflow.com/a/56194349/147356
driver.execute_script("arguments[0].click();", lnk)

Then you can retrieve the location table:

locations = driver.find_element(By.XPATH, "//div[h2[text() = 'Locations']]/following-sibling::div/table")

And iterate over the rows:

for row in locations.find_elements(By.TAG_NAME, 'tr'):
  ...
