I am using Python and Selenium to scrape a website. I go to the homepage and search for a keyword, such as 1300746-79-5. On the resulting page, I am trying to scrape the data in the "pricing" section. Specifically, I need the "SKU-Pack Size" and "Price(USD)" information. But this information appears to be JavaScript-encrypted, so I cannot see it in the source code. How can I retrieve it?

I have written some code that gets me to the page of interest, but I still cannot see the JavaScript-generated information. Here is what I have so far:

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pprint

# Create a new instance of the Chrome driver (raw string keeps the backslashes literal)
driver = webdriver.Chrome(r'C:\Users\Rei\Desktop\chromedriver.exe')
driver.get("http://www.sigmaaldrich.com/united-states.html")

print driver.title
inputElement = driver.find_element_by_name("Query")

# type in the search
inputElement.send_keys("1300746-79-5")
inputElement.submit()

1 Answer


Everything you have done looks correct to me.

The "SKU-Pack Size" and "Price(USD)" values are not "encrypted"; they are loaded by JavaScript only after a click. All you need to do is click the product name or the pricing link, then wait for the pricing table to appear.

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pprint

driver = webdriver.Chrome()
driver.get("http://www.sigmaaldrich.com/united-states.html")

print driver.title
inputElement = driver.find_element_by_name("Query")

# type in the search
inputElement.send_keys("1300746-79-5")
inputElement.submit()

pricing_link = driver.find_element_by_css_selector("li.priceValue a")
print pricing_link.text
pricing_link.click()

# then deal with the data you want
price_table = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, ".priceAvailContainer tbody"))
)
print 'price_table.text: ' + price_table.text

driver.quit()
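
Once the table element has been located, its rows still need to be split into cells. The following is only a sketch of one way to do that with the standard library, written for Python 3. It assumes the table markup has been read into a string before `driver.quit()`, for example with `price_table.get_attribute("outerHTML")`. The names `TableRowParser`, `extract_rows` and `SAMPLE_HTML` are made up for illustration, and the sample markup is a placeholder, not the real page's structure or data.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (indentation between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table rows found in html as lists of cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical markup standing in for price_table.get_attribute("outerHTML");
# the real page's columns and values will differ.
SAMPLE_HTML = """
<tbody>
  <tr><td>EXAMPLE-1MG</td><td>in stock</td><td>10.00</td></tr>
  <tr><td>EXAMPLE-5MG</td><td>in stock</td><td>40.00</td></tr>
</tbody>
"""

for row in extract_rows(SAMPLE_HTML):
    print(row)
```

Which columns hold the SKU and the price has to be checked against the live table. BeautifulSoup would do the same job with less code if it is installed.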

6 Comments

Can you explain what I should do after the JavaScript click? I still do not see the SKU and pricing information in the source code... I am thinking of using BeautifulSoup to extract the specific elements, but my code does not return the product URL, so I am clueless.
@user3788728: Hmm. Sounds like something is wrong. Can you see it in the UI (with your eyes)? What are your Selenium, Chrome, ChromeDriver versions?
My Chrome version is 35.0.1916.153 and ChromeDriver is 2.9. I downloaded Selenium this morning, so it should be the most recent version.
I can see it if I highlight the section and select view element. But if I just go into the page source, it is not shown.
@user3788728: Everything works fine here; I am not experiencing any issues getting the data. I will post the complete code above. The page source might have some limitations regarding AJAX calls. Is there any particular reason you care about the page source in the first place?
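
A note on the page source discussion in these comments: the browser's "view page source" shows the HTML as it was first downloaded, while the element inspector shows the live DOM, including nodes that scripts added later. If a string snapshot of the live DOM is wanted (for example, to hand to a parser), one possible approach is to ask the browser for it through `execute_script`. This is only a sketch. `live_dom_html` is a made-up helper name, and `FakeDriver` is a stand-in so the snippet runs without a browser; with Selenium, the real `driver` object would be passed instead.

```python
def live_dom_html(driver):
    """Return the DOM as it currently stands, serialized to a string."""
    return driver.execute_script("return document.documentElement.outerHTML;")


class FakeDriver(object):
    """Stand-in for a real WebDriver so the sketch runs without a browser."""

    def __init__(self, html):
        self._html = html

    def execute_script(self, script):
        # A real driver would run the script in the page and return its result.
        return self._html


# Hypothetical rendered markup; the class name is taken from the answer's selector.
fake = FakeDriver("<html><body><div class='priceAvailContainer'></div></body></html>")
html = live_dom_html(fake)
print("priceAvailContainer" in html)
```

Take the snapshot only after the `WebDriverWait` call has returned, otherwise the pricing rows may not be in the DOM yet.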