
I'm using Selenium with Python to scrape match results from a Battlefy page so that I can process them later and load them into a database. I need Selenium to get the team names and results because the page content is loaded dynamically by JavaScript, which requires a headless browser. I'm trying to get the text of each college team by class name, but Selenium's find_elements_by_class_name method doesn't seem to work.

Web page: https://battlefy.com/college-league-of-legends/2020-north-conference/5de98dd4196d1311d9e6edbd/stage/5e23b6e395e72856dac06997/bracket/1

Current code:

>>> from selenium import webdriver
>>> chrome_path = r"C:\Users\...\chromedriver.exe"
>>> driver = webdriver.Chrome(chrome_path)
>>> driver.get("https://battlefy.com/college-league-of-legends/2020-north-conference/5de98dd4196d1311d9e6edbd/stage/5e23b6e395e72856dac06997/bracket/1")
>>> teams = driver.find_elements_by_class_name("team-name overflow-ellipsis float-right")
>>> for item in teams:
...     print(item.text)

This prints nothing, and find_elements_by_class_name returns an empty list. I must be doing something wrong. How can I get the text of each team name when the element is identified by its class name?

  • Try team = driver.find_elements_by_class_name("team-name.overflow-ellipsis.float-right") Commented Feb 14, 2020 at 20:47
  • That did the trick. For future reference, when referencing by class name, will I always need to place dots in place of white space? @supputuri Commented Feb 14, 2020 at 20:56

2 Answers

team-name overflow-ellipsis float-right is a combination of three classes. When you use the find_elements_by_class_name or find_element_by_class_name method, the Selenium library converts the locator to a CSS selector internally. Hence you have to replace each space between the class names with a dot (.).

Try the following:

team = driver.find_elements_by_class_name("team-name.overflow-ellipsis.float-right")
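
As a rough illustration of the rule above, the value of a class attribute can be turned into a CSS selector with a small helper. The name class_attr_to_css is hypothetical and not part of Selenium; this is only a sketch of the string transformation.

```python
def class_attr_to_css(class_attr):
    """Build a CSS selector matching elements that carry every class in class_attr."""
    # str.split() with no argument splits on any run of whitespace.
    class_names = class_attr.split()
    if not class_names:
        raise ValueError("class attribute is empty")
    # Each class gets its own leading dot; with no spaces in between,
    # all classes must be present on the same element.
    return "".join("." + name for name in class_names)


selector = class_attr_to_css("team-name overflow-ellipsis float-right")
print(selector)
```

The resulting string already starts with a dot, so it is meant for a CSS selector lookup such as find_elements_by_css_selector(selector). In Selenium 4, where the find_elements_by_* helpers were removed, the equivalent call is driver.find_elements(By.CSS_SELECTOR, selector).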

Edit 1:

In the Selenium Python implementation, a class-name locator is converted to By.CSS_SELECTOR internally and the value is prepended with a dot (.). So we don't have to add a . before the first class name.
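
The conversion described in this edit can be sketched in a few lines. This is a simplified illustration, not Selenium's actual source; to_css_locator is a hypothetical name, and the two constants mirror the string values of By.CLASS_NAME and By.CSS_SELECTOR.

```python
# Simplified illustration only; not Selenium's actual source code.
CLASS_NAME = "class name"      # string value of By.CLASS_NAME
CSS_SELECTOR = "css selector"  # string value of By.CSS_SELECTOR


def to_css_locator(by, value):
    """Rewrite a class-name locator as a CSS selector by prepending a dot."""
    if by == CLASS_NAME:
        return CSS_SELECTOR, "." + value
    return by, value


# Dots between the class names: one element carrying all three classes.
good = to_css_locator(CLASS_NAME, "team-name.overflow-ellipsis.float-right")
# Spaces between the class names: only the first name becomes a class.
bad = to_css_locator(CLASS_NAME, "team-name overflow-ellipsis float-right")
print(good)
print(bad)
```

In the second case the selector is ".team-name overflow-ellipsis float-right". In CSS a space is the descendant combinator, so this looks for a float-right tag inside an overflow-ellipsis tag inside an element with class team-name. No such tags exist on the page, which is why the original call returned an empty list.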


3 Comments

I have added the explanation for reference; I hope it helps you understand the logic behind using the dot (.).
team-name.overflow-ellipsis.float-right as class_name is actually incorrect. It should have been .team-name.overflow-ellipsis.float-right as css_selector.
@DebanjanB Added the explanation to the answer.

To scrape the team names using Selenium and Python, you have to wait for the elements to become visible, using WebDriverWait together with visibility_of_all_elements_located(). You can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    driver.get("https://battlefy.com/college-league-of-legends/2020-north-conference/5de98dd4196d1311d9e6edbd/stage/5e23b6e395e72856dac06997/bracket/1")
    print([my_elem.text for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, ".team-name.overflow-ellipsis.float-right")))])
    
  • Using XPATH:

    driver.get("https://battlefy.com/college-league-of-legends/2020-north-conference/5de98dd4196d1311d9e6edbd/stage/5e23b6e395e72856dac06997/bracket/1")
    print([my_elem.text for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='team-name overflow-ellipsis float-right']")))])
    
  • Console Output:

    ['Cougars', 'University of Illinois at Urbana-Champaign', 'Maryville Esports', 'Michigan State University', 'Purdue University', 'Illinois Wesleyan Titans', 'UMN Varsity Gold', 'UC LoL A Team', 'Arbor Esports', 'CWRU 300 Spartans', 'Bethany Esports', 'BGC at OSU', 'University of Wisconsin', 'CGC UIC', 'Indiana University - Purdue University Indianapolis - High Tempo Gaming', 'Missouri State University', 'KSU Wildcats', 'University of Manitoba Bisons', 'Nebraska', 'S&T eSports', 'Illinois State University - Redbird Esports', 'WUSTL Bears', 'University of Iowa A Team', 'TSUES', 'Division 2+', 'Grizzlies', 'Principia College esports', 'Northwestern Varsity', 'Wright State University - Raiders', 'Milwaukee School of Engineering - Raiders', 'UPIKE Esports', 'UMDads', 'Jayhawk Esports', 'NKU Esports', 'Warriors', 'Spartans', 'ND Lol', 'SDSU Team Alpha', 'Rose-Hulman', 'SIUe eSports', 'UND', 'MTU GOLD', 'Polar Bears', 'Purdue Fort Wayne Esports', 'CSU LOL', 'Aquinas Esports', 'Shawnee State Bears', 'Lewis Flyers', 'NDSU League of Legends Club', 'South Dakota Mines - Hardrockers', 'GVSU Laker Legends', 'G&E Club @ Iowa State University', 'MVC Vikings', 'Match from North (Dukes)']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
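
To show why the wait matters on a page that renders its content with JavaScript, here is a minimal pure-Python sketch of the polling idea behind an explicit wait. It is a stand-in, not Selenium's implementation: wait_until and FakePage are hypothetical names, and FakePage simulates a page whose team names only appear on the third lookup.

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page that fills in its elements late."""

    def __init__(self):
        self.calls = 0

    def find_team_names(self):
        self.calls += 1
        # The first two lookups find nothing, like a lookup made before the JS has run.
        return ["Cougars", "Maryville Esports"] if self.calls >= 3 else []


page = FakePage()
names = wait_until(page.find_team_names, timeout=2.0, poll_interval=0.01)
print(names)
```

A single immediate lookup would have returned an empty list here, which is the same symptom as in the question. WebDriverWait(driver, 5).until(...) plays the role of wait_until in the answer's code, with the expected condition as the callable.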
    
