So I am trying to get some data from a website. And I'm having a hard time getting the data. I can get the player names but thats about it at this point. Been trying different things coming up short. Here is sample code that i'm trying to go through. Note that there are two tables (one for each team). And the class for each player alternates from "even" to "odd" or "odd" to "even" example html file below followed by my python script. I labeled which parts I want. I am also using python 2.7
`<table id="nbaGITeamStats" cellpadding="0" cellspacing="0">
<thead class="nbaGIClippers">
<tr>
<th colspan="17">Los Angeles Clippers (1-0)</th> <!-- I want team name -->
</tr>
</thead>
<tbody><tr colspan="17">
<td colspan="17" class="nbaGIBoxCat"><span>field goals</span><span>rebounds</span></td>
</tr>
<tr>
<td class="nbaGITeamHdrStatsNoBord" colspan="1"> </td>
<td class="nbaGITeamHdrStats">pos</td>
<td class="nbaGITeamHdrStats">min</td>
<td class="nbaGITeamHdrStats">fgm-a</td>
<td class="nbaGITeamHdrStats">3pm-a</td>
<td class="nbaGITeamHdrStats">ftm-a</td>
<td class="nbaGITeamHdrStats">+/-</td>
<td class="nbaGITeamHdrStats">off</td>
<td class="nbaGITeamHdrStats">def</td>
<td class="nbaGITeamHdrStats">tot</td>
<td class="nbaGITeamHdrStats">ast</td>
<td class="nbaGITeamHdrStats">pf</td>
<td class="nbaGITeamHdrStats">st</td>
<td class="nbaGITeamHdrStats">to</td>
<td class="nbaGITeamHdrStats">bs</td>
<td class="nbaGITeamHdrStats">ba</td>
<td class="nbaGITeamHdrStats">pts</td>
</tr>
<tr class="odd">
<td id="nbaGIBoxNme" class="b"><a href="/playerfile/paul_pierce/index.html">P. Pierce</a></td> <!-- I want player name -->
<td class="nbaGIPosition">F</td> <!-- I want position name -->
<td>14:16</td> <!-- I want this -->
<td>1-4</td> <!-- I want this -->
<td>1-2</td> <!-- I want this -->
<td>2-2</td> <!-- I want this -->
<td>+12</td> <!-- I want this -->
<td>1</td> <!-- I want this -->
<td>0</td> <!-- I want this -->
<td>1</td> <!-- I want this -->
<td>1</td> <!-- I want this -->
<td>3</td> <!-- I want this -->
<td>2</td> <!-- I want this -->
<td>0</td> <!-- I want this -->
<td>0</td> <!-- I want this -->
<td>0</td> <!-- I want this -->
<td>5</td> <!-- I want this -->
</tr>
<tr class="even">
<td id="nbaGIBoxNme" class="b"><a href="/playerfile/blake_griffin/index.html">B. Griffin</a></td> <!-- I want this -->
<td class="nbaGIPosition">F</td> <!-- I want this -->
<td>26:19</td> <!-- I want this -->
<td>5-14</td> <!-- I want this -->
<td>0-1</td> <!-- I want this -->
<td>1-1</td> <!-- I want this -->
<td>+14</td> <!-- I want this -->
<td>0</td> <!-- I want this -->
<td>5</td> <!-- I want this -->
<td>5</td> <!-- I want this -->
<td>2</td> <!-- I want this -->
<td>1</td> <!-- I want this -->
<td>1</td> <!-- I want this -->
<td>1</td> <!-- I want this -->
<td>1</td> <!-- I want this -->
<td>1</td> <!-- I want this -->
<td>11</td> <!-- I want this -->
</tr>
<tr class="odd">
<td id="nbaGIBoxNme" class="b"><a href="/playerfile/deandre_jordan/index.html">D. Jordan</a></td> <!-- I want this -->
<td class="nbaGIPosition">C</td> <!-- I want this -->
<td>26:27</td> <!-- I want this -->
<td>6-7</td> <!-- I want this -->
<td>0-0</td> <!-- I want this -->
<td>3-5</td> <!-- I want this -->
<td>+19</td> <!-- I want this -->
<td>1</td> <!-- I want this -->
<td>11</td> <!-- I want this -->
<td>12</td> <!-- I want this -->
<td>0</td> <!-- I want this -->
<td>1</td> <!-- I want this -->
<td>0</td> <!-- I want this -->
<td>2</td> <!-- I want this -->
<td>3</td> <!-- I want this -->
<td>0</td> <!-- I want this -->
<td>15</td> <!-- I want this -->
</tr>
<!-- And so on it will keep changing class from odd to even, even to odd -->
<!-- Also note there are to tables one for each team -->
<!--this is he table id>>> <table id="nbaGITeamStats" cellpadding="0" cellspacing="0"> -->`
This was long but i wanted to give an example of the classes switching up here is my python script I plan to use a dictionary to save the data once I actually scrape it successfully.
import urllib
import urllib2
from bs4 import BeautifulSoup
import re
gamesForDay = ['/games/20151002/DENLAC/gameinfo.html']
for game in gamesForDay:
url = "http://www.nba.com/"+game
page = urllib2.urlopen(url).read()
soup = BeautifulSoup(page)
for tr in soup.find_all('table id="nbaGITeamStats'):
tds = tr.find_all('td')
print tds