Using "Developer Tools" in Chrome, I found that your page loads the file
http://apps.cnbc.com/view.asp?country=US&uid=stocks/ownership&symbol=YHOO.O
which contains the expected data:

    from bs4 import BeautifulSoup
    import requests

    url = 'http://apps.cnbc.com/view.asp?country=US&uid=stocks/ownership&symbol=YHOO.O'
    response = requests.get(url).content
    soup = BeautifulSoup(response, 'lxml')

    # print the text of every table with class "shareholders dotsBelow"
    for table in soup.find_all('table', {'class': 'shareholders dotsBelow'}):
        print(table.text)

Result (the output has many empty lines because the HTML contains many "\n" characters):
Name
Shares Held
Position Value
Percentage ofTotal Holdings
since 2/3/16
% Ownedof SharesOutstanding
TurnoverRating
Filo (David)
70.7M
$2,351,860,831
+9%
7.5%
Low
The Vanguard ...
49.2M
$1,422,524,414
+6%
5.2%
Low
State Street ...
34.4M
$993,071,914
+5%
3.6%
Low
BlackRock ...
32.3M
$935,173,655
+4%
3.4%
Low
Fidelity ...
24.7M
$714,307,904
+3%
2.6%
Low
Goldman Sachs & ...
18.6M
$538,561,672
+2%
2.0%
Low
Mason Capital ...
16.4M
$472,832,995
+2%
1.7%
High
Capital Research ...
12.6M
$365,108,090
+2%
1.3%
Low
TIAA-CREF
10.9M
$315,255,311
+1%
1.2%
Low
T. Rowe Price ...
10.8M
$310,803,286
+1%
1.1%
Low
Name
Shares Held
Position Value
Percentage ofTotal Holdings
since 2/3/16
% Ownedof SharesOutstanding
InvestmentStyle
Vanguard Total ...
15.6M
$518,104,623
+2%
1.7%
Index
Vanguard 500 ...
10.6M
$352,795,106
+1%
1.1%
Index
Vanguard ...
9.4M
$312,902,098
+1%
1.0%
Index
SPDR S&P 500 ETF
8.8M
$292,985,112
+1%
0.9%
Index
PowerShares QQQ ...
7.6M
$252,776,000
+1%
0.8%
Index
Statens ...
6.7M
$338,173,390
+1%
0.7%
Core Value
First Trust DJ ...
5.6M
$186,778,215
+1%
0.6%
Index
Janus Twenty Fund
5.2M
$150,966,054
+1%
0.6%
Growth
CREF Stock Account
5.0M
$195,517,452
+1%
0.5%
Core Growth
Vanguard Growth ...
4.8M
$159,879,157
+1%
0.5%
Index
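
The empty lines mentioned above can be dropped before printing. This is only a minimal sketch: `non_empty_lines` is a hypothetical helper name, and `sample` is a made-up string that imitates what `table.text` can look like, not real data from the page.

```python
def non_empty_lines(text):
    """Return the stripped lines of text, skipping lines that are blank."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical sample imitating table.text with many "\n" characters
sample = "\n\nName\n\nShares Held\n\n\nFilo (David)\n70.7M\n\n"

for line in non_empty_lines(sample):
    print(line)
```

In the first script you could print `non_empty_lines(table.text)` instead of `table.text`. BeautifulSoup's `stripped_strings` generator is another way to get a similar result.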
EDIT: better version

    from bs4 import BeautifulSoup
    import requests

    url = 'http://apps.cnbc.com/view.asp?country=US&uid=stocks/ownership&symbol=YHOO.O'
    response = requests.get(url).content
    soup = BeautifulSoup(response, 'lxml')

    # go row by row through the body of the institutions table
    for tbody in soup.find_all('tbody', id="tBody_institutions"):
        trs = tbody.find_all('tr')
        for tr in trs:
            tds = tr.find_all('td')
            # first three columns: name, shares held, position value
            print(tds[0].text, tds[1].text, tds[2].text)

And the result:
Filo (David) 70.7M $2,351,860,831
The Vanguard ... 49.2M $1,422,524,414
State Street ... 34.4M $993,071,914
BlackRock ... 32.3M $935,173,655
Fidelity ... 24.7M $714,307,904
Goldman Sachs & ... 18.6M $538,561,672
Mason Capital ... 16.4M $472,832,995
Capital Research ... 12.6M $365,108,090
TIAA-CREF 10.9M $315,255,311
T. Rowe Price ... 10.8M $310,803,286

You can use Selenium, which controls a browser, and the browser can run JavaScript (Requests and BeautifulSoup don't run JavaScript). Or you have to analyze the files sent from the server to the browser, find the file that has the expected data, and get its URL. You can use Developer Tools in Chrome or Firebug in Firefox to analyze it manually.

You have all the tds in one list - try print(len(tds)) and you will see 60, so the last one is tds[59].text. First use find_all("tr"), then use a for loop to search for td in every tr - see the new code in my answer.
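
To illustrate the point about all the td cells ending up in one flat list: if you already have such a list, you can also cut it into rows yourself. This is a minimal sketch and an alternative to the tr/td loop, not the method used in the answer. It assumes every row has exactly 6 columns (60 cells for 10 rows); `chunk` is a hypothetical helper and `flat` holds placeholder strings instead of real cell text.

```python
def chunk(cells, size):
    """Split a flat list into consecutive sublists of `size` items."""
    return [cells[i:i + size] for i in range(0, len(cells), size)]


# Placeholder for the 60 cell texts found in one flat list
flat = [f"cell{i}" for i in range(60)]

rows = chunk(flat, 6)  # assumption: 6 columns per row
for row in rows:
    # first three columns of every row
    print(row[0], row[1], row[2])
```

Looping over every tr, as in the answer, is more robust because it does not depend on every row having the same number of cells.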