I am scraping a YouTube page and found some open-source code online. The code runs and returns correct results. However, as I studied the code line by line, I found that one attribute it uses does not appear in the page source. I searched for it in the page source and in the Inspect Element view, and I also copied and pasted the raw HTML into Word. I could not find it anywhere.

How did this happen?

Code below:

soup=BeautifulSoup(result.text,"lxml")

# cannot find yt-lockup-meta-info anywhere......
view_element=soup.find_all("ul",class_="yt-lockup-meta-info")

totalview=0

for objects in view_element:
    view_list=obj.findChildren()
    for element in view_list:
        if element.string and element.string.endswith("views"):
            videoviews=element.text.replace("views","").replace(",","")
            totalview=totalview+int(videoviews)
            print(videoviews)

print("----------------------")

print("Total_Views"+str(totalview))

The attribute I searched for is "yt-lockup-meta-info".

The page source is here.

The original page.

1 Answer

I see a few problems, which might be easier to clear up if I could see the full code. However, some things need to be fixed within this block.

For example, this line should read:

for obj in view_element:

instead of:

for objects in view_element:

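The loop body refers to "obj", so the loop variable has to be named "obj" as well. As written, the loop binds "objects" while "obj" is never defined, which raises a NameError. The singular name also fits better, because each pass handles one element of "view_element", not several.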

Also, there is no need to search for the word "views" when there is a class you can search directly.

Here is how I would address this problem. Hope this helps.

import requests
from bs4 import BeautifulSoup

# Go to the website and convert the page source to soup
response = requests.get('https://www.youtube.com/results?search_query=web+scraping+youtube')
soup = BeautifulSoup(response.text, 'lxml')

videos = soup.find_all('ytd-video-renderer')  # Find all videos
total_view_count = 0
for video in videos:
    video_meta = video.find('div', {'id': 'metadata'})  # The text under the video title
    # The view counter, e.g. "1.2K views"
    view_count_text = video_meta.find_all('span', {'class': 'ytd-video-meta-block'})[0].text.replace('views', '').strip()
    # Convert the view count to an integer
    if 'K' in view_count_text:
        video_view_count = int(float(view_count_text.split('K')[0])*1000)
    elif 'M' in view_count_text:
        video_view_count = int(float(view_count_text.split('M')[0])*1000000)
    elif 'B' in view_count_text:
        video_view_count = int(float(view_count_text.split('B')[0])*1000000000)
    else:
        video_view_count = int(view_count_text.replace(',', ''))
    print(video_view_count)
    total_view_count += video_view_count

print(total_view_count)
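
The suffix handling in the loop above could also be pulled into a small helper, which makes it easier to test without touching the network. This is only a sketch: `parse_view_count` is a hypothetical name, and the assumed input format ("1.2K views", "3,456 views") is based on the strings the loop above expects, not on anything YouTube guarantees.

```python
import re

# Multipliers for the abbreviated suffixes handled in the loop above.
SUFFIX_MULTIPLIERS = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

# A number such as "1.2" or "3,456", optionally followed by K, M or B.
VIEW_PATTERN = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*([KMB]\b)?")


def parse_view_count(text):
    """Return the view count found in `text` as an int, or None if there is none."""
    match = VIEW_PATTERN.search(text)
    if match is None:
        return None
    number, suffix = match.groups()
    value = float(number.replace(",", ""))
    multiplier = SUFFIX_MULTIPLIERS.get(suffix, 1)
    # round() guards against floating-point results such as 1199.9999
    return int(round(value * multiplier))


for sample in ["1.2K views", "3,456 views", "2.5M views", "No views"]:
    print(sample, "->", parse_view_count(sample))
```

Returning None for text without a number (for example "No views") lets the caller decide whether to skip the video or count it as zero.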

3 Comments

Thank you, Luke. But this does not address my problem. The code uses the attribute "yt-lockup-meta-info", but I cannot find it in the page source. How does this happen? Thanks.
Most likely, the website has updated its tags and classes since the code was written, so you would need to change the attribute name to the new one. This happens occasionally when web scraping, and it is why you should avoid relying on unique attribute IDs whenever possible.
Yes, I have run into this myself many times when scraping YouTube. The short answer is that different requests get different responses. For example, you can follow this tutorial (support.google.com/youtube/thread/17725319?hl=en) and revert to YouTube's old layout, where loading more comments on a video means clicking a "load-more" button rather than scrolling infinitely. As you can imagine, that layout also comes with different content in the response text when you examine it.
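
One way to check which layout your script actually received is to list the class names present in the HTML it downloaded, rather than in the browser's page source. The sketch below uses only the standard library. `ClassCollector` and `classes_in` are hypothetical names, and the two HTML strings are made-up stand-ins for the old and new layouts; in practice you would pass the text of your own response instead.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every class name that appears on any start tag."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.classes.update(value.split())


def classes_in(html):
    """Return the set of class names used in the given HTML string."""
    collector = ClassCollector()
    collector.feed(html)
    return collector.classes


# Made-up samples standing in for two different responses (assumptions).
old_layout = '<ul class="yt-lockup-meta-info"><li>1,234 views</li></ul>'
new_layout = '<div id="metadata"><span class="ytd-video-meta-block">1.2K views</span></div>'

print("yt-lockup-meta-info" in classes_in(old_layout))
print("yt-lockup-meta-info" in classes_in(new_layout))
```

If the class shows up in the text your script downloaded but not in what the browser shows, the two are being served different markup, which would explain why the selector still works in the code.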
