Trouble finding YouTube view count


I'm following a YouTube tutorial on scraping YouTube view counts and video dates: https://www.youtube.com/watch?v=Cc3mMH8XWC4

While following a previous tutorial, I built a dataframe with one row per video. It has the columns views, clean_views, video_url, video_age and title, and covers over 1000 videos.

This is the video I'm trying to parse: https://www.youtube.com/watch?v=2s6Oxh3JkG0. If I can successfully get its views and video_date, I plan to loop through my whole dataframe and update the views and video_date for every video.

YouTube has since changed how the view count is rendered, which makes it harder to parse than in the tutorial I'm watching.

from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By

from webdriver_manager.chrome import ChromeDriverManager
from bs4 import BeautifulSoup

- Selenium to scrape the pages.
- ChromeService and ChromeDriverManager to set up the webdriver.
- Imported Keys so the script can press the "End" key to go all the way down the page and load more videos.
- Imported By so I don't get a "name By is not found" error.

driver = webdriver.Chrome(service=ChromeService(ChromeDriverManager().install()))
for url in youtube_df['video_url']:
    driver.get(url)
    break

This tells the driver to go through all the URLs in my dataframe. Because of the break, it only loads the first one, which is this video: https://www.youtube.com/watch?v=2s6Oxh3JkG0

html = driver.page_source

This saves everything the driver knows about the page into a variable called html.

soup = BeautifulSoup(html, 'html.parser')

This uses Beautiful Soup to parse the html variable.

soup.find_all('div', {'id': 'view-count', 'class': 'style-scope ytd-watch-info-text'})

This is where I thought the view count was. The element's aria-label showed me the views, but I think there is something in the markup that sets aria-hidden on it.

What can I do to get the view count and the exact upload date of a video?

I tried looking it up on YouTube, but all the tutorials are too old.

Md. Fazlul Hoque On BEST ANSWER

The problem you're encountering suggests that the soup.find_all('div', {'id': 'view-count', 'class': 'style-scope ytd-watch-info-text'}) call is returning an empty list, meaning it's not finding any elements matching the given criteria. This is likely because the class name 'style-scope ytd-watch-info-text' isn't unique, so the data you need isn't inside your selection. You also don't need a for loop to pick out the correct HTML element. Try the element selectors from my code; they are working fine now.

Script:

import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from bs4 import BeautifulSoup


options = webdriver.ChromeOptions()
options.add_argument("start-maximized")

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
video_urls = ["https://www.youtube.com/watch?v=2s6Oxh3JkG0"]  # your URLs
for url in video_urls:
    driver.get(url)
    time.sleep(5)  # give the JavaScript-rendered page time to load

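# Parse the rendered page that is currently loaded in the browser
soup = BeautifulSoup(driver.page_source, 'lxml')
# Abbreviated count, e.g. "12K views"
view_count = soup.select_one('span.bold.style-scope.yt-formatted-string:first-child').get_text()
# Exact count, e.g. "12,855 views"
actual_view_count = soup.select_one('.view-count').get_text()
# Relative date, e.g. "1 day ago"
date = soup.select_one('span.bold.style-scope.yt-formatted-string:nth-child(3)').get_text()
# The exact upload timestamp is stored in a <meta itemprop="uploadDate"> tag
upload_date = ""
for tag in soup.find_all('meta'):
    if tag.get('itemprop') == 'uploadDate':
        upload_date = tag.get('content')
        break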

print('View count:', view_count)
print('Actual View count:', actual_view_count)
print('Date:', date)
print('Upload date:', upload_date)

Output:

View count: 12K views
Actual View count: 12,855 views
Date: 1 day ago
Upload date: 2024-02-17T09:00:02-08:00
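
The scraped values are still plain strings, so before writing them back into a dataframe you will probably want to convert them. The following is a minimal sketch using only the standard library. The helper names parse_view_count and parse_upload_date are hypothetical, and it assumes the exact count uses commas as thousands separators and the upload date is an ISO 8601 timestamp, as in the output shown here.

```python
import re
from datetime import datetime


def parse_view_count(text):
    """Return the integer in a string such as '12,855 views'.

    Returns None when the string contains no digits (for example 'No views').
    Assumes commas are used as thousands separators.
    """
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


def parse_upload_date(text):
    """Parse an ISO 8601 timestamp such as '2024-02-17T09:00:02-08:00'."""
    return datetime.fromisoformat(text)


views = parse_view_count("12,855 views")
uploaded = parse_upload_date("2024-02-17T09:00:02-08:00")
print(views)            # 12855
print(uploaded.date())  # 2024-02-17
```

Pass the exact count to parse_view_count, not the abbreviated one: a string like "12K views" would come back as 12, because the helper only reads the digits.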