Scraping the Rotten Tomatoes page for the movie 12 Years a Slave

I'm trying to scrape the number of audience ratings (which is 100,000+) from this page, https://www.rottentomatoes.com/m/12_years_a_slave, using Python and Selenium. I have tried all kinds of Selenium locators, but every time I get a NoSuchElementException. Here is my code:

import selenium
from selenium import webdriver

driver = webdriver.Chrome('path.exe')
url = 'https://www.rottentomatoes.com/m/12_years_a_slave'
driver.get(url)
    
def scrape_dom(element):
    # Ask the browser for the shadow root attached to the given host element
    shadow_root = driver.execute_script(
        'return arguments[0].shadowRoot', element)
    return shadow_root

host = driver.find_element_by_tag_name('score-board')
root_1 = scrape_dom(host)
views = root_1.find_element_by_link_text(
        '/m/12_years_a_slave/reviews?type=user&intcmp=rt-' + \
        'scorecard_audience-score-reviews')

I also tried XPath and CSS selectors, but I always get the same error. Can you tell me what's wrong with my code?

There are 3 best solutions below

2
AudioBubble On

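A simple CSS selector works here. The link carries a slot attribute, so it sits in the regular (light) DOM and can be found without going through the shadow root.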

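from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
url = 'https://www.rottentomatoes.com/m/12_years_a_slave'
driver.get(url)

# Selenium 4.3+ removed find_element_by_css_selector; use find_element(By.CSS_SELECTOR, ...)
print(driver.find_element(By.CSS_SELECTOR, 'a[slot=audience-count]').text)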

I get 100,000+ Ratings printed out to my console.
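
The printed value is a label, not a number. If the count itself is what you need, a small helper along these lines could pull the digits out. This is only a sketch: the name parse_rating_count is made up for illustration, and because the site shows "100,000+" the result is a lower bound rather than an exact figure.

```python
import re


def parse_rating_count(label):
    """Return the first integer found in a label such as '100,000+ Ratings'.

    Returns None when the label contains no digits.
    """
    # Match a run of digits that may contain thousands separators.
    match = re.search(r'\d[\d,]*', label)
    if match is None:
        return None
    # Drop the commas before converting to int.
    return int(match.group().replace(',', ''))


print(parse_rating_count('100,000+ Ratings'))  # 100000
```

The same helper can be applied to the .text value returned by the locator above.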

2
itronic1990 On

See if this XPath works:

from selenium.webdriver.common.by import By  # Selenium 4 locator API
driver.find_element(By.XPATH, ".//a[@data-qa='audience-rating-count']").text

1
QHarr On

You don't need Selenium; requests and bs4 are enough. You can also use a CSS class selector, which is faster than the attribute selectors given in the other answers so far.

import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://www.rottentomatoes.com/m/12_years_a_slave')
soup = bs(r.content, 'lxml')
print(soup.select_one('.scoreboard__link--audience').text)
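
One caveat with the snippet above: select_one returns None when nothing matches (for example, if the site renames the class), and calling .text on None raises AttributeError. A guarded version might look like the sketch below. The function name audience_count_text and the sample markup are assumptions made for illustration, not the page's real HTML, and the built-in 'html.parser' backend is used here instead of lxml only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup


def audience_count_text(html, selector='.scoreboard__link--audience'):
    """Return the stripped text of the first element matching selector.

    Returns None instead of raising when nothing matches.
    """
    soup = BeautifulSoup(html, 'html.parser')
    node = soup.select_one(selector)
    return node.get_text(strip=True) if node is not None else None


# Hypothetical markup for illustration only; the live page may differ.
sample = (
    '<a class="scoreboard__link scoreboard__link--audience" href="#">'
    ' 100,000+ Ratings </a>'
)

print(audience_count_text(sample))                       # 100,000+ Ratings
print(audience_count_text('<p>no scoreboard here</p>'))  # None
```

Against the live site you would pass the downloaded page instead of the sample, for example audience_count_text(r.text) with r from the requests.get call above.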