Methods to webscrape Prizepicks for all NBA props


I have been looking everywhere for help with any Python method to web scrape all NBA props from app.prizepicks.com. I narrowed it down to two potential methods: the API with pandas, and Selenium. I believe PrizePicks recently shut down their API to keep users from scraping the NBA props, so to my knowledge selenium-stealth is the only way left to scrape the PrizePicks NBA board. Can anyone please help me with, or provide, code that scrapes PrizePicks for all NBA props? The information I need is the player name, the prop type (such as points, rebounds, 3-Pt Made, free throws made, fantasy, pts+rebs, etc.), and the prop line (such as 34.5 or 8.5, which could belong to a prop type such as points or rebounds, respectively). I need this to run reasonably quickly and refresh every set number of minutes. I found something similar to what I want in another thread, in an answer by 'C. Peck', which I will provide (hopefully; I don't really know how to use Stack Overflow). But the code that C. Peck provided does not work on my device, and I was wondering if anyone here could write functional code or fix this code so it works for me. I have a MacBook Pro, so I don't know if that affects anything.

EDIT: After a lot of trial and error, and help from the thread, I have managed to complete the first step. I am able to scrape the "Points" tab of the NBA league on PrizePicks, but I want to scrape the info from every tab, not just points. I honestly don't know why my code isn't fully working; I want it to scrape points, rebounds, assists, fantasy, etc. Let me know any fixes I should make to be able to scrape every stat_element in the stat_container, or other methods too! I'll update the code below:

EDIT AGAIN: It seems like the problem lies in the "stat-container" and "stat-elements". I checked which elements "stat-elements" contains, and it is only points. I checked which elements "stat-container" contains, and it gave me an error. I believe that if someone helps me with that, the problem will be fixed. This is the error I get when I try to loop over the elements inside "stat-container":

line 27, in <module>
    for element in stat_container:
TypeError: 'WebElement' object is not iterable

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
import pandas as pd
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

PATH = r"C:\Program Files (x86)\chromedriver.exe"  # raw string so the backslashes are not treated as escapes
driver = webdriver.Chrome(PATH)
driver.get("https://app.prizepicks.com/")


driver.find_element(By.CLASS_NAME, "close").click()


time.sleep(2)

driver.find_element(By.XPATH, "//div[@class='name'][normalize-space()='NBA']").click()

time.sleep(2)

# Wait for the stat-container element to be present and visible
stat_container = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CLASS_NAME, "stat-container")))

# Find all stat elements within the stat-container
stat_elements = driver.find_elements(By.CSS_SELECTOR, "div.stat")

# Initialize empty list to store data
nbaPlayers = []

# Iterate over each stat element
for stat in stat_elements:
    # Click the stat element
    stat.click()

    projections = WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".projection")))

    for projection in projections:

        names = projection.find_element(By.XPATH, './/div[@class="name"]').text
        points= projection.find_element(By.XPATH, './/div[@class="presale-score"]').get_attribute('innerHTML')
        text = projection.find_element(By.XPATH, './/div[@class="text"]').text
        print(names, points, text)

        players = {
            'Name': names,
            'Prop':points, 'Line':text
            }

        nbaPlayers.append(players)
   

df = pd.DataFrame(nbaPlayers)
print(df)

driver.quit()
         
1 Answer

Answer updated with the working code. I made some small changes:

  • Replaced stat_elements with categories, a list containing the stat names.

  • Loop over categories and click the div whose text equals the current category name.

  • Added .replace('\n','') at the end of the text variable to strip line breaks.
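
To illustrate the first two changes in isolation: find_element returns a single WebElement, which is why looping over stat_container raised the TypeError in the question, so the tab labels are taken from the container's text instead. The sketch below runs without a browser. The names split_categories and category_xpath and the sample text are hypothetical, chosen only for illustration; they are not part of Selenium or PrizePicks.

```python
def split_categories(container_text):
    """Turn the stat-container's visible text into a list of tab labels."""
    # One label per line; skip blank lines and surrounding whitespace.
    return [label.strip() for label in container_text.split("\n") if label.strip()]


def category_xpath(label):
    """Build the XPath for the div whose text equals the given label."""
    return f"//div[text()='{label}']"


# Hypothetical text, standing in for
# driver.find_element(By.CSS_SELECTOR, ".stat-container").text
sample_text = "Points\nRebounds\nAssists\nPts+Rebs\n3-PT Made"

categories = split_categories(sample_text)
for label in categories:
    print(label, "->", category_xpath(label))
```

Note that a label containing an apostrophe would break this XPath string, so it may need escaping if such a tab ever appears.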


# Imports and driver setup are the same as in the question
driver.get("https://app.prizepicks.com/")

driver.find_element(By.CLASS_NAME, "close").click()
time.sleep(2)
driver.find_element(By.XPATH, "//div[@class='name'][normalize-space()='NBA']").click()
time.sleep(2)

# Wait for the stat-container element to be present and visible
stat_container = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CLASS_NAME, "stat-container")))

# Read the tab labels from the stat-container's text,
# i.e. categories is the list ['Points','Rebounds',...,'Turnovers']
categories = driver.find_element(By.CSS_SELECTOR, ".stat-container").text.split('\n')

# Initialize empty list to store data
nbaPlayers = []

# Iterate over each stat category
for category in categories:
    # Print a header for the current category
    line = '-'*len(category)
    print(line + '\n' + category + '\n' + line)
    # Click the tab whose text matches the category name
    driver.find_element(By.XPATH, f"//div[text()='{category}']").click()

    projections = WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".projection")))

    for projection in projections:

        names = projection.find_element(By.XPATH, './/div[@class="name"]').text
        points= projection.find_element(By.XPATH, './/div[@class="presale-score"]').get_attribute('innerHTML')
        text = projection.find_element(By.XPATH, './/div[@class="text"]').text.replace('\n','')
        print(names, points, text)

        players = {'Name': names, 'Prop':points, 'Line':text}

        nbaPlayers.append(players)

# Build the table; print it so it also shows when run as a script
df = pd.DataFrame(nbaPlayers)
print(df)
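
The question also asks for the board to refresh every few minutes, which the code above does not do. One possible approach is a simple polling loop, sketched here under the assumption that the scraping loop is wrapped in a function that returns the list of rows (the hypothetical scrape_once below). The names poll, on_rows, and fake_scrape are illustrative, and the row returned by fake_scrape is placeholder data, not real PrizePicks output.

```python
import time


def poll(scrape_once, on_rows, interval_seconds=300, max_runs=None, sleep=time.sleep):
    """Call scrape_once() repeatedly and hand each result to on_rows().

    max_runs=None keeps looping until the process is interrupted.
    Returns the number of scrape attempts made.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            on_rows(scrape_once())
        except Exception as exc:
            # A failed scrape should not stop the loop; try again next cycle.
            print(f"scrape failed, will retry next cycle: {exc!r}")
        runs += 1
        # Only wait if another run is coming.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs


def fake_scrape():
    """Stand-in for the real Selenium scrape; returns placeholder rows."""
    return [{"Name": "Player A", "Prop": "25.5", "Line": "Points"}]


snapshots = []
poll(fake_scrape, snapshots.append, interval_seconds=0, max_runs=2)
print(len(snapshots))
```

In real use, scrape_once would run the category loop and return nbaPlayers, and on_rows could build the DataFrame or write it to disk. Reusing one driver across runs instead of reopening the browser each time would likely keep each refresh faster.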