For this example specifically, I'm trying to automatically scrape the direct stream URL from here, which can be found in the video iframe (iframe["src"]).
The default server when first entering the webpage is "server40", or VidStream, but to get a working URL I need the src of the video iframe from Streamtape. Switching servers requires clicking on the desired server. In my case, I want to select Streamtape as soon as I enter the website, but that requires sending a POST request, which I am not familiar with.
I've spent a while messing around with it all using HTMLSession from requests_html, but haven't gotten very far.
I have gotten to the point where I found the key for the POST request, which I found by going into the Network tab in Firefox and looking at the POST request.
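As a side note on what such a POST body can look like on the wire, here is a minimal sketch, assuming the payload is a single key/value pair. The key and value below are placeholders, not real values from the site. The same dict can be form-encoded or JSON-encoded, and the Network tab shows which of the two the browser actually sends.

```python
import json
from urllib.parse import urlencode

# Hypothetical payload; the real key and value come from the page
payload = {"xojv": "example-key"}

# Form-encoded body (Content-Type: application/x-www-form-urlencoded)
form_body = urlencode(payload)

# JSON body (Content-Type: application/json)
json_body = json.dumps(payload)

print(form_body)
print(json_body)
```

As far as I understand, requests form-encodes a dict passed via data= and only serializes it as JSON when it is passed via json=, so the declared content-type and the actual body need to agree.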
This is what I have so far:
from requests_html import HTMLSession
from bs4 import BeautifulSoup
from requests import get
def grabpost(ld):  # Build a dict from [id, key] pairs, then grab the key for server "40"
    return {x[0]: x[1] for x in ld}.get("40")
def grabhtml(session, url):  # Fetch the page and render its JavaScript with requests_html
    res = session.get(url)
    res.html.render()
    session.close()
    return res
AJAX = "https://www12.9anime.to/ajax/user/watching"
URL = "https://www12.9anime.to/watch/descending-stories-showa-genroku-rakugo-shinju.xojv/ep-2"
headers = {"content-type": "application/json"}
TYPE = URL.rsplit(".")[-1].split("/")[0]
### Grab key for 'Streamtape' server [server40]
s1 = HTMLSession()
soup = BeautifulSoup(grabhtml(s1, URL).html.html, "lxml")
servers = soup.find("a", class_="active")["data-sources"].replace('"', "")[1:-1].split(",")
KEY = grabpost([x.split(":") for x in servers])
data = {TYPE:KEY}
### Post onto AJAX then grab URL during s2
s2 = HTMLSession()
post = s2.post(url=AJAX, data=data, headers=headers)
res = grabhtml(s2, URL)
soup = BeautifulSoup(res.html.html, "lxml")
GOAL = soup.find("iframe")["src"].split("?")[0] # Should be "https://streamtape.com/e/QJ07RkJ0JDU0zZ4/"
# if GOAL is "https://vidstream.pro/e/ZLPG4V426Y7X", then the post didn't work and is still on "VidStream"
print(GOAL)
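The manual replace/split parsing of data-sources above could probably be replaced with json.loads. This is only a sketch, assuming the attribute holds a JSON object that maps server ids to keys. The attribute value and the server_key helper below are made up for illustration.

```python
import json

# Hypothetical attribute value; the real one comes from the active <a> tag
data_sources = '{"28":"key-for-28","40":"key-for-40"}'


def server_key(raw, server_id):
    """Return the key stored for server_id, or None if it is missing."""
    return json.loads(raw).get(server_id)


print(server_key(data_sources, "40"))
```

Unlike splitting on "," and ":", this keeps working if a key itself happens to contain one of those characters.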