How to get iframe source from page_source


Hello, I am trying to extract a link from page_source. My code is:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time

driver_path = r"C:\Users\666\Desktop\New folder (8)\chromedriver.exe"
driver = webdriver.Chrome(service=Service(driver_path))
driver.implicitly_wait(10)

driver.get("https://www.milversite.club/milver/outsiders-1x01-video_060893d7a.html")
try:
    time.sleep(4)
    iframes = driver.find_elements(By.TAG_NAME, 'iframe')
    for i in range(len(iframes)):
        # Switch into the i-th iframe of the top-level page
        driver.switch_to.frame(i)
        #  your work to extract link
        text = driver.find_element(By.TAG_NAME, 'body').text
        print(text)
        # Return to the top-level page before the next iteration
        driver.switch_to.default_content()

    output = driver.page_source
    print(output)
finally:
    driver.quit()

Now I want to scrape just this link: LINK

Best answer:

Try the script below to get the link you want to parse. You don't need to switch into the iframe to get the link, because the src attribute sits on the iframe element in the parent page. A hardcoded delay is a fragile way to handle dynamic content: if the link appears after 5 seconds, a 4-second sleep misses it. I used an explicit wait in the script below to make it more robust.

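from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)  # poll for up to 10 seconds
try:
    driver.get("https://www.milversite.club/milver/outsiders-1x01-video_060893d7a.html")

    # Wait until the iframe is present in the DOM, then read its src from the parent page
    elem = wait.until(EC.presence_of_element_located((By.ID, "iframevideo")))
    print(elem.get_attribute("src"))
finally:
    driver.quit()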

Output:

https://openload.co/embed/8wVwFQEP1Sw
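
To illustrate why an explicit wait copes better than a fixed sleep, here is a rough, Selenium-free sketch of the polling idea. It is not Selenium's actual implementation: wait_until and fake_find_src are hypothetical names made up for this example, and treating LookupError as "not found yet" is an assumption of the sketch.

```python
import time

def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except LookupError:
            # Assumption of this sketch: LookupError means "not there yet", so keep polling
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)

# Hypothetical stand-in for a page whose iframe only shows up on the third lookup
attempts = {"count": 0}

def fake_find_src():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise LookupError("iframe not in the DOM yet")
    return "https://openload.co/embed/8wVwFQEP1Sw"

print(wait_until(fake_find_src, timeout=2.0, poll=0.1))
```

The call returns as soon as the condition succeeds, so a fast page costs almost nothing and a slow page still gets the full timeout, which a fixed time.sleep(4) cannot offer.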

Another answer:

Try this:

from selenium.webdriver.common.by import By

element = driver.find_element(By.ID, 'iframevideo')
link = element.get_attribute('src')
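
If you would rather pull the link out of the page_source string itself, as the title asks, a minimal sketch using only the standard library could look like this. IframeSrcCollector and iframe_sources are hypothetical helper names, and the sample markup is an assumed stand-in for what driver.page_source might return, not the real page.

```python
from html.parser import HTMLParser

class IframeSrcCollector(HTMLParser):
    """Collect (id, src) for every <iframe> start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            attr_map = dict(attrs)
            # Missing attributes come back as None
            self.sources.append((attr_map.get("id"), attr_map.get("src")))

def iframe_sources(page_source):
    """Return a list of (id, src) tuples for all iframes in the HTML string."""
    parser = IframeSrcCollector()
    parser.feed(page_source)
    return parser.sources

# Hypothetical markup standing in for driver.page_source
sample = (
    '<html><body>'
    '<iframe id="iframevideo" src="https://openload.co/embed/8wVwFQEP1Sw"></iframe>'
    '</body></html>'
)

for frame_id, src in iframe_sources(sample):
    print(frame_id, src)
```

Keep in mind that page_source only reflects the DOM at the moment you read it, so if a script sets the iframe's src later, you still need to wait before grabbing the string.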