Extracting Text from Span Tag using BeautifulSoup

738 Views Asked by Batool At 05 September 2021 at 23:51

I am trying to extract the estimated monthly cost of "$1,773" from this url:

https://www.zillow.com/homedetails/4651-Genoa-St-Denver-CO-80249/13274183_zpid/

Upon inspecting that part of the page, I see this data:

<div class="sc-qWfCM cdZDcW">
   <span class="Text-c11n-8-48-0__sc-aiai24-0 dQezUG">Estimated monthly cost</span>
   <span class="Text-c11n-8-48-0__sc-aiai24-0 jLucLe">$1,773</span></div>

To extract $1,773, I have tried this:

from bs4 import BeautifulSoup
import requests

url = 'https://www.zillow.com/homedetails/4651-Genoa-St-Denver-CO-80249/13274183_zpid/'
headers = {"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"}

soup = BeautifulSoup(requests.get(url, headers=headers).content, "html")

print(soup.findAll('span', {'class': 'Text-c11n-8-48-0__sc-aiai24-0 jLucLe'}))

This returns a list of three elements, with no mention of $1,773.

[<span class="Text-c11n-8-48-0__sc-aiai24-0 jLucLe">$463,300</span>, 
<span class="Text-c11n-8-48-0__sc-aiai24-0 jLucLe">$1,438</span>, 
<span class="Text-c11n-8-48-0__sc-aiai24-0 jLucLe">$2,300<!-- -->/mo</span>]

Can someone please explain how to return $1,773?

There are 2 best solutions below

Artsiom Liaver On 06 September 2021 at 00:03

I think you have to find the parent element first. For example:

parent_div = soup.find('div', {'class': 'sc-fzqBZW bzsmsC'})
result = parent_div.findAll('span', {'class': 'Text-c11n-8-48-0__sc-aiai24-0 jLucLe'})
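
Another way to narrow the search, which depends less on the generated class names, is to anchor on the visible label and take the span that follows it. This is only a minimal sketch, assuming the markup matches the snippet in the question; the helper name value_after_label is hypothetical, and it can only find values that are actually present in the HTML being parsed.

```python
from bs4 import BeautifulSoup

# Markup copied from the question; the live page may differ.
SNIPPET = """
<div class="sc-qWfCM cdZDcW">
   <span class="Text-c11n-8-48-0__sc-aiai24-0 dQezUG">Estimated monthly cost</span>
   <span class="Text-c11n-8-48-0__sc-aiai24-0 jLucLe">$1,773</span></div>
"""


def value_after_label(html, label):
    """Return the text of the <span> that follows the <span> whose text is `label`.

    Hypothetical helper for illustration; returns None if nothing matches.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Find the label span by its exact text rather than by class name.
    label_span = soup.find("span", string=label)
    if label_span is None:
        return None
    # The value sits in the next <span> sibling inside the same parent <div>.
    value_span = label_span.find_next_sibling("span")
    return value_span.get_text(strip=True) if value_span else None


print(value_after_label(SNIPPET, "Estimated monthly cost"))  # $1,773
```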

NoorJafri On 06 September 2021 at 19:48

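When scraping a web page, it helps to separate its components by how they are rendered. Some content is static, meaning it is already present in the HTML the server returns. Other content is rendered dynamically by JavaScript after the page loads, usually by calling a backend API of some sort, so it takes some time to appear. requests only downloads the initial HTML and does not run JavaScript, so values that are filled in or updated dynamically may be missing or different in what BeautifulSoup sees.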
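
A quick way to tell which case applies is to check whether the value appears in the HTML the server returns, before any JavaScript runs. The following is a minimal sketch: SERVED is a made-up response body (not Zillow's real markup) and missing_from_html is a hypothetical helper.

```python
def missing_from_html(html, expected_values):
    """Return the values from `expected_values` that do not occur in `html`."""
    return [value for value in expected_values if value not in html]


# Made-up stand-in for a server response: the price is in the markup,
# while the monthly cost would only be filled in later by a script.
SERVED = """
<div><span>Price</span><span>$463,300</span></div>
<div><span>Estimated monthly cost</span><span id="cost"></span></div>
"""

print(missing_from_html(SERVED, ["$463,300", "$1,773"]))  # ['$1,773']
```

With the code from the question, html would be requests.get(url, headers=headers).text. Anything reported as missing is probably rendered dynamically and needs a browser-driven tool such as Selenium.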

I tried parsing your page using Selenium with ChromeDriver:

import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.zillow.com/homedetails/4651-Genoa-St-Denver-CO-80249/13274183_zpid/")
# Give the dynamically rendered content time to load
time.sleep(6)
# find_elements_by_xpath was removed in newer Selenium 4 releases;
# find_elements(By.XPATH, ...) is the supported form
el = driver.find_elements(By.XPATH, "//span[@class='Text-c11n-8-48-0__sc-aiai24-0 jLucLe']")

for e in el:
    print(e.text)

time.sleep(3)
driver.quit()

#OUTPUT
$463,300
$1,773
$2,300/mo
