Is it possible to web scrape a blob URL from a website in Python?

2k Views Asked by Riku At 14 August 2022 at 21:29

I am trying to extract a CSV file that is exposed through a blob URL on this page, using Beautiful Soup: https://worldpopulationreview.com/country-rankings/exports-by-country

Here's my code:

exports = pd.read_csv(io.StringIO(requests.get(
    BeautifulSoup(
        requests.get('https://worldpopulationreview.com/country-rankings/exports-by-country').text,
        'html.parser',
    ).find_all(download="csvData.csv")
)))

What I got was an exception, and there was no blob link in the href. The blob URL does exist when I inspect the HTML in my browser.

Since the href does not show the blob URL, I decided to send a GET request to the blob URL itself instead of scraping it, but this exception appears:

requests.exceptions.InvalidSchema: No connection adapters were found for 'blob:https://worldpopulationreview.com/850ac28e-9cd9-46b6-9423-e96a0bd7e938'

Is there a way to web scrape blob URLs?

There is 1 best solution below

cuzi On 14 August 2022 at 22:28 BEST ANSWER

These blob URLs are created only in the browser, usually with JavaScript; they don't exist on the server at all. So you cannot download them with requests.
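
As a side note on the exception itself: by default, requests only mounts connection adapters for the http:// and https:// prefixes, so a URL whose scheme is blob: is rejected before any network traffic happens. A minimal sketch of a pre-check you could run before calling requests.get (the helper name is hypothetical, not part of requests):

```python
from urllib.parse import urlsplit


def is_fetchable_with_requests(url):
    """Return True if the URL uses a scheme that requests handles by default."""
    # requests ships adapters for "http://" and "https://" only;
    # "blob:" URLs are browser-local handles, not network addresses.
    return urlsplit(url).scheme in {"http", "https"}


page_url = "https://worldpopulationreview.com/country-rankings/exports-by-country"
blob_url = "blob:https://worldpopulationreview.com/850ac28e-9cd9-46b6-9423-e96a0bd7e938"

print(is_fetchable_with_requests(page_url))
print(is_fetchable_with_requests(blob_url))
```

Note that the blob URL still contains https:// after the blob: prefix, but the scheme that counts is the part before the first colon.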

You could run a JavaScript snippet in the browser console to get the content. Here is an example of how to fetch a blob URL in JavaScript: https://stackoverflow.com/a/52410044/

If you need to do this automatically, you could create a userscript to do it, or use an automation tool like AutoHotkey to click the download link automatically.
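
If you would rather stay in Python, one possible alternative (not one of the tools named above) is to drive a real browser with Selenium and run the fetch inside the page, since the blob only exists in the browser that created it. The following is a rough sketch under several assumptions: Selenium and a Chrome driver are installed, the link can be found with the selector a[download="csvData.csv"] taken from the question, and the page has already put the blob URL into the href once it has loaded. The function names and the JavaScript snippet are made up for illustration, and the page may need a longer wait or a click before the blob exists.

```python
import csv
import io

# JavaScript executed inside the page: fetch the blob URL and return its text.
# Selenium passes the completion callback as the last element of `arguments`.
FETCH_BLOB_JS = """
const blobUrl = arguments[0];
const done = arguments[arguments.length - 1];
fetch(blobUrl)
  .then(response => response.text())
  .then(text => done({ok: true, text: text}))
  .catch(err => done({ok: false, error: String(err)}));
"""


def fetch_blob_text(driver, blob_url):
    """Read a blob URL's content from inside the browser that created it."""
    result = driver.execute_async_script(FETCH_BLOB_JS, blob_url)
    if not result or not result.get("ok"):
        raise RuntimeError(f"could not read {blob_url}: {result}")
    return result["text"]


def scrape_csv_rows(page_url, link_selector='a[download="csvData.csv"]'):
    """Open the page in Chrome, locate the download link, return the CSV rows."""
    # Imported here so fetch_blob_text stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.implicitly_wait(10)  # give the page's scripts time to add the link
        driver.get(page_url)
        link = driver.find_element(By.CSS_SELECTOR, link_selector)
        blob_url = link.get_attribute("href")
        # Assumption: the href already holds the blob URL at this point.
        if not blob_url or not blob_url.startswith("blob:"):
            raise RuntimeError(f"no blob URL in href: {blob_url!r}")
        text = fetch_blob_text(driver, blob_url)
    finally:
        driver.quit()
    return list(csv.reader(io.StringIO(text)))
```

Calling scrape_csv_rows('https://worldpopulationreview.com/country-rankings/exports-by-country') would then return the rows as lists of strings; if you want a DataFrame instead, the text returned by fetch_blob_text can be passed to pd.read_csv(io.StringIO(text)).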
