Extracting URLs from Websites using PyQuery


I have this code:

from pyquery import PyQuery as pq
import requests

url1 = "https://economy-finance.ec.europa.eu/economic-forecast-and-surveys/business-and-consumer-surveys/download-business-and-consumer-survey-data/subsector-data_en"
content = requests.get(url1).content
doc = pq(content)

items = doc(".ecl-table__cell+ .ecl-table__cell a")

This selector matches many links:

print(items)

However, I can only get the first URL, with this:

industry_subsectors_sa = items.attr('href')
industry_subsectors_sa 

How can I extract the other URLs and store them in separate variables?

I also have another page, but the same technique does not work there.

url1 = "https://economy-finance.ec.europa.eu/economic-forecast-and-surveys/business-and-consumer-surveys/download-business-and-consumer-survey-data/time-series_en"
content1 = requests.get(url1).content
doc1 = pq(content1)

I am trying to get the URL of the "All surveys - Seasonally Adjusted Data - zip" link from this page.

items1 = doc1("#all-surveys+ .ecl-table-responsive .ecl-table__row:nth-child(1) a")

However, it does not return anything:

print(items1)
