Unable to download a CSV file in Python from a URL whose parts change at runtime

I need to download a CSV file programmatically with Python from a public healthcare website (link) and write it to a folder X.

Here is my sample code:

import requests

# Hardcoded direct link to the current CSV release
url = "https://data.chhs.ca.gov/dataset/514c5381-f3dc-4c4a-a9de-a4df9405f046/resource/d652b210-ec3d-4a92-b7e0-e55c3dcbc7dc/download/medi_cal_ffs_provider_list_3_19_2024.csv"
file_path = "C:/Users/xxx/Downloads/medi_cal_ffs_provider_list.csv"

r = requests.get(url)
r.raise_for_status()  # fail loudly instead of saving an error page as the CSV
with open(file_path, "wb") as f:
    f.write(r.content)

This is a recurring task: the file has to be downloaded every week. However, the highlighted strings in the URL below are only known at runtime. Any thoughts on how to resolve this?

https://data.chhs.ca.gov/dataset/**514c5381-f3dc-4c4a-a9de-a4df9405f046**/resource/**d652b210-ec3d-4a92-b7e0-e55c3dcbc7dc**/download/**medi_cal_ffs_provider_list_3_19_2024.csv**

It works with a hardcoded URL, but I don't know how to build the URL at runtime without knowing these substrings in advance.

There is 1 best solution below

CyberTruck

This code downloads the file without hardcoding the file name. It loads the resource page, collects the links on it, and downloads the first one that ends in .csv. Whenever a new file is published, you can rerun the code and it will pick up the latest file. Note that the resource page URL itself is still fixed in the code. The only extra dependency is the lxml package. I hope this solves your problem.

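from lxml import html
import requests

# Resource page that links to the current CSV release
downloadURL = "https://data.chhs.ca.gov/dataset/profile-of-enrolled-medi-cal-fee-for-service-ffs-providers/resource/d652b210-ec3d-4a92-b7e0-e55c3dcbc7dc"
fileLocalPath = "C:/temp/tmp/medi_cal_ffs_provider_list.csv"
fileURL = None

# Fetch the resource page and collect every link target on it
page = requests.get(downloadURL)
page.raise_for_status()
webpage = html.fromstring(page.content)
urlList = webpage.xpath('//a/@href')

# Take the first link that points to a CSV file
for url in urlList:
    if url.endswith('.csv'):
        fileURL = url
        break

# Stop here rather than requesting an empty URL
if fileURL is None:
    raise RuntimeError("No .csv link found on " + downloadURL)

# Stream the file to disk in chunks (certificate verification stays on)
with requests.get(fileURL, stream=True) as r:
    r.raise_for_status()
    with open(fileLocalPath, "wb") as f:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)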
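
One thing to watch with the link search above: it assumes the page uses absolute links and a lowercase .csv extension. If that ever stops being true, the selection step could be made a bit more tolerant. Here is a minimal sketch using only the standard library. The helper name pick_csv_url is hypothetical (it is not part of requests or lxml), and the example.org URLs are placeholders, not the real portal.

```python
from urllib.parse import urljoin, urlparse


def pick_csv_url(hrefs, base_url):
    """Return the absolute URL of the first link whose path ends in .csv.

    hrefs    -- link targets scraped from the page (e.g. from xpath('//a/@href'))
    base_url -- URL of the page the links came from, used to resolve relative links
    """
    for href in hrefs:
        # Resolve relative links such as "/dataset/.../file.csv" against the page URL
        absolute = urljoin(base_url, href)
        # Compare only the path, so a query string or an upper-case extension still matches
        if urlparse(absolute).path.lower().endswith(".csv"):
            return absolute
    # Fail with a clear message rather than returning an empty string
    raise LookupError(f"no .csv link found on {base_url}")


# Hypothetical usage with placeholder links:
# pick_csv_url(["/about", "/files/list_3_19_2024.csv"], "https://example.org/page")
```

In the answer's script this would replace the for loop, e.g. fileURL = pick_csv_url(urlList, downloadURL), and a missing link would then raise an error before any download is attempted.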