I have created a Streamlit app that generates automatic reports for my team and me, based on data stored in more than 200 files. When running on localhost, I could enter a few parameters (year, deployment number, etc.) and the app would automatically pick the correct files (3 out of 200 for each set of parameters) and generate the desired report.
However, now that I have deployed my app, I want it to select the desired files from a shared OneDrive to which my whole team has access. This means the data would all be stored online in one location, and the app would automatically load only the files it needs, depending on the parameters entered by the user.
I have two problems:
1. I would like to open a CSV file from a OneDrive URL. The method below fails with the error "urllib.error.HTTPError: HTTP Error 400: Bad Request":
'''
import base64
import csv

import pandas as pd
import requests


def create_onedrive_directdownload(onedrive_link):
    # Encode the sharing link as unpadded URL-safe base64 and build the download URL
    data_bytes64 = base64.b64encode(bytes(onedrive_link, 'utf-8'))
    data_bytes64_String = data_bytes64.decode('utf-8').replace('/', '_').replace('+', '-').rstrip("=")
    resultUrl = f"https://api.onedrive.com/v1.0/shares/u!{data_bytes64_String}/root/content"
    return resultUrl


onedrive_link = "https://my.sharepoint.com/:x:/s/myteam/..."
onedrive_direct_link = create_onedrive_directdownload(onedrive_link)

# Attempt 1: let pandas read the direct-download URL
df = pd.read_csv(onedrive_direct_link)

# Attempt 2: fetch the sharing link with requests and parse the lines as CSV
r = requests.get(onedrive_link)
text = r.iter_lines(decode_unicode=True)  # csv.reader needs str lines, not bytes
reader = csv.reader(text, delimiter=',')
'''
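To rule the encoding step in or out, it can be checked on its own without any network access. The following is only a minimal sketch: `encode_sharing_url` and the example link are names I made up for illustration, and it uses `base64.urlsafe_b64encode`, which should produce the same result as the manual `replace` calls above.

```python
import base64


def encode_sharing_url(sharing_url: str) -> str:
    """Return the 'u!...' share token built from a sharing URL."""
    # URL-safe base64 swaps '+' for '-' and '/' for '_',
    # which matches the manual replace calls in my function.
    token = base64.urlsafe_b64encode(sharing_url.encode("utf-8")).decode("ascii")
    # Drop the '=' padding, as in the original code.
    return "u!" + token.rstrip("=")


# Hypothetical link, used only to exercise the function offline.
example_link = "https://example.com/share/abc?x=1"
print(encode_sharing_url(example_link))
```

If this token matches the one embedded in `onedrive_direct_link`, the encoding itself is behaving as written and the problem lies somewhere else in the request.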
2. I would like the app to select the right files based on the first part of the URL only. The end of each URL is a random string of letters and digits, but the beginning is predictable: all the files have a formatted name that includes the input parameters (i.e. year, deployment number, instrument). So what I am trying to do is something like this:
'''
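import pandas as pd

def load_adp_if_match(url, folder_path, file_prefix_number, year, deployment):
    # e.g. file_prefix_number = "062" (a string, to keep the leading zero), year = 2013
    expected = f"{folder_path}/{file_prefix_number}_ADP_{year}-{deployment}.csv"
    if expected in url:
        return pd.read_csv(url)
    return None  # otherwise ignore this URL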
'''
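To make the intent concrete, here is a minimal sketch of the matching I have in mind, applied to a list of candidate URLs. Everything in it is made up for illustration: the `select_urls` helper, the `example.com` folder, the `CTD` instrument code, the deployment number `1`, and the random-looking `?d=...` endings. It assumes I can somehow obtain the list of file URLs in the folder, which is part of what I do not know how to do.

```python
def select_urls(urls, folder_path, file_prefix_number, instrument, year, deployment):
    """Keep only the URLs whose predictable first part matches the parameters."""
    # Predictable beginning of the URL; whatever follows it is ignored.
    prefix = f"{folder_path}/{file_prefix_number}_{instrument}_{year}-{deployment}.csv"
    return [u for u in urls if u.startswith(prefix)]


# Hypothetical URLs standing in for the real OneDrive links.
folder = "https://example.com/data"
candidate_urls = [
    f"{folder}/062_ADP_2013-1.csv?d=a1b2c3",
    f"{folder}/062_CTD_2013-1.csv?d=d4e5f6",
    f"{folder}/063_ADP_2014-2.csv?d=g7h8i9",
]

selected = select_urls(candidate_urls, folder, "062", "ADP", 2013, 1)
print(selected)
```

Each selected URL would then be passed to `pd.read_csv`, and all the others ignored.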
Any advice would be very welcome. I have tried many methods without success, and I am afraid my Python knowledge is limited. Thank you in advance!