The goal: upload a file to FastAPI in chunks and process them without saving anything to the hard drive. I have reviewed several similar topics, for example issue73442335 and issue70520522. In issue65342833 it was said that Starlette saves to the hard drive any file larger than 1 MB. As I'm going to forward my chunks to a cloud, I don't want any temp files stored. Is it possible to keep a chunk AND the payload in memory before processing them further?
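For comparison, I know the raw body can be consumed in memory with request.stream(); a minimal sketch, assuming the chunk metadata were moved into headers instead of a multipart field:

    from fastapi import FastAPI, Request

    app = FastAPI()

    @app.post('/upload_raw')
    async def upload_raw(request: Request):
        # request.stream() yields the request body piece by piece as bytes,
        # entirely in memory -- no temp file is involved at this level.
        async for body_chunk in request.stream():
            ...  # e.g. forward body_chunk to the cloud
        return {'received': True}

But I'd prefer to keep the multipart layout, since I need to send the payload alongside each chunk.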
Here is my code.
app.py
from fastapi import FastAPI, Request
from pydantic import parse_obj_as
from pydantic.dataclasses import dataclass
import json

app = FastAPI()


@dataclass
class Payload:
    part: str
    total_size: int
    chunk_size: int

    def __repr__(self):
        return f"<FormData: part={self.part} total_size={self.total_size}>"


@app.post('/upload')
async def upload(request: Request):
    filename = request.headers.get('filename')
    # Parse the multipart body: 'file_chunk' is an UploadFile, 'payload' a JSON string
    transferred_data = await request.form()
    file_chunk = transferred_data['file_chunk'].file.read()
    payload_dict = json.loads(transferred_data['payload'])
    payload = parse_obj_as(Payload, payload_dict)
I'm not sure whether these two lines

    transferred_data = await request.form()
    file_chunk = transferred_data['file_chunk'].file.read()

keep the data in memory, since the form parsing is based on Starlette and produces an UploadFile. BTW, how can I check whether FastAPI touches the hard drive at all?
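One way I found to check it directly (assuming the UploadFile is backed by a tempfile.SpooledTemporaryFile, whose private _rolled flag flips to True once the data has been written to disk):

    # Inside the endpoint, after `await request.form()`:
    spooled = transferred_data['file_chunk'].file
    # _rolled is a private CPython attribute of SpooledTemporaryFile;
    # False means the buffer is still in memory, True means it hit the disk.
    print('rolled to disk:', getattr(spooled, '_rolled', 'unknown'))

Watching the system temp directory during an upload would be a rougher cross-check.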
test.py
import json
import os

import requests
from requests_toolbelt import MultipartEncoder

CHUNK_SIZE = 1024 * 1024 * 5  # 5 MB
URL = 'http://localhost:8000/upload'
FILE_PATH = 'temp.txt'

counter = 1
with open(FILE_PATH, 'rb') as f:
    while True:
        chunk = f.read(CHUNK_SIZE)
        if not chunk:
            break
        payload = {
            'part': str(counter),
            'total_size': str(os.path.getsize(FILE_PATH)),
            'chunk_size': str(CHUNK_SIZE),
        }
        m = MultipartEncoder(
            fields={
                'payload': json.dumps(payload),
                'file_chunk': ('file_chunk', chunk, 'application/octet-stream'),
            }
        )
        response = requests.post(
            URL,
            headers={'Content-Type': m.content_type, 'filename': f.name},
            data=m,
        )
        print(response.text)
        counter += 1
Another possible solution could be adjusting the Starlette MultiPartParser class and its max_file_size attribute. Perhaps a custom middleware would help.
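For example, a minimal sketch of the max_file_size idea, assuming that attribute is the threshold Starlette passes to the underlying SpooledTemporaryFile (so parts below it never leave memory):

    from starlette.formparsers import MultiPartParser

    # Assumption: max_file_size (default 1 MB) is the max_size of the
    # SpooledTemporaryFile behind each UploadFile; raising it above
    # CHUNK_SIZE should keep every 5 MB chunk in memory.
    MultiPartParser.max_file_size = 10 * 1024 * 1024  # 10 MB

This is a process-wide monkeypatch, though, so a custom middleware or a parser subclass might be cleaner if it works.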