In FastAPI, I receive a request from a client, decide which upstream server it should be "forwarded" to, and then make the underlying API request to that server using httpx. Since I perform no processing on the request body, I want to stream it from the client straight through to the upstream server rather than load the entire body into memory. This works relatively well; however, when sending multiple requests in parallel, I get a RuntimeError saying the stream has been consumed.
File \"/usr/local/lib/python3.11/site-packages/starlette/requests.py\", line 218, in stream\n raise RuntimeError(\"Stream consumed\")
Functionally, the implementation is similar to this:
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from httpx import AsyncClient

app = FastAPI()

async def forward_request(request: Request):
    client = AsyncClient()
    response = await client.request(
        'POST', f'http://www.example.com/{request.url.path}',
        content=request.stream(),
    )
    if not response.is_success:
        raise Exception('Error!')
    # Processing on the response performed here.
    return JSONResponse(response.json(), status_code=response.status_code)

app.add_route('/forward/{path:path}', forward_request, methods=['POST'])
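For reference, the error shows up when several requests hit the endpoint concurrently. A rough driver along these lines is enough to trigger it (a sketch assuming the app is served locally on port 8000; the path and payload are placeholders, only the concurrency pattern matters):

import asyncio
from httpx import AsyncClient

async def main():
    # Fire several POSTs at the forwarding endpoint at once.
    async with AsyncClient() as client:
        await asyncio.gather(*(
            client.post('http://localhost:8000/forward/test', content=b'x' * 65536)
            for _ in range(10)
        ))

asyncio.run(main())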
How do I ensure that the request body I receive from the client can be streamed to the underlying service without loading it entirely into memory, and how do I make sure that parallel requests won't cause issues reading the stream?
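For completeness, buffering the body first should sidestep the stream reuse, but it defeats the whole point since it loads the entire request body into memory (a sketch of the inside of forward_request):

# Works, but reads the whole body into memory before forwarding,
# which is exactly what I am trying to avoid for large bodies.
body = await request.body()
response = await client.request(
    'POST', f'http://www.example.com/{request.url.path}', content=body,
)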