I'm encountering a memory-related issue while trying to transcribe audio files using Flask and the whisper_timestamped (https://github.com/linto-ai/whisper-timestamped) library. I've deployed my app on Render and I'm using a /transcribe endpoint to handle audio transcription requests. However, I'm consistently running into the following error:
[ERROR] Worker (pid:67) was sent SIGKILL! Perhaps out of memory?
I've provided the relevant code snippet below:
    # Flask app setup and imports
    import shutil
    import tempfile
    import os

    from flask import Flask, jsonify, request

    app = Flask(__name__)


    @app.route('/transcribe', methods=['POST'])
    def transcribe():
        if 'audio' not in request.files:
            return jsonify({'error': 'No audio file provided.'}), 400
        audio_file = request.files['audio']
        temp_dir = tempfile.mkdtemp()
        temp_audio_path = os.path.join(temp_dir, "temp_audio.wav")
        try:
            audio_file.save(temp_audio_path)
            # transcribe_audio is defined elsewhere (not shown here).
            result = transcribe_audio(temp_audio_path)
        finally:
            # Remove the temporary directory even if transcription fails.
            shutil.rmtree(temp_dir, ignore_errors=True)
        return jsonify(result), 200


    if __name__ == "__main__":
        app.run(debug=True)
I'm using the whisper_timestamped library to transcribe audio and obtain word timestamps. The error occurs while a request is being transcribed: the worker process handling it is killed, apparently for using too much memory. I suspect that the library or the model it loads is memory-intensive enough to push the app past its memory allocation.
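To check that suspicion, peak memory could be logged around the transcription call. The following is only a minimal sketch: it uses the standard-library `resource` module (Unix only), and `peak_rss_mib` and `run_and_report` are hypothetical helper names, not part of Flask or whisper_timestamped.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def run_and_report(func, *args, **kwargs):
    """Call func and return (result, peak RSS before, peak RSS after) in MiB."""
    before = peak_rss_mib()
    result = func(*args, **kwargs)
    after = peak_rss_mib()
    return result, before, after
```

In the endpoint above, this could wrap the existing call, e.g. `result, before, after = run_and_report(transcribe_audio, temp_audio_path)`, with both numbers written to the log. Comparing them against the instance's memory limit would show whether the model and transcription alone account for the SIGKILL. Note that the peak is a high-water mark for the whole process, so it never decreases between requests.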
Can anyone offer suggestions on how to diagnose and resolve this memory-related error? Are there any strategies or best practices I can follow to prevent my app from running out of memory during audio transcription?
Any help or insights would be greatly appreciated.