I am trying to implement a streaming audio transcription service using Django and WebSockets. The implementation works, but the chunks become corrupted after a while, usually around the tenth or eleventh chunk of transcription. While debugging, I discovered that the webm extension could no longer be detected from the audio bytes of those chunks, which means other metadata might also be missing from them, and as a result I am unable to transcribe them.
import asyncio

from channels.generic.websocket import AsyncWebsocketConsumer


class TranscriptConsumer(AsyncWebsocketConsumer):
    """
    Server side implementation of the audio streaming service
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.audio_buffer = b''  # Variable to store accumulated audio data
I use the audio_buffer variable to concatenate all incoming audio byte chunks in an effort to prevent the corruption. Before I concatenated the chunks this way, the audio bytes were already corrupted after the first chunk. I use the code below to receive and concatenate the chunks:
    async def receive(self, text_data=None, bytes_data=None):
        if bytes_data:
            # Combine incoming bytes_data with the accumulated audio_buffer
            self.audio_buffer += bytes_data
            # Check if the combined data is sufficient for processing
            if len(self.audio_buffer) >= 10000:
                await self.process_audio()
And the code below to transcribe the chunks:
# Split the combined data into chunks
max_chunk_size = 1024 * 1024  # 1MB chunk size
chunks = [
    self.audio_buffer[i:i + max_chunk_size]
    for i in range(0, len(self.audio_buffer), max_chunk_size)
]
# Process chunks concurrently (using asyncio.gather)
tasks = [self.transcribe_chunk(chunk, model) for chunk in chunks]
transcriptions = await asyncio.gather(*tasks)
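For reference, the slicing and gathering step can be reproduced outside of Django with a stub in place of the real transcription call. This is only a sketch: `split_buffer`, `fake_transcribe`, and `transcribe_all` are names I made up for illustration and are not my actual `transcribe_chunk` or model. It shows that `asyncio.gather` returns results in the same order as the chunks, even when earlier chunks finish later, so I don't think ordering is the problem.

```python
import asyncio


def split_buffer(buffer: bytes, max_chunk_size: int) -> list[bytes]:
    """Slice the buffer into consecutive pieces of at most max_chunk_size bytes."""
    return [
        buffer[i:i + max_chunk_size]
        for i in range(0, len(buffer), max_chunk_size)
    ]


async def fake_transcribe(chunk: bytes) -> str:
    """Hypothetical stand-in for transcribe_chunk: longer chunks take longer."""
    await asyncio.sleep(0.001 * len(chunk))
    return f"{len(chunk)} bytes"


async def transcribe_all(buffer: bytes, max_chunk_size: int) -> list[str]:
    """Run the stub over every slice concurrently and collect results in order."""
    chunks = split_buffer(buffer, max_chunk_size)
    return await asyncio.gather(*(fake_transcribe(c) for c in chunks))


if __name__ == "__main__":
    print(asyncio.run(transcribe_all(b"x" * 10, 4)))
```

Note that the slices are cut at fixed byte offsets, without any regard to where the audio container's own structures begin or end.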
When a chunk is not corrupted, I get "Audio format Server Side: ['webm']", but when it is corrupted, I get "Audio format Server Side: None".
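To check whether the header really is missing, I could inspect the first bytes of each chunk. A WebM (Matroska) stream begins with the EBML magic bytes `1A 45 DF A3`, so a chunk that does not start with them carries no container header of its own. The helper below is a minimal sketch under that assumption; `sniff_format` is a hypothetical name, not the detection call my code actually uses, and I only made it mimic the two outputs above.

```python
# EBML header ID that opens every Matroska/WebM stream.
EBML_MAGIC = b"\x1a\x45\xdf\xa3"


def sniff_format(data: bytes):
    """Return ['webm'] if data starts with the EBML magic bytes, else None.

    Simplification: the magic bytes identify any Matroska-family stream,
    so this does not distinguish WebM from MKV.
    """
    return ["webm"] if data.startswith(EBML_MAGIC) else None


if __name__ == "__main__":
    # Fake stream: a header marker followed by placeholder payload bytes.
    stream = EBML_MAGIC + b"\x00" * 20
    slices = [stream[i:i + 8] for i in range(0, len(stream), 8)]
    for index, piece in enumerate(slices):
        print(index, "Audio format Server Side:", sniff_format(piece))
```

If only the first slice or the first WebSocket message reports `['webm']`, that would suggest the later chunks are continuation data that cannot be decoded on their own.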