Audio bytes chunks getting corrupted during streaming using Django and Websockets

Asked by Mikehade on 29 March 2024 at 11:02

I am trying to implement a streaming audio transcription service using Django and WebSockets. The implementation works, but the chunks become corrupted after some time, usually around the tenth or eleventh chunk of transcription. While debugging, I discovered that the webm extension could no longer be detected in the audio bytes of those chunks (which means some other metadata might also be missing), and that leaves me unable to transcribe them.

from channels.generic.websocket import AsyncWebsocketConsumer


class TranscriptConsumer(AsyncWebsocketConsumer):
    """
    Server side implementation of the audio streaming service
    """
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.audio_buffer = b''  # Variable to store accumulated audio data

I use the audio_buffer variable to concatenate all incoming audio byte chunks in an effort to prevent the corruption: before I concatenated the chunks this way, the audio bytes were already corrupted after the first chunk. I use the code below to receive and concatenate the chunks:

    async def receive(self, text_data=None, bytes_data=None):
        if bytes_data:
            # Combine incoming bytes_data with the accumulated audio_buffer
            self.audio_buffer += bytes_data

            # Check if the combined data is sufficient for processing
            if len(self.audio_buffer) >= 10000:
                await self.process_audio()
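One possible explanation, offered as an assumption because the client code and the rest of the consumer are not shown: if the client records with the browser's MediaRecorder, the WebM header is normally written only at the start of the recording, so later messages cannot be decoded by themselves. A minimal sketch of a buffer that remembers the first message and prepends it to every later batch is below. HeaderKeepingBuffer, min_bytes, and take_batch are hypothetical names, and the approach only helps if the first message holds the complete header and later batches happen to start on cluster boundaries, which is not guaranteed.

```python
from typing import Optional


class HeaderKeepingBuffer:
    """Hypothetical helper: accumulates bytes and remembers the first message."""

    def __init__(self, min_bytes: int = 10000) -> None:
        self.min_bytes = min_bytes
        self.header: Optional[bytes] = None  # first message, assumed to hold the header
        self.pending = b""                   # bytes received since the last batch
        self._header_in_pending = False      # True while pending already starts with the header

    def add(self, data: bytes) -> None:
        # The first message is kept separately so it can be reused later.
        if self.header is None:
            self.header = data
            self._header_in_pending = True
        self.pending += data

    def take_batch(self) -> Optional[bytes]:
        # Return None until enough bytes have accumulated.
        if len(self.pending) < self.min_bytes:
            return None
        batch = self.pending
        if not self._header_in_pending:
            # Later batches get the saved first message prepended.
            batch = self.header + batch
        self.pending = b""
        self._header_in_pending = False
        return batch
```

In the consumer, receive could call add(bytes_data) and pass the result of take_batch() to the transcription step whenever it is not None. Re-decoding the whole recording from the start on each pass is a simpler alternative, at the cost of repeated work.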

And I use the code below to transcribe the chunks:

# Split the combined data into chunks
max_chunk_size = 1024 * 1024  # 1MB chunk size
chunks = [self.audio_buffer[i:i + max_chunk_size] for i in range(0, len(self.audio_buffer), max_chunk_size)]

# Process chunks in parallel (using asyncio.gather for concurrency)
tasks = [self.transcribe_chunk(chunk, model) for chunk in chunks]
transcriptions = await asyncio.gather(*tasks)

When the chunk is not corrupted, I get "Audio format Server Side: ['webm']", but when it is corrupted, I get "Audio format Server Side: None".
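A None result is consistent with a chunk that does not begin with the container header. WebM is a Matroska/EBML format, and such a stream starts with the four bytes 1A 45 DF A3, which is the signature format sniffers typically look for. The sketch below uses made-up bytes rather than real audio to show that slicing a buffer at fixed offsets leaves the signature only in the first slice. starts_with_ebml_header, split_fixed, and the sample data are illustrative and not part of the original code.

```python
from typing import List

# EBML header ID that opens every Matroska/WebM stream.
EBML_MAGIC = b"\x1a\x45\xdf\xa3"


def starts_with_ebml_header(data: bytes) -> bool:
    """Return True if the data begins with the EBML signature."""
    return data.startswith(EBML_MAGIC)


def split_fixed(data: bytes, size: int) -> List[bytes]:
    """Split data into consecutive slices of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


# Fake stream: the signature followed by 20 filler bytes (not real audio).
fake_stream = EBML_MAGIC + b"\x00" * 20
slices = split_fixed(fake_stream, 8)
flags = [starts_with_ebml_header(s) for s in slices]
print(flags)  # [True, False, False]
```

Logging a check like this for each incoming message and each slice might show whether the header is lost when the buffer is split or was never present in the later messages.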
