So, I'm trying to set up a thing to stream live audio from a mic collected on the webpage to speakers connected to the machine hosting the webpage. For this, I'm using JavaScript's MediaRecorder like so:
navigator.mediaDevices.getUserMedia({ audio: { sampleRate: 44100 } }).then((stream) => {
    sockets[speaker_name][stream_name] = stream;
    if (turnOn) {
        if (!MediaRecorder.isTypeSupported('audio/webm'))
            return alert('Browser not supported');
        const mediaRecorder = new MediaRecorder(stream, {
            mimeType: 'audio/webm',
        });
        var socket = new WebSocket('ws://127.0.0.1:8000/micstream/');
        sockets[speaker_name][sname] = socket;
        sockets[speaker_name][mic] = mediaRecorder;
        socket.onopen = () => {
            //z = {"sampleRate": 44100}
            //stream.getTracks()[0].applyConstraints(z);
            console.log(stream.getTracks()[0].getConstraints());
            console.log({ event: 'onopen' });
            mediaRecorder.addEventListener('dataavailable', (event) => {
                console.log("sending");
                if (event.data.size > 0 && socket.readyState === WebSocket.OPEN) {
                    socket.send(event.data);
                }
            });
            // emit a chunk of recorded audio every 250 ms
            mediaRecorder.start(250);
        };
    }
});
This collects audio data from the mic and sends it over a websocket to the backend, which is a FastAPI websocket endpoint. So far I've just tried writing the received data to the speaker using PyAudio, but all I seem to get is short little bursts of static coming out of the speaker. I've made the sampleRate the same on both sides, but I imagine there's some nuance of audio streaming I'm not understanding that's causing it.
Other than the static, I'm also wondering about the short bursts. I was hoping that sending a chunk every 250ms would sound more "continuous" and not broken up into bursts, though I figure that might be an easier problem to solve once the audio actually sounds correct. Maybe PyAudio isn't set up to handle audio/webm or something?
Here is the PyAudio code I've been using, though I'm open to switching to some other library as long as I can choose what speaker to use somehow:
import pyaudio

RATE = 44100
# width 1 => 8-bit samples; for 16-bit PCM this should be pyaudio.paInt16
FORMAT = pyaudio.get_format_from_width(1)
CHANNELS = 1
FRAMES_PER_BUFFER = 250

p = pyaudio.PyAudio()
# output_device_index (not input_device_index) selects the playback device
stream = p.open(rate=RATE, format=FORMAT, channels=CHANNELS,
                output_device_index=speaker, output=True,
                frames_per_buffer=FRAMES_PER_BUFFER)
The audio payload datastream you push to your websocket is formatted as webm / Matroska and is most likely compressed with the Opus codec. But PyAudio consumes raw, uncompressed pulse-code-modulated (PCM) samples. Hence the bursts of static from your speaker: the data doesn't look like audio.
And compressed audio produces far fewer bytes per second than PCM audio would, so your player runs dry between chunks. Hence the short bursts.
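The mismatch is easy to quantify with a back-of-the-envelope check (the 32 kbps Opus bitrate below is an assumed typical value, not measured from your stream):

```python
# Rough byte-rate comparison between raw PCM and Opus-compressed audio.
# The 32 kbps Opus bitrate is an assumed typical value, not measured.
RATE = 44100        # samples per second
SAMPLE_WIDTH = 2    # bytes per sample for 16-bit PCM
CHANNELS = 1

pcm_bytes_per_sec = RATE * SAMPLE_WIDTH * CHANNELS
opus_bytes_per_sec = 32_000 // 8   # 32 kbps expressed in bytes per second

print(pcm_bytes_per_sec)   # 88200
print(opus_bytes_per_sec)  # 4000
```

A player expecting ~88 KB/s of PCM but fed ~4 KB/s of compressed data exhausts its buffer almost immediately, which is consistent with the choppy playback you hear.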
To play that data stream from a Python program you'll need to unbox the audio track from the webm container and decode it from Opus. Or you can use the Web Audio API in place of MediaRecorder and push raw PCM samples through your websocket.
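One way to do the unboxing and decoding on the Python side is to pipe the incoming chunks through an ffmpeg subprocess. This is a minimal sketch, assuming ffmpeg is installed on the host; the function names are illustrative, not from any library. Note that only the first MediaRecorder chunk carries the webm header, so every chunk must be fed to one long-lived decoder process rather than decoded individually:

```python
import subprocess

def decode_cmd(rate: int = 44100) -> list:
    # ffmpeg invocation: webm/opus stream on stdin -> raw 16-bit mono PCM on stdout
    return ["ffmpeg",
            "-i", "pipe:0",        # read the webm stream from stdin
            "-f", "s16le",         # output raw signed 16-bit little-endian PCM
            "-acodec", "pcm_s16le",
            "-ac", "1",            # downmix to mono
            "-ar", str(rate),      # resample to the target rate
            "pipe:1"]              # write decoded PCM to stdout

def start_decoder(rate: int = 44100) -> subprocess.Popen:
    # One long-lived ffmpeg process: write each websocket chunk to proc.stdin,
    # read decoded PCM from proc.stdout and pass it to PyAudio's stream.write().
    return subprocess.Popen(decode_cmd(rate),
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL)
```

In your FastAPI websocket handler you would then forward each received message to the decoder's stdin and, from a separate thread or task, read fixed-size PCM blocks from its stdout into the PyAudio output stream.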