What I'm trying to do is get real time transcription for video recorded in the browser with webRTC. Use case is basically subtitles in real time like google hangouts has.
So I have a WebRTC program running in the browser. It sends webm objects back to the server. They are linear32 audio encodings. Google speech to text only accepts linear16 or Flac files.
Is there a way to convert linear32 to linear16 in real time?
Otherwise has anyone been able to hook up webRTC with Google speech to get real time transcriptions working?
Any advice on where to look to solve this problem would be great
I had the same problem and failed with webRTC. I recommend you use the Web Audio Api instead if you are just interested in transcribing the audio from the video.
Here is how I did it with a nodejs sever and react client app. It is uploaded to github here
recorderWorkletProcessor.js (saved in
public/src/worklets/recorderWorkletProcessor.js
)Install Socket.io-client on front end
React component code
server.js