I am using Twilio Programmable Video and trying to pipe a remote participant's audio in real time to the Google Cloud Media Translation client.
There is sample code showing how to use the Google Cloud Media Translation client with a microphone here.
What I am trying to accomplish is that, instead of using a microphone and node-record-lpcm16, I want to pipe what I am getting from Twilio's AudioTrack to the Google Cloud Media Translation client. According to this doc,
Tracks represent the individual audio, data, and video media streams that are shared within a Room.
Also, according to this doc, an AudioTrack contains an audio MediaStreamTrack. I am guessing this can be used to extract the audio and pipe it somewhere else.
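For reference, this is roughly how I get hold of the track today (a sketch using twilio-video.js on an already-connected Room; `handleAudioTrack` is just a placeholder callback I made up for whatever consumes the extracted track):

```javascript
// Listen for remote tracks on a connected twilio-video.js Room and hand the
// underlying MediaStreamTrack of any audio track to a callback.
// handleAudioTrack is a hypothetical callback, not a Twilio API.
function listenForRemoteAudio(room, handleAudioTrack) {
  room.on('trackSubscribed', (track) => {
    if (track.kind === 'audio') {
      // RemoteAudioTrack exposes the raw MediaStreamTrack here
      handleAudioTrack(track.mediaStreamTrack);
    }
  });
}
```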
What's the best way of tackling this problem?
Twilio developer evangelist here.
With the MediaStreamTrack you can compose it back into a MediaStream object and then pass it to a MediaRecorder. When you start the MediaRecorder it will receive dataavailable events, each of which contains a chunk of audio in the webm format. You can then pipe those chunks elsewhere to do the translation. I wrote a blog post on recording using the MediaRecorder, which should give you a better idea how the MediaRecorder works, but you will have to complete the work to stream the audio chunks to the server to be translated.
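Putting that together, here's a rough sketch of the browser side. Assumptions: you have a WebSocket connection to your own server endpoint that relays the chunks on to the Media Translation API, and the 500ms timeslice and `audio/webm` mime type are just example values you can tune:

```javascript
// Wrap a MediaStreamTrack in a MediaStream, record it with MediaRecorder,
// and forward each webm chunk over a WebSocket as it becomes available.
// mediaStreamTrack: the audio MediaStreamTrack from the Twilio AudioTrack.
// socket: an open WebSocket to your relay server (hypothetical endpoint).
function streamTrackToServer(mediaStreamTrack, socket, timesliceMs = 500) {
  // Compose the track back into a MediaStream so MediaRecorder can consume it
  const stream = new MediaStream([mediaStreamTrack]);
  const recorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });

  // Each dataavailable event carries roughly timesliceMs worth of webm audio
  recorder.addEventListener('dataavailable', (event) => {
    if (event.data.size > 0 && socket.readyState === WebSocket.OPEN) {
      socket.send(event.data);
    }
  });

  // Ask the recorder to emit a chunk every timesliceMs milliseconds
  recorder.start(timesliceMs);
  return recorder; // call recorder.stop() when the participant disconnects
}
```

Note that the chunks arrive as webm/opus, so on the server you will still need to decode or transcode them into whatever encoding the Media Translation API expects before forwarding.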