How to stream audio of <Dial> to websocket?


I am trying to transcribe a Twilio voice call in real time over WebSockets. Twilio has multiple examples for this. I am following this one: https://www.twilio.com/en-us/blog/live-transcribing-phone-calls-using-twilio-media-streams-and-google-speech-text

It works as expected. Basically, you call your Twilio number and whatever you say gets transcribed. Now I want to add a <Dial> flow to it so that when a customer calls, the call is connected to an agent (via <Dial>) and the whole conversation is transcribed.

The problem is that only the caller's stream is being transcribed. The dialed agent's stream is not. I searched and tried quite a few things, but I cannot get access to the audio stream of the dialed call via WebSocket.

Does anyone know how to do this?

Daniel O. On

Twilio Support Engineer here. For your WebSocket server to receive both the inbound audio track and the outbound audio track (the child call), you need to specify the track attribute of the <Stream> noun. For example:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Start>
    <Stream track="both_tracks" url="wss://mystream.ngrok.io/example" />
  </Start>
  <Dial>415-123-4567</Dial>
</Response>
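
With track="both_tracks", the "media" messages arriving on the WebSocket each carry a track field set to "inbound" or "outbound", so the server has to separate them before transcription. The following is a minimal sketch of that routing step in Python, using only the standard library. The function name route_media_message and the per-track buffers are hypothetical, not part of any Twilio SDK, and the sketch assumes you would feed each buffer to its own speech recognizer rather than mixing the two tracks.

```python
import base64
import json
from collections import defaultdict


def route_media_message(raw_message, buffers):
    """Append the audio from one Media Streams message to its track's buffer.

    Returns the track name for "media" events and None for any other event
    (for example "connected", "start", or "stop").
    """
    message = json.loads(raw_message)
    if message.get("event") != "media":
        return None

    media = message["media"]
    track = media["track"]  # "inbound" (caller) or "outbound" (dialed party)
    # The payload is base64-encoded audio; decode it to raw bytes.
    buffers[track].extend(base64.b64decode(media["payload"]))
    return track


# Hypothetical usage: one growing byte buffer per track.
buffers = defaultdict(bytearray)
sample = json.dumps({
    "event": "media",
    "media": {
        "track": "outbound",
        "payload": base64.b64encode(b"\xff\xff").decode("ascii"),
    },
})
route_media_message(sample, buffers)
```

In a real server you would call this from the WebSocket message handler and forward each track's bytes to a separate transcription stream, so the agent's speech and the caller's speech stay labeled.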