Say "hello" when Twilio call connected

I have a WebSocket in a Python Flask app that listens to a Twilio call. When the call starts, I want to say "hello". Here is the code:

    if data['event'] == "start":
        # Using Microsoft Cognitive Services to convert the text to bytes
        speakBytes = speaker.speak("Hello")
        convertedBytes = ap.lin2ulaw(speakBytes.audio_data, 1)
        ws.send(responseString.format(base64.b64encode(convertedBytes), str(data['streamSid'])))

But the above is not working. I checked, and the Microsoft Cognitive Services speech synthesizer returns the bytes in WAV format, so I have used lin2ulaw from the Python audioop module.

Need help. Thanks in advance.

There are 3 best solutions below

2
On

If you're using Twilio to connect the number then you'll need to reply with TwiML to the call:

from twilio.twiml.voice_response import VoiceResponse

# Inside the Flask view that handles the incoming-call webhook:
response = VoiceResponse()
response.say('Hello')
return str(response)

See the docs for <Say>.

If you want to use the .wav you created then you would need to save it somewhere accessible (e.g. an Amazon S3 bucket) and then you can use TwiML <Play></Play>.

0
On

Twilio developer evangelist here.

It looks like you are creating the audio to send to the Twilio Media Stream correctly; however, I don't think you are sending it in the correct format.

Twilio Media Streams expect a media message to be a JSON object with the following properties:

  • event: the value "media"
  • streamSid: the SID of the stream
  • media: an object with a "payload" property that then contains the base64 encoded mulaw/8000 audio

Something like this might work:

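import base64
import json

message = {
  "streamSid": data['streamSid'],
  "event": "media",
  "media": {
    # b64encode returns bytes, so decode to str before serializing to JSON
    "payload": base64.b64encode(convertedBytes).decode("ascii")
  }
}

# Serialize to JSON
json_object = json.dumps(message)

ws.send(json_object)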
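
One more detail that may matter with the code in the question: base64.b64encode returns a bytes object, and formatting bytes straight into a string keeps the b'...' wrapper around the payload. The sketch below shows the difference and wraps the message format described above in a small helper. The name build_media_message is made up for illustration and is not part of any Twilio library.

```python
import base64
import json


def build_media_message(stream_sid, mulaw_bytes):
    """Return the JSON text for one Media Streams "media" message.

    Hypothetical helper: only the message shape comes from the answer above.
    """
    # b64encode returns bytes; decode to str so the value can go into JSON.
    payload = base64.b64encode(mulaw_bytes).decode("ascii")
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": payload},
    })


# Formatting the bytes object directly keeps the b'...' wrapper,
# so the receiver would not get a clean base64 string.
raw = base64.b64encode(b"\xff\x7f")
broken = "{}".format(raw)
fixed = raw.decode("ascii")
```

With a helper like this, the send call becomes ws.send(build_media_message(data['streamSid'], convertedBytes)).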
1
On

Thanks for the answers everyone. The solution turned out to be a small change.

I had to change ap.lin2ulaw(speakBytes.audio_data, 1) to ap.lin2ulaw(speakBytes.audio_data, 4), and it worked fine. It seems to have been a compatibility issue between the Microsoft text-to-speech and Twilio audio formats.
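
For anyone who lands here later: the second argument of lin2ulaw is the sample width in bytes, so it needs to match the audio you actually have. A possible explanation for why 4 worked (a guess, not something confirmed in this thread) is that the synthesizer returned 16-bit audio at 16 kHz, and reading it as 4-byte samples halves the number of samples, which ends up near the 8 kHz that Twilio expects. Below is a rough sketch of a more explicit route, assuming audio_data is a complete WAV file held in memory. The helper names are hypothetical, and audioop is only in the standard library up to Python 3.12.

```python
import io
import wave


def read_wav(wav_bytes):
    """Parse an in-memory WAV file and return its PCM frames and parameters."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width": wav.getsampwidth(),  # bytes per sample, 2 for 16-bit
            "frame_rate": wav.getframerate(),
            "frames": wav.readframes(wav.getnframes()),  # PCM data, header removed
        }


def wav_to_mulaw_8k(wav_bytes):
    """Convert mono or stereo WAV bytes to 8000 Hz mono mu-law (hypothetical helper)."""
    import audioop  # in the stdlib up to Python 3.12, removed in 3.13

    info = read_wav(wav_bytes)
    frames, width = info["frames"], info["sample_width"]
    if info["channels"] == 2:
        # Mix stereo down to mono with equal weight on both channels.
        frames = audioop.tomono(frames, width, 0.5, 0.5)
    if info["frame_rate"] != 8000:
        # Resample to the 8000 Hz rate used by mulaw/8000.
        frames, _ = audioop.ratecv(frames, width, 1, info["frame_rate"], 8000, None)
    return audioop.lin2ulaw(frames, width)


def make_test_wav(frame_rate=16000, sample_width=2, n_frames=160):
    """Build a silent mono WAV in memory, only to exercise read_wav."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(frame_rate)
        wav.writeframes(b"\x00" * (sample_width * n_frames))
    return buf.getvalue()
```

Calling read_wav(speakBytes.audio_data) first is a quick way to see the sample width and rate you are really dealing with before picking the lin2ulaw argument.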