I am building an application that needs to make outbound calls to customers. I need these calls recorded and transcribed.
I have been getting very confused about how to achieve this, but this is what I have so far in a Next.js API endpoint I built to trigger a call:
import twilio from 'twilio'
import { NextResponse } from 'next/server'

const accountSid = process.env.TWILIO_ACCOUNT_SID
const authToken = process.env.TWILIO_AUTH_TOKEN
const twilioNumber = process.env.TWILIO_PHONE_NUMBER

interface OutBoundCall {
  to: string
}

export const POST = async (request: Request) => {
  const { to }: OutBoundCall = await request.json()
  if (!to) return NextResponse.json({ message: 'No to number provided' })
  try {
    const client = twilio(accountSid, authToken)
    client.calls
      .create({
        to: 'XXXXX', // <-- This is just my personal phone number for now whilst testing
        from: twilioNumber as string,
        twiml: `<Response><Dial><Number>${to}</Number></Dial></Response>`,
        record: true
      })
      .then(call => console.log(call.sid))
    return NextResponse.json({ message: `Call has been triggered to ${to}` })
  } catch (error) {
    console.log({ error })
    return NextResponse.json(error)
  }
}
This is currently calling my personal phone and then dialling the number I have passed in once it has connected to me.
I am finding that if I add <Record transcribe="true" /> to the TwiML, it never actually records or transcribes. Adding record: true to the client.calls.create method does work, but it does not accept a transcribe parameter.
How can I achieve this?
I have a feeling I am doing this wrong anyway, as in my call logs, it appears as 2 calls (one to me, and then one to the number I actually want to call), so I believe I'm being double charged.
If I am doing this wrong, how should I go about this?
What I actually want to happen in an ideal situation:
- A user clicks a "Call" button in the frontend
- A POST request is sent to either an endpoint of mine, or a Twilio one
- The number I want to speak to starts ringing through the user's machine, and only one call is triggered, directly from my frontend to the number being called
I did try using the Device component in my React frontend, but I could never get it working.
This is the code I was trying:
import { Device } from '@twilio/voice-sdk'

const makeCall = async () => {
  try {
    const device = new Device(
      'my-jwt-here' // (I can successfully make JWTs)
    )
    const call = await device.connect({
      params: {
        To: 'XXXX', // The phone number I want to call
      }
    })
    console.log({ call })
  } catch (error) {
    console.log({ error })
  }
}

export const MakeACallPage = () => {
  return <button onClick={makeCall}>Make a call here</button>
}
All the docs I can find about making outbound calls seem to cover only a simple version, where the call speaks pre-defined words to the user, and do not show how to make calls where 2 people can talk to one another.
What am I doing wrong?
Many thanks for any help
Unfortunately it seems Twilio doesn't support transcription of entire calls.
The <Record> verb can only record the caller. The verb doesn't support nesting, so you can't have any other verbs running at the same time. <Record> is basically only useful for recording voicemail.
If you really want to transcribe entire calls, it seems you'll have to use another API to transcribe the recordings. OpenAI is one option, among many.
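As a rough sketch of that approach (not something I have run against a live account), the following downloads a finished recording from Twilio and posts it to OpenAI's audio transcription endpoint. It assumes you already have the recording's SID, for example from the call you create with record: true, and that the whisper-1 model suits your audio. The names TranscribeOptions, recordingMediaUrl and transcribeRecording are made up for illustration, and the fetch implementation is injectable only so the function can be exercised without network access.

```typescript
// Hypothetical option bag; none of these names come from Twilio or OpenAI.
interface TranscribeOptions {
  accountSid: string
  authToken: string
  recordingSid: string
  openAiApiKey: string
}

// Minimal shape of fetch, so a stub can be passed in for testing.
type FetchLike = (url: string, init?: RequestInit) => Promise<Response>

// Twilio serves the audio at the recording's resource URL plus a file extension.
const recordingMediaUrl = (accountSid: string, recordingSid: string): string =>
  `https://api.twilio.com/2010-04-01/Accounts/${accountSid}/Recordings/${recordingSid}.mp3`

const transcribeRecording = async (
  options: TranscribeOptions,
  fetchImpl: FetchLike = fetch
): Promise<string> => {
  // 1. Download the recording, authenticating with the account SID and auth token.
  const credentials = btoa(`${options.accountSid}:${options.authToken}`)
  const audioResponse = await fetchImpl(
    recordingMediaUrl(options.accountSid, options.recordingSid),
    { headers: { Authorization: `Basic ${credentials}` } }
  )
  if (!audioResponse.ok) {
    throw new Error(`Recording download failed with status ${audioResponse.status}`)
  }
  const audio = await audioResponse.blob()

  // 2. Upload the audio as multipart form data to the transcription endpoint.
  const form = new FormData()
  form.append('file', audio, 'recording.mp3')
  form.append('model', 'whisper-1') // assumption: swap in whichever model you prefer
  const transcriptResponse = await fetchImpl(
    'https://api.openai.com/v1/audio/transcriptions',
    {
      method: 'POST',
      headers: { Authorization: `Bearer ${options.openAiApiKey}` },
      body: form
    }
  )
  if (!transcriptResponse.ok) {
    throw new Error(`Transcription failed with status ${transcriptResponse.status}`)
  }

  // 3. The default JSON response carries the transcript in a "text" field.
  const { text } = (await transcriptResponse.json()) as { text: string }
  return text
}
```

You would call transcribeRecording once the recording has finished processing, since the audio is not available while the call is still in progress.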
As for your issue with the Voice JavaScript SDK, I'd suggest creating a separate question for that. I don't have the experience to help you with that one, I'm afraid.