google-speech-api and overriding phone number recognition


Does anyone know of a way to control how phone numbers are recognized when using the Google Speech API? I am trying to implement a transcription scenario in which a caller says a string of letters and numbers, but out of the box the API seems to try to fit any sequence of digits to a phone number scheme, even if that means rendering letters as numbers they sound vaguely similar to (or not). I have tried using speech contexts to override the values within the "phone number" by spelling out the entire string as it should be transcribed ("eight seven seven two bee three seven", for example), but the digits are still interpreted as a phone number. Has anyone encountered this issue, or is anyone aware of a way to work around it?

Thanks!


There are 2 best solutions below

0
On

I'm not aware of an easy way to do this. With the Web Speech API in JavaScript, the following seems to yield fewer results that are forced into phone number format:

Set maxAlternatives to 2, e.g.,

var recognition = new (window.SpeechRecognition || window.webkitSpeechRecognition)();

recognition.maxAlternatives = 2;

Then use the second result offered, e.g.,

const speechToText = event.results[0][1].transcript;
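The second alternative is not guaranteed to exist, and it is not guaranteed to be the unformatted one, so it may be safer to inspect the alternatives rather than hard-code index 1. Here is a minimal sketch of that idea. The helper name `pickAlternative` and the `PHONE_FORMAT` pattern are my own assumptions about what a "phone-formatted" transcript looks like; they are not part of the Web Speech API.

```javascript
// Assumption: a phone-formatted transcript contains digits joined by a hyphen
// or a digit group wrapped in parentheses, e.g. "877-2837" or "(877) 283-7000".
const PHONE_FORMAT = /\(\d+\)|\d-\d/;

// Hypothetical helper. `alternatives` is an array or array-like list of
// objects with a `transcript` property, such as event.results[0].
function pickAlternative(alternatives) {
  const list = Array.from(alternatives);
  if (list.length === 0) return "";
  // Prefer the first alternative that does not look phone-formatted.
  const unformatted = list.find((alt) => !PHONE_FORMAT.test(alt.transcript));
  // Fall back to the top-ranked alternative if all of them look formatted.
  return (unformatted ?? list[0]).transcript;
}
```

In a result handler this could be called as `pickAlternative(event.results[0])`.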

You can get pretty far by post-processing the result. A remaining challenge is that, because the result often clumps digits together, you lose the distinction between a series of single-digit numbers and one multi-digit number (e.g., '15' versus '1', '5'). The utility of this approach depends on the specifics of the numbers your app is trying to capture.
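As a rough illustration of that kind of processing, the sketch below strips phone-style punctuation and converts spelled-out digits, leaving a plain alphanumeric code. The function name `normalizeTranscript` and the word table are hypothetical, and the approach assumes the expected value is a run of letters and digits with no meaningful spaces. It does not recover the '15' versus '1', '5' distinction.

```javascript
// Hypothetical lookup table for spelled-out digits (illustrative only).
const DIGIT_WORDS = new Map([
  ["zero", "0"], ["one", "1"], ["two", "2"], ["three", "3"], ["four", "4"],
  ["five", "5"], ["six", "6"], ["seven", "7"], ["eight", "8"], ["nine", "9"],
]);

// Hypothetical helper: reduce a transcript to an uppercase alphanumeric code.
function normalizeTranscript(transcript) {
  return transcript
    .toLowerCase()
    .split(/\s+/)
    // Replace a spelled-out digit with the digit itself; keep other tokens.
    .map((token) => DIGIT_WORDS.get(token) ?? token)
    .join("")
    // Drop phone formatting such as "(", ")", "-" and "+".
    .replace(/[^a-z0-9]/g, "")
    .toUpperCase();
}
```

For example, both `normalizeTranscript("(877) 2b3-7")` and `normalizeTranscript("eight seven seven two B three seven")` reduce to the same code, `8772B37`.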

0
On

In at least one case, setting the language to en-PH (English Philippines) seems to have fixed, or at least notably improved, this problem. Other English language options might work as well.

en-GB, by contrast, comes back as a UK-formatted number, with one digit first and then the rest of the number.
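If you are using the Web Speech API, as in the other answer, the language is set through the recognizer's `lang` property. Below is a small sketch that applies both suggestions from this thread to a recognizer object. The helper name `configureRecognition` and its defaults are hypothetical; only `lang` and `maxAlternatives` are actual Web Speech API properties. With the Cloud Speech-to-Text API, the language goes in the request config as `languageCode` instead.

```javascript
// Hypothetical helper: applies the language and alternatives settings
// discussed above to a SpeechRecognition-like object and returns it.
function configureRecognition(recognition, lang = "en-PH", maxAlternatives = 2) {
  recognition.lang = lang;                       // BCP 47 language tag
  recognition.maxAlternatives = maxAlternatives; // number of alternatives requested
  return recognition;
}
```

Whether en-PH or another English variant helps is something to verify against your own audio, since the behavior described here was only observed in at least one case.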