Does anyone know if there is a way to manipulate the recognition of phone numbers when using the Google Speech API? I am trying to implement a transcription scenario where a caller will say a string of letters and numbers, but the logic out of the box seems to be to try to fit any sequence of numbers to a phone number scheme, even if it means rendering letters into numbers they may sound vaguely similar to (or not). I have tried using speech contexts to manipulate the values within the "phone number" by typing out and giving the entire thing as it should be as a speech context ("eight seven seven two bee three seven", for example), but it refuses to override the digits being interpreted as a phone number. Has anyone encountered this issue or is aware of any way in which this could be worked around?
Thanks!
I'm not aware of an easy way to do this. For the Web Speech API for JavaScript, doing the following seems to yield fewer results that are forced into phone number format.:
Set the
maxAlternatives = 2
, e.g.,Then use the second result offered, e.g.,
You can get pretty far by processing the result. A remaining challenge is that since the result often clumps digits together, you lose the distinction between a series of single digit numbers and one multi-digit number (e.g., '15' & '1', '5'). The utility of this approach depends on the specifics of the numbers your app is trying to capture.