IBM Speech to Text Alphanumeric String recognition?

In trying to get Speech to Text (IBM Voice Gateway IVR app) to recognize alpha-numeric character strings, I am wondering if I could create a custom grammar or entity that would restrict STT to recognizing just individual letters and numbers, excluding words altogether. For example, here's a typical string: 20Y0H8C. Watson comes back with words and numbers, like "two" instead of "2". Digit strings work fine. I realize that letter recognition is problematic with typical ASR, but I'm hoping Watson is up to the task. I noticed there are no system entities for alphanumeric characters. Any suggestions are much appreciated.

There is 1 best solution below

In this case, set smart_formatting to true.

The smart_formatting parameter converts dates, times, series of digits and numbers, phone numbers, currency values, and Internet addresses into more conventional representations in the final transcript of a recognition request. The conversion makes the transcript more readable and enables better post-processing of the transcription results. You set the parameter to true to enable smart formatting, as in the following example; by default, the parameter is false and smart formatting is not performed.

Example request:

curl -X POST -u {username}:{password} \
  --header "Content-Type: audio/flac" \
  --data-binary @{path}audio-file.flac \
  "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?smart_formatting=true"

Example:

Voice: The quantity is one million one hundred and one

Result: The quantity is 1000101

See the official IBM documentation for details.

Note: The smart formatting feature is currently beta functionality that is available for US English only.
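If smart formatting still leaves some characters spelled out, or is not available for your language, a small post-processing step on the transcript may help. The following is only a rough sketch, assuming the transcript arrives as space-separated tokens; normalize_alnum is a hypothetical helper written for illustration, not part of any IBM tooling:

```shell
# Hypothetical helper (not an IBM tool): collapse a transcript of spoken
# letters and digits into one compact string.
normalize_alnum() {
  result=""
  # $1 is left unquoted on purpose so the transcript splits on whitespace.
  for token in $1; do
    # Lowercase the token so "Two" and "two" match the same pattern.
    lower=$(printf '%s' "$token" | tr '[:upper:]' '[:lower:]')
    case "$lower" in
      zero)  char=0 ;;
      one)   char=1 ;;
      two)   char=2 ;;
      three) char=3 ;;
      four)  char=4 ;;
      five)  char=5 ;;
      six)   char=6 ;;
      seven) char=7 ;;
      eight) char=8 ;;
      nine)  char=9 ;;
      # Anything else (single letters, already formatted digits) is kept, uppercased.
      *)     char=$(printf '%s' "$token" | tr '[:lower:]' '[:upper:]') ;;
    esac
    result="$result$char"
  done
  printf '%s\n' "$result"
}

normalize_alnum "two zero Y zero H eight C"
normalize_alnum "20 y 0 h 8 c"
```

Both calls print 20Y0H8C. The sketch does not handle homophones such as "to" or "for" in place of "two" or "four", or letters that come back as whole words, so treat it as a starting point only.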