I've been using the Web Speech API as documented on MDN (things like window.speechSynthesis and new SpeechSynthesisUtterance()). At first, I thought this API was the same as Google's Speech API, but I guess it is not?
Secondly, it's been working great, but there are some challenges I've been facing:
- only available in Chrome and Safari
- voices do seem robotic sometimes
- transcribed texts are awkward with numerals
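
For context, here is a minimal sketch of the kind of browser-side check and voice selection involved in the points above. The helper names (detectSpeechSupport, pickVoice, speak) are hypothetical and made up for illustration; the browser properties they read (speechSynthesis, SpeechSynthesisUtterance, SpeechRecognition / webkitSpeechRecognition, and a voice's lang) are part of the Web Speech API. It assumes voices report BCP 47 tags such as "es-ES".

```javascript
// Hypothetical helper: report which parts of the Web Speech API a global
// object exposes. `scope` is normally `window`; passing it in as a parameter
// keeps the function usable (and testable) outside a browser.
function detectSpeechSupport(scope) {
  // Some browsers only expose recognition under the webkit prefix.
  const Recognition = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  return {
    synthesis:
      'speechSynthesis' in scope &&
      typeof scope.SpeechSynthesisUtterance === 'function',
    recognition: typeof Recognition === 'function',
  };
}

// Hypothetical helper: pick a voice for a language tag such as "es-ES".
// Prefers an exact match, then any voice sharing the primary language
// subtag ("es"), and returns null when nothing fits.
function pickVoice(voices, lang) {
  const wanted = lang.toLowerCase();
  const prefix = wanted.split('-')[0];
  const exact = voices.find((v) => v.lang.toLowerCase() === wanted);
  if (exact) return exact;
  return voices.find((v) => v.lang.toLowerCase().startsWith(prefix)) || null;
}

// Hypothetical browser-only wrapper. Note that getVoices() can return an
// empty list until the browser has fired its "voiceschanged" event, in which
// case the browser's default voice is used.
function speak(text, lang) {
  if (typeof window === 'undefined' || !detectSpeechSupport(window).synthesis) {
    return false; // no speech synthesis available in this environment
  }
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.lang = lang;
  const voice = pickVoice(window.speechSynthesis.getVoices(), lang);
  if (voice) utterance.voice = voice;
  window.speechSynthesis.speak(utterance);
  return true;
}
```

In a browser this would be called as speak('me llamo Dan', 'es-ES'). Which voices are installed, and how natural they sound, depends on the browser and operating system, so the sketch only selects among what is already there.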
I'm curious whether Google's Speech API solves these issues. What has your experience been with Google's Speech API?
Does the API work across any device? I want to build a game in Unity with Speech. Can I use it within Unity so long as I'm querying the API?
Is it better at recognizing contexts?
- spoken: "si yo hablo espanol" -> does not capture "si"
- spoken: "me llamo Dan" -> does not capture the proper noun "Dan"
- spoken
Does it have less robotic voices?
Thank you!
Note: I am familiar with the use cases for both APIs; I'm looking more for how each one performs in the areas listed above.