I am currently exploring the capabilities of the Google Text-to-Speech API for a project I'm working on. I've successfully used it to generate audio from text inputs, but I'm wondering if the API has the ability to generate subtitles or captions along with the audio output.
Specifically, I would like to know if there is a feature or method within the Google Text-to-Speech API that allows for the generation of subtitles in formats such as SRT (SubRip Subtitle) or VTT (WebVTT) files, which could then be synchronized with the audio output.
If this functionality is available, I would appreciate any guidance or resources on how to implement it effectively. Additionally, if there are any limitations or considerations to be aware of when using the API for generating subtitles, please share those as well.
Thank you in advance for any insights or assistance you can provide!
I saw that AWS Polly has such a feature, but I couldn't find out how Google's TTS exports subtitles.
Can the official Google Text-to-Speech API generate a subtitle/caption file along with the audio file? If so, how do I use it?
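To make the target concrete, here is a minimal sketch of the SRT output I'm after. It does not call the Google API at all: it assumes I can somehow obtain a start and end time in seconds for each sentence, which is exactly the part I don't know how to get. The `Cue` class and the `srt_timestamp` and `to_srt` helpers are hypothetical names of my own, not part of any Google library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    """One subtitle entry. The timings are assumed to come from the TTS side."""
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues: List[Cue]) -> str:
    """Render cues as SRT: a 1-based index, a time range, the text, a blank line."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        time_range = f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}"
        blocks.append(f"{index}\n{time_range}\n{cue.text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up timings, purely for illustration.
    print(to_srt([Cue(0.0, 1.25, "Hello"), Cue(1.25, 2.0, "world")]))
```

If the API can return per-sentence or per-word timings in any form, I could feed them into something like `to_srt` myself. WebVTT would need a `WEBVTT` header line and a dot instead of a comma before the milliseconds.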