Google Speech API faster with higher sample rate

Asked by elcombato on 16 March 2017 at 15:13 (616 views)

I'm using the Google Cloud Speech API Python library to extract text from a video file. In a prior step, the video file is converted to a FLAC audio file.

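import subprocess
from google.cloud import speech

sample_rate = 48000
client = speech.Client()
# Drop the video stream, downmix to mono, and resample to sample_rate.
# Passing the arguments as a list avoids needing shell=True.
cmd = ["ffmpeg", "-i", mpg_file, "-vn", "-ac", "1", "-ar", str(sample_rate), flac_file]
subprocess.run(cmd, check=True)
with open(flac_file, 'rb') as f:
    audio = client.sample(f.read(), sample_rate=sample_rate, encoding='FLAC')
results = audio.sync_recognize()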

To reduce the time taken by sync_recognize(), I set sample_rate = 16000. My reasoning was that both the upload to the web API and the processing of the audio should be faster, because the file is smaller and there is less data to process.
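As a rough check of the size argument, the uncompressed data volume can be estimated from the sample rate. This is only a sketch: `raw_pcm_bytes` is a name made up for illustration, and the 16-bit sample width is an assumption. The actual FLAC files are compressed, so their sizes will differ from these figures.

```python
def raw_pcm_bytes(sample_rate, seconds, channels=1, bytes_per_sample=2):
    """Estimate the uncompressed PCM size in bytes.

    bytes_per_sample=2 assumes 16-bit audio; FLAC output will be smaller.
    """
    return sample_rate * seconds * channels * bytes_per_sample


# One minute of mono audio at each rate (assumed 16-bit samples).
size_48k = raw_pcm_bytes(48000, 60)
size_16k = raw_pcm_bytes(16000, 60)
print(size_48k, size_16k, size_48k / size_16k)
```

Under these assumptions the 48 kHz audio carries three times as much raw data as the 16 kHz audio, which is why I expected the 16 kHz calls to be faster.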

Repeated runtime measurements of this process, using the same list of files at sample rates of 16 kHz and 48 kHz, yield:

16 kHz: 26.16 s per call
48 kHz: 17.68 s per call
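For reference, a per-call average like the ones above can be obtained with a small timing helper. This is a minimal sketch, not the exact measurement code: `mean_seconds_per_call` and its parameters are hypothetical names, and `func` stands for whatever wrapper performs the step being timed for one file.

```python
import time
from statistics import mean


def mean_seconds_per_call(func, inputs, repeats=3):
    """Return the average wall-clock time of func(item) over all inputs and repeats."""
    durations = []
    for _ in range(repeats):
        for item in inputs:
            start = time.perf_counter()
            func(item)
            # Record the elapsed wall-clock time for this single call.
            durations.append(time.perf_counter() - start)
    return mean(durations)
```

Timing the ffmpeg conversion and the recognition call separately with a helper like this would show which of the two steps accounts for the difference.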

I expected the opposite result. Is my thinking wrong? Do you have an explanation for this?
