Python speech to text from microphone stream

117 Views Asked by IOSCodingMax At 28 July 2025 at 00:53

I want to program my own speech assistant in Python and later run it on a Raspberry Pi. My first step is to transcribe speech from a microphone stream. The speech my microphone picks up should be converted to text immediately, so that I can then check this text for signal words such as "Hey Siri".

I have already tried most of the STT APIs, such as SpeechRecognition, Whisper and Google Cloud Speech-to-Text. With all of them I had the same problem: they did not transcribe while the stream was still running. For example, SpeechRecognition waited until I stopped speaking. Only then was the recorded audio sent to the servers and transcribed, which took a very long time.

Any ideas?


There is 1 best solution below

Kathy Reid On 06 December 2023 at 01:24

The specific problem you are trying to solve here is the real-time transcription of streaming audio. The SpeechRecognition library for Python is capable of doing this, but requires some additional manipulation. See this question for more information.
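One possible reading of "additional manipulation" is the library's background listener, sketched below. This is an assumption about what the answer has in mind, not a confirmed recipe. `listen_in_background` captures audio on a separate thread and hands each phrase to a callback, and `phrase_time_limit` caps how long a phrase can grow before it is sent off, so text arrives in short chunks rather than only after a long pause. The helper names `contains_wake_word` and `listen_for_wake_word`, the 3-second limit, and the choice of `recognize_google` are illustrative assumptions; `sr.Microphone` also needs PyAudio installed.

```python
import time


def contains_wake_word(text, wake_words=("hey siri",)):
    """Return True if any wake word occurs in the text (case-insensitive)."""
    # Lowercase and collapse repeated whitespace before matching.
    normalized = " ".join(text.lower().split())
    return any(word in normalized for word in wake_words)


def listen_for_wake_word(phrase_time_limit=3):
    """Transcribe microphone audio in short chunks until Ctrl+C is pressed."""
    # Imported here so contains_wake_word works without the library installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    microphone = sr.Microphone()  # assumption: PyAudio is installed

    # Calibrate the energy threshold once against background noise.
    with microphone as source:
        recognizer.adjust_for_ambient_noise(source, duration=1)

    def on_audio(rec, audio):
        # Runs on the background thread for every captured phrase.
        try:
            text = rec.recognize_google(audio)
        except sr.UnknownValueError:
            return  # nothing intelligible in this chunk
        except sr.RequestError as err:
            print(f"Recognition request failed: {err}")
            return
        print(f"Heard: {text}")
        if contains_wake_word(text):
            print("Wake word detected")

    # Returns a function that stops the background thread when called.
    stop_listening = recognizer.listen_in_background(
        microphone, on_audio, phrase_time_limit=phrase_time_limit
    )
    try:
        while True:
            time.sleep(0.1)  # main thread stays free for other work
    except KeyboardInterrupt:
        stop_listening(wait_for_stop=False)
```

Calling `listen_for_wake_word()` starts the loop. Each chunk still makes a round trip to a remote recognizer, so latency depends on the network; an offline engine may be a better fit on a Raspberry Pi, but that is outside what the answer covers.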
