How to get word-level timing details from Vosk ASR

333 Views Asked by miladjurablu At 28 June 2025 at 07:43

I am working with Vosk and I need the start and end time of each word in my file.mp3. This is my code:

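import json

from pydub import AudioSegment
from vosk import KaldiRecognizer, Model

FRAME_RATE = 16000  # assumed value; the original post does not show it
CHANNELS = 1        # Vosk expects mono audio

def voice_recognition(filename):
    model = Model(model_name="vosk-model-fa-0.5")
    rec = KaldiRecognizer(model, FRAME_RATE)
    rec.SetWords(True)  # include per-word timing in each result

    # Decode the MP3 and convert it to the channel count and rate set above.
    mp3 = AudioSegment.from_mp3(filename)
    mp3 = mp3.set_channels(CHANNELS)
    mp3 = mp3.set_frame_rate(FRAME_RATE)

    step = 45000  # chunk length in milliseconds
    texts = []
    for i in range(0, len(mp3), step):
        segment = mp3[i:i+step]
        rec.AcceptWaveform(segment.raw_data)
        result = rec.Result()
        text = json.loads(result)["text"]
        if text:
            texts.append(text)
    # Join with spaces so words from adjacent chunks do not run together.
    return " ".join(texts)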

I need something like this:

time               word
-----------------------
(0.0.01, 0.0.2)    hi
(0.0.03, 0.0.4)    how
(0.0.04, 0.0.5)    are
(0.0.05, 0.0.6)    you

Is there any way to get the data like this?


There is 1 best solution below

miladjurablu On 16 November 2022 at 07:09

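I just found that everything I need is already there. Once you call rec.SetWords(True), the JSON string returned by result = rec.Result() contains, besides "text", a "result" list with one entry per word, each holding the word and its start and end times in seconds (plus a confidence value).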
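To make that concrete, here is a minimal sketch of turning one such result string into the time/word table from the question. The helper names word_timings and format_table are made up for illustration and are not part of Vosk, and the sample JSON is hand-written in the shape Vosk returns, with invented values.

```python
import json


def word_timings(result_json):
    """Return (start, end, word) tuples from one Vosk result string."""
    data = json.loads(result_json)
    # "result" is absent when nothing was recognised, so default to [].
    return [(w["start"], w["end"], w["word"]) for w in data.get("result", [])]


def format_table(timings):
    """Render the tuples as a time/word table like the one in the question."""
    lines = ["time               word", "-" * 23]
    for start, end, word in timings:
        lines.append(f"({start:.2f}, {end:.2f})    {word}")
    return "\n".join(lines)


# Hand-written sample in the shape Vosk returns; the values are invented.
sample = json.dumps({
    "result": [
        {"conf": 1.0, "start": 0.12, "end": 0.45, "word": "hi"},
        {"conf": 0.98, "start": 0.51, "end": 0.80, "word": "how"},
    ],
    "text": "hi how",
})
print(format_table(word_timings(sample)))
```

In the loop from the question, you would call word_timings(rec.Result()) for each chunk and collect the tuples in one list instead of keeping only "text". It is probably also worth passing rec.FinalResult() through the same helper after the loop, in case any audio is still buffered.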
