I have a Python program that uses the OpenAI Whisper module to transcribe audio to text. Unfortunately, despite my passing a fully resolved path to that module, it crashes with an error saying it can't find the file. I know the file exists in the directory because, as you can see from my code and output below, the script itself can find it (see where I print the input file's timestamp). I am running on a Windows 10 PC.
Why can't the imported module find the input file and how can I fix this problem? I read several posts on SO regarding paths and the subprocess module, but none of the solutions worked for me.
Here is the code:
import whisper
import pandas as pd
import os
import sys
from datetime import datetime
# Show the current working directory
cwd = os.getcwd()
print ("Current working directory: {0}\n".format(cwd))
# Transcribe a previously downloaded audio file.
# audio_file = "./audio.mp4"
# with open(os.path.join(sys.path[0], "audio.mp4"), "r") as f:
audio_file = os.path.join(cwd, "audio.mp4")
print ("Using audio input file: {0}\n".format(audio_file))
# Get the timestamp for the file
timestamp = os.path.getmtime(audio_file)
# Convert the timestamp to a datetime object
dt = datetime.fromtimestamp(timestamp)
# Format the datetime object in the desired format
formatted_timestamp = dt.strftime("%m/%d/%Y")
# Print the formatted timestamp
print("Input file timestamp: {0}\n\n".format(formatted_timestamp))
# Load the OpenAI Whisper model
whisper_model = whisper.load_model("tiny")
# Transcribe the audio.
transcription = whisper_model.transcribe(audio_file)
# Display the transcription. This will display
# the transcription result in segments with
# start and end time. The full concatenated
# string is available as transcription['text']
# print as DataFrame
df = pd.DataFrame(transcription['segments'], columns=['start', 'end', 'text'])
print(df)
# or, print as String
print(transcription['text'])
Here is the program output:
C:\Users\main\Documents\GitHub\ME\open-ai\whisper\python-utilities>python transcribe-audio.py
Current working directory: C:\Users\main\Documents\GitHub\ME\open-ai\whisper\python-utilities
Using audio input file: C:\Users\main\Documents\GitHub\ME\open-ai\whisper\python-utilities\audio.mp4
Input file timestamp: 12/27/2022
C:\Python310\lib\site-packages\whisper\transcribe.py:78: UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Traceback (most recent call last):
File "C:\Users\main\Documents\GitHub\ME\open-ai\whisper\python-utilities\transcribe-audio.py", line 36, in <module>
transcription = whisper_model.transcribe(audio_file)
File "C:\Python310\lib\site-packages\whisper\transcribe.py", line 84, in transcribe
mel = log_mel_spectrogram(audio)
File "C:\Python310\lib\site-packages\whisper\audio.py", line 111, in log_mel_spectrogram
audio = load_audio(audio)
File "C:\Python310\lib\site-packages\whisper\audio.py", line 42, in load_audio
ffmpeg.input(file, threads=0)
File "C:\Python310\lib\site-packages\ffmpeg\_run.py", line 313, in run
process = run_async(
File "C:\Python310\lib\site-packages\ffmpeg\_run.py", line 284, in run_async
return subprocess.Popen(
File "C:\Python310\lib\subprocess.py", line 966, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Python310\lib\subprocess.py", line 1435, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
I found the problem.
ffmpeg was not in the script directory and was not on the system PATH. Unfortunately, the error message does not indicate that this is the real problem. It leads one to believe that the input audio file cannot be found, when in fact it is ffmpeg.exe that cannot be found. I copied ffmpeg.exe into the script directory and it worked fine.
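Since the traceback shows Whisper launching ffmpeg through subprocess.Popen, one way to get a clearer message is to check for the executable before calling transcribe. The following is only a minimal sketch using shutil.which from the standard library. The helper name require_executable is made up for illustration and is not part of Whisper or ffmpeg-python.

```python
import shutil


def require_executable(name: str = "ffmpeg") -> str:
    """Return the full path of `name`, or raise a descriptive error.

    Hypothetical helper for illustration; not part of any library.
    """
    # shutil.which returns None when the program cannot be located.
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"'{name}' was not found. Install it or add its folder "
            "to the PATH environment variable."
        )
    return path


if __name__ == "__main__":
    try:
        print("Using ffmpeg at:", require_executable("ffmpeg"))
    except FileNotFoundError as exc:
        print("Setup problem:", exc)
```

Calling require_executable("ffmpeg") at the top of the script, before whisper.load_model, should make a missing ffmpeg fail with a message that names the program rather than a bare WinError 2. Adding the ffmpeg folder to PATH is probably a more durable fix than copying ffmpeg.exe next to each script.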