I am developing a Python web app that uses Azure Cognitive Services speech translation. I used PyWebIO to create an interface and start a server to run my code. I use the 'use_default_microphone' parameter in the audio config so I can speak into the microphone; the app first creates a transcript and then translates it into the target language (with the start_continuous_recognition method). The app works fine on my local PC. I successfully deployed it to the PyWebIO servers and got a permanent web link.

Everything went well until I clicked the 'Start Recognition' button. The problem seems to be with my audio input, or at least that is my guess.

The interface loads without any problem, but when I initiate recognition it gives me the SPXERR_AUDIO_SYS_LIBRARY_NOT_FOUND error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/pywebio/session/threadbased.py", line 254, in run
    callback(event['data'])
  File "/usr/local/lib/python3.8/site-packages/pywebio/output.py", line 801, in click_callback
    return onclick[btn_idx]()
  File "/mnt/app/app.py", line 78, in speech_recognize_continuous_from_file
    speech_recognizer = speechsdk.translation.TranslationRecognizer(
  File "/usr/local/lib/python3.8/site-packages/azure/cognitiveservices/speech/translation.py", line 234, in __init__
    self._impl = self._get_impl(impl.TranslationRecognizer, translation_config, auto_detect_source_language_config, audio_config)
  File "/usr/local/lib/python3.8/site-packages/azure/cognitiveservices/speech/translation.py", line 340, in _get_impl
    return config_type._from_config(translation_config._impl,  audio_config._impl)
RuntimeError: Exception with an error code: 0x38 (SPXERR_AUDIO_SYS_LIBRARY_NOT_FOUND)
[CALL STACK BEGIN]
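From what I can gather, when `use_default_microphone=True` is set, the Speech SDK tries to load the host's system audio library (ALSA's libasound on Linux, which the `/usr/local/...` paths in the traceback suggest). A quick diagnostic sketch I put together, assuming a Linux deployment host (`has_system_audio_library` is just my own helper name):

```python
import ctypes.util

def has_system_audio_library() -> bool:
    """Check whether ALSA's libasound is resolvable on this host.

    If this returns False, it would explain the
    SPXERR_AUDIO_SYS_LIBRARY_NOT_FOUND error when the Speech SDK
    tries to open a default microphone on the server.
    """
    return ctypes.util.find_library("asound") is not None

print("libasound present:", has_system_audio_library())
```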

Here's the recognition part of the code:

import time

import azure.cognitiveservices.speech as speechsdk
from pywebio.output import put_button, put_text, toast


def speech_recognize_continuous_from_file():
    """Performs continuous speech translation with input from the default microphone."""

    translation_config = speechsdk.translation.SpeechTranslationConfig(
        subscription=speech_key, region=service_region,
        speech_recognition_language='en-US',
        target_languages=('tr',))
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)

    speech_recognizer = speechsdk.translation.TranslationRecognizer(
        translation_config=translation_config, audio_config=audio_config)

    done = False

    def stop_cb(evt):
        """callback that stops continuous recognition upon receiving an event `evt`"""
        print('CLOSING on {}'.format(evt))
        speech_recognizer.stop_continuous_recognition()
        nonlocal done
        done = True

    transcriptresults = []
    translationresults = []
    def handle_final_result(evt):
        transcriptresults.append(evt.result.text)
        translationresults.append(evt.result.translations['tr'])

    speech_recognizer.recognized.connect(handle_final_result)
    # Connect callbacks to the events fired by the speech recognizer
    speech_recognizer.recognizing.connect(lambda evt: print('RECOGNIZING: {}'.format(evt)))
    speech_recognizer.recognized.connect(lambda evt: print('RECOGNIZED: {}'.format(evt)))
    speech_recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
    speech_recognizer.session_stopped.connect(lambda evt: print('SESSION STOPPED {}'.format(evt)))
    speech_recognizer.canceled.connect(lambda evt: print('CANCELED {}'.format(evt)))
    # stop continuous recognition on either session stopped or canceled events
    speech_recognizer.session_stopped.connect(stop_cb)
    speech_recognizer.canceled.connect(stop_cb)
    
    put_button('Stop Recognition', onclick=speech_recognizer.stop_continuous_recognition, scope='buttonpart', color="danger", outline=False)
    put_button('Entities', onclick=sample_recognize_entities, scope='buttonpart', color="danger", outline=True, small=True)
            
    # Start continuous speech recognition
    speech_recognizer.start_continuous_recognition()
    toast("Recognition Started!", position='right', color='#4eccd9', duration=2)
    while not done:
        time.sleep(1)

    put_text(*transcriptresults, sep=" ", scope="transcriptpage")
    put_text(*translationresults, sep=" ", scope="translationpage")

I do not know what to do. I thought I could use PyAudio to record a file in the cloud and then point the speech recognition audio config at that file path, but I could not find sample code. There should be a way to let the user of the app use the default microphone of their own device rather than the one on the machine where the code is executed.
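The closest I have gotten is the file-based route: the SDK's `AudioConfig` accepts a `filename=` argument instead of `use_default_microphone=True`, so no audio system library is needed on the server. A minimal sketch of what I have in mind, assuming the audio arrives as WAV bytes (for example from PyWebIO's `file_upload`); `save_wav_bytes` and `recognize_file` are my own hypothetical helper names:

```python
import os
import tempfile

def save_wav_bytes(data: bytes) -> str:
    """Write WAV bytes (e.g. from pywebio.input.file_upload) to a temp
    file and return its path, so the Speech SDK can read it by filename."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path

def recognize_file(path: str, speech_key: str, service_region: str):
    """One-shot translation from a WAV file instead of a microphone."""
    import azure.cognitiveservices.speech as speechsdk  # imported lazily
    translation_config = speechsdk.translation.SpeechTranslationConfig(
        subscription=speech_key, region=service_region,
        speech_recognition_language='en-US',
        target_languages=('tr',))
    # filename= replaces use_default_microphone=True, so the server
    # needs no audio system library at all
    audio_config = speechsdk.audio.AudioConfig(filename=path)
    recognizer = speechsdk.translation.TranslationRecognizer(
        translation_config=translation_config, audio_config=audio_config)
    result = recognizer.recognize_once()
    return result.text, result.translations.get('tr')
```

But this gives up continuous recognition, so it is a fallback rather than a real fix for the live-microphone flow.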

There is 1 answer below.

The issue occurs because some of the Speech SDK's native libraries are missing from the program's execution path. This is a general problem that occurs often.

Microsoft.CognitiveServices.Speech.core.dll
Microsoft.CognitiveServices.Speech.extension.audio.sys.dll
Microsoft.CognitiveServices.Speech.extension.codec.dll
Microsoft.CognitiveServices.Speech.extension.kws.dll
Microsoft.CognitiveServices.Speech.extension.lu.dll

Copy the above libraries from the Python package's installation folder into the folder the program executes from.
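To see which native libraries the installed wheel actually ships (so you know what to copy and from where), something like this sketch can help; `list_native_libs` is a hypothetical helper, not part of the SDK:

```python
import importlib.util
import os

def list_native_libs(package="azure.cognitiveservices.speech"):
    """Return paths of native libraries (.so/.dll/.dylib) bundled with the
    given package, or an empty list if the package is not installed."""
    try:
        spec = importlib.util.find_spec(package)
    except ModuleNotFoundError:
        return []
    if spec is None or not spec.submodule_search_locations:
        return []
    pkg_dir = list(spec.submodule_search_locations)[0]
    return sorted(
        os.path.join(pkg_dir, name)
        for name in os.listdir(pkg_dir)
        if name.endswith((".so", ".dll", ".dylib"))
    )

for lib in list_native_libs():
    print(lib)
```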