How can we pass an audio file from an Azure Storage container to the Azure Speech API using Python?

Below is the code,

call_name1="test.wav"
blob_client1=blob_service_client.get_blob_client("bucket/audio",call_name1)
print(blob_client1)

streamdownloader=blob_client1.download_blob()
stream = BytesIO()
streamfinal=streamdownloader.download_to_stream(stream)
print(streamfinal)

speech_key, service_region = "12345", "eastus"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

audio_input = speechsdk.audio.AudioConfig(filename=streamfinal)

Error,

TypeError                                 Traceback (most recent call last)
<ipython-input-6-a402ae91606a> in <module>
     44 speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
     45
---> 46 audio_input = speechsdk.audio.AudioConfig(filename=streamfinal)

C:\ProgramData\Anaconda3\lib\site-packages\azure\cognitiveservices\speech\audio.py in __init__(self, use_default_microphone, filename, stream, device_name)
    213
    214         if filename is not None:
--> 215             self._impl = impl.AudioConfig._from_wav_file_input(filename)
    216             return
    217         if stream is not None:

TypeError: in method 'AudioConfig__from_wav_file_input', argument 1 of type 'std::string const &'

Please help us read audio files from a storage container as input to the Azure Speech API. Thank you!

There is 1 best solution below

As ewong said in the comments, the filename parameter of AudioConfig expects a string (the path to a WAV file), not a stream object. That is why the call fails with a TypeError mentioning 'std::string const &'.

download_to_stream downloads the contents of the blob into a stream, but that stream is not an azure.cognitiveservices.speech.audio.AudioInputStream, which is what AudioConfig needs for its stream parameter.

I could not find a way to convert that stream into an AudioInputStream. So the only approach seems to be to download the audio file from Blob Storage to the local disk and then pass its path to AudioConfig.

from azure.storage.blob import BlobServiceClient
import azure.cognitiveservices.speech as speechsdk

# AudioConfig(filename=...) reads a WAV file, so use the name of the audio blob
filename = "test.wav"
container_name = "test-container"

# Replace the placeholder with your own storage account connection string
blob_service_client = BlobServiceClient.from_connection_string("<your-storage-connection-string>")
container_client = blob_service_client.get_container_client(container_name)
blob_client = container_client.get_blob_client(filename)

# Download the blob into a local file with the same name
with open(filename, "wb") as f:
    data = blob_client.download_blob()
    data.readinto(f)

# Pass the local path (a string), not a stream object
audio_input = speechsdk.audio.AudioConfig(filename=filename)
print(audio_input)
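
If you do not want the downloaded file left in the working directory, the same idea can be wrapped in a small helper that writes to a temporary file. This is only a sketch under some assumptions: save_blob_to_temp_wav is a hypothetical name, not part of either SDK, and it relies only on the downloader having a readinto(file) method, as the object returned by download_blob() does above. It also opens the result with Python's wave module, so a blob that is not a WAV file fails early rather than inside the Speech SDK.

```python
import os
import tempfile
import wave


def save_blob_to_temp_wav(downloader):
    """Write a blob download to a temporary .wav file and return its path.

    `downloader` is any object with a readinto(file) method, such as the
    object returned by blob_client.download_blob(). (Hypothetical helper.)
    """
    # delete=False keeps the file on disk after the handle is closed,
    # so it can be reopened by path afterwards.
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        downloader.readinto(tmp)
        path = tmp.name
    try:
        # wave.open raises wave.Error (or EOFError for very short input)
        # when the bytes are not a valid WAV file.
        with wave.open(path, "rb") as wav:
            wav.getnframes()
    except (wave.Error, EOFError):
        os.remove(path)  # do not leave an unusable file behind
        raise
    return path


# Assumed usage with the clients from the answer above:
#   path = save_blob_to_temp_wav(blob_client.download_blob())
#   audio_input = speechsdk.audio.AudioConfig(filename=path)
#   ... run recognition, then clean up with os.remove(path)
```

The caller is responsible for deleting the temporary file once recognition has finished.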