Is there a more detailed way to debug SFSpeechRecognizer?


Updated info below, and new code

I am trying to incorporate SFSpeechRecognizer into my app, but the errors/results I get back for three pre-recorded audio files aren't enough for me to figure out what's going wrong, and information via Google is sparse.

The code where I loop through the three files is at the bottom. Here are the responses I get for my three audio files. I've made sure to speak loudly and clearly in each file, yet I still get "No speech detected" or no text returned:

SS-X-03.m4a : There was an error: Optional(Error Domain=kAFAssistantErrorDomain Code=1110 "No speech detected" UserInfo={NSLocalizedDescription=No speech detected})

SS-X-20221125000.m4a : There was an error: Optional(Error Domain=kAFAssistantErrorDomain Code=1110 "No speech detected" UserInfo={NSLocalizedDescription=No speech detected})

SS-X-20221125001.m4a : (there is some text here if I set request.requiresOnDeviceRecognition to false)

My code:

func findAudioFiles() {
    let fm = FileManager.default

    do {
        // Collect all "SS-X-*.m4a" files in the documents directory, sorted by name.
        let items = try fm.contentsOfDirectory(atPath: documentsPath)
        let sortedItems = items
            .filter { $0.hasSuffix(".m4a") }
            .filter { $0.contains("SS-X-") }
            .sorted()
        audioFiles.append(contentsOf: sortedItems)

        // Kick off transcription of the first file.
        NotificationCenter.default.post(name: Notification.Name("goAndRead"), object: nil, userInfo: myDic)
    } catch {
        print("\(error)")
    }
}
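The filter chain can be sanity-checked in isolation; a quick sketch using the file names from the logs above plus a made-up non-matching entry:

```swift
// Standalone check of the suffix/substring filtering and sorting.
let items = ["recording.txt", "SS-X-20221125000.m4a", "other.m4a", "SS-X-03.m4a"]
let sortedItems = items
    .filter { $0.hasSuffix(".m4a") }
    .filter { $0.contains("SS-X-") }
    .sorted()
print(sortedItems)  // ["SS-X-03.m4a", "SS-X-20221125000.m4a"]
```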

@objc func goAndRead() {
    audioIndex += 1
    guard audioIndex < audioFiles.count else { return }
    let fileURL = URL(fileURLWithPath: documentsPath + "/" + audioFiles[audioIndex], isDirectory: false)
    transcribeAudio(url: fileURL, item: audioFiles[audioIndex])
}

func requestTranscribePermissions() {
    SFSpeechRecognizer.requestAuthorization { [unowned self] authStatus in
        DispatchQueue.main.async {
            if authStatus == .authorized {
                print("Good to go!")
            } else {
                print("Transcription permission was declined.")
            }
        }
    }
}
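A hypothetical call-site sketch (assuming `findAudioFiles()` and `requestTranscribePermissions()` are in scope as above) that checks the current status before starting any work, so a denied or undetermined authorization surfaces early rather than as a confusing recognition failure later:

```swift
import Speech

// Sketch: branch on the current authorization state up front.
switch SFSpeechRecognizer.authorizationStatus() {
case .authorized:
    findAudioFiles()
case .notDetermined:
    requestTranscribePermissions()
default:
    print("Speech recognition is denied or restricted; enable it in Settings.")
}
```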



func transcribeAudio(url: URL, item: String) {
    guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")) else { return }
    if !recognizer.supportsOnDeviceRecognition { print("offline not available"); return }
    if !recognizer.isAvailable { print("not available"); return }

    let request = SFSpeechURLRecognitionRequest(url: url)
    request.requiresOnDeviceRecognition = true
    request.shouldReportPartialResults = true

    recognizer.recognitionTask(with: request) { (result, error) in
        guard let result = result else {
            print("\(item) : There was an error: \(error.debugDescription)")
            return
        }

        if result.isFinal {
            print("\(item) : \(result.bestTranscription.formattedString)")
            NotificationCenter.default.post(name: Notification.Name("goAndRead"), object: nil, userInfo: self.myDic)
        }
    }
}
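Since `shouldReportPartialResults` is true, the handler above fires repeatedly; one way to get more detail is to log every callback, not just the final one. A sketch of just the handler (same `recognizer`, `request`, and `item` as above), showing whether the recognizer produces any hypotheses before giving up:

```swift
recognizer.recognitionTask(with: request) { (result, error) in
    if let error = error {
        // localizedDescription is terser than debugDescription; the
        // NSError domain and code are still available if needed.
        print("\(item) error: \(error.localizedDescription)")
        return
    }
    guard let result = result else { return }

    // Log every partial hypothesis as it arrives.
    print("\(item) partial: \(result.bestTranscription.formattedString)")

    if result.isFinal {
        print("\(item) final: \(result.bestTranscription.formattedString)")
    }
}
```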

Updated info: It appears that I was calling SFSpeechURLRecognitionRequest too often, before the previous request had completed. Perhaps I need to create a new instance of SFSpeechRecognizer for each request? Unsure.

Regardless, I quickly/sloppily adjusted the code to only start a request once the previous one had returned its results.
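The shape of that adjustment can be sketched in plain Swift (names like `runSerially` and `process` are illustrative, not from the app): each job's completion handler is the only thing that starts the next job.

```swift
// Generic sketch of "start the next job only when the previous one finishes".
func runSerially(_ jobs: [String],
                 process: (String, @escaping () -> Void) -> Void,
                 done: () -> Void) {
    var index = 0
    func next() {
        guard index < jobs.count else { done(); return }
        let job = jobs[index]
        index += 1
        // The completion handler advances to the next job.
        process(job) { next() }
    }
    next()
}

// Usage: the "async" work completes synchronously here for brevity;
// in the app, `completion()` would be called from the recognition
// task's final-result callback.
var finished: [String] = []
runSerially(["SS-X-03.m4a", "SS-X-20221125000.m4a"], process: { name, completion in
    finished.append(name)
    completion()
}, done: { print("all done") })
print(finished)  // ["SS-X-03.m4a", "SS-X-20221125000.m4a"]
```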

The results were much better, except that one audio file still came up with no results. Not an error, just no text.

This file is one half of the same recording as the previous file (I took one audio recording and split it in two), so the formats and volumes are identical.

So I still need a better way to debug this, to find out what is going wrong with that file.
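One avenue is to compare the failing file against its working sibling before handing either to the recognizer. A sketch using `AVAudioFile` (a zero or near-zero duration, or a sample rate/channel count that differs between the two halves, would be a strong lead):

```swift
import AVFoundation

// Sketch: dump basic container properties of a candidate file.
func inspectAudioFile(at url: URL) {
    do {
        let file = try AVAudioFile(forReading: url)
        let fileFormat = file.fileFormat
        let seconds = Double(file.length) / fileFormat.sampleRate
        print("\(url.lastPathComponent): \(fileFormat.sampleRate) Hz, "
            + "\(fileFormat.channelCount) ch, \(String(format: "%.2f", seconds)) s")
    } catch {
        print("\(url.lastPathComponent): could not open (\(error))")
    }
}
```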
