How to transcribe my own audio source to text with EXTRA_AUDIO_SOURCE in RecognizerIntent

26 Views Asked by oak yang At 13 February 2024 at 09:07

** My test device runs Android 14.

I am implementing an Android application that transcribes my own audio source (PCM) to text.

Looking through RecognizerIntent, I found the EXTRA_AUDIO_SOURCE key (added in API 33), which seems to fit my case.

The documentation says to pass a ParcelFileDescriptor for the audio source, but it does not work as expected: the device ignores my audio file and only recognizes my voice through the open microphone.

    private var audioPfd: ParcelFileDescriptor? = null
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        val testFilePath = copyFiletoStorage(R.raw.test, "test.pcm")
        val testFile = File(testFilePath)
        testFile.setReadable(true)

        audioPfd = ParcelFileDescriptor.open(testFile, ParcelFileDescriptor.MODE_READ_ONLY)
        println("audioPfd size:${audioPfd?.statSize}")
        startSpeechToText(audioPfd)
    }

    // Copies a raw resource into the app's private files directory and returns the target path.
    private fun copyFiletoStorage(resourceId: Int, resourceName: String): String {
        val target = File(filesDir, resourceName)
        try {
            println("openRawResource")
            // use {} closes both streams even if the copy fails partway through.
            resources.openRawResource(resourceId).use { input ->
                FileOutputStream(target).use { output ->
                    input.copyTo(output)
                }
            }
        } catch (e: IOException) {
            // FileNotFoundException is a subclass of IOException, so one catch covers both.
            e.printStackTrace()
        }
        return target.path
    }



    private fun startSpeechToText(pcmFile: ParcelFileDescriptor?) {
        val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault().toLanguageTag())
        intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, packageName)
        intent.putExtra(RecognizerIntent.EXTRA_AUDIO_SOURCE_CHANNEL_COUNT, 1)
        intent.putExtra(RecognizerIntent.EXTRA_AUDIO_SOURCE_ENCODING, AudioFormat.ENCODING_PCM_16BIT)
        intent.putExtra(RecognizerIntent.EXTRA_AUDIO_SOURCE_SAMPLING_RATE, 16000)
        intent.putExtra(RecognizerIntent.EXTRA_AUDIO_SOURCE, pcmFile)

        speechRecognizer = SpeechRecognizer.createSpeechRecognizer(this, ComponentName("com.google.android.tts",
            "com.google.android.apps.speech.tts.googletts.service.GoogleTTSRecognitionService"))

        speechRecognizer?.setRecognitionListener(object : RecognitionListener {
            override fun onReadyForSpeech(p0: Bundle?) {
                println("onReadyForSpeech")
            }

            override fun onBeginningOfSpeech() {
                println("onBeginningOfSpeech")
            }

            override fun onRmsChanged(p0: Float) {
                println("onRmsChanged sound: $p0")
            }

            override fun onBufferReceived(p0: ByteArray?) {
                println("onBufferReceived")
            }

            override fun onEndOfSpeech() {
                println("onEndOfSpeech")
            }

            override fun onError(p0: Int) {
                println("onError:$p0")
                stopRecognizer()
            }

            override fun onResults(p0: Bundle?) {
                println("onResults: $p0")
                val matches = p0?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                if (matches != null && matches.isNotEmpty()) {
                    val text = matches[0]
                    Toast.makeText(applicationContext, "Recognized Text: $text", Toast.LENGTH_LONG).show()
                    textResult?.setText(text)

                }
            }

            override fun onPartialResults(p0: Bundle?) {
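                // Partial hypotheses are delivered under RESULTS_RECOGNITION; firstOrNull() avoids a crash on an empty list.
                val matches = p0?.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                println("onPartialResults:${matches?.firstOrNull()}")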
            }

            override fun onEvent(p0: Int, p1: Bundle?) {
                println("onEvent")
            }
        })

        speechRecognizer?.startListening(intent)
    }

I cannot find any examples that use EXTRA_AUDIO_SOURCE. Does Android support this use case? Is there a guide for implementing it?

The audio source file is placed in res/raw.
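
One thing that may be worth ruling out is a mismatch between the file and the format declared in the extras (16 kHz, mono, 16-bit PCM). The sketch below is plain Kotlin with no Android dependencies. Both function names are hypothetical helpers, not SDK calls. It estimates the duration of headerless PCM from its byte count and checks whether a file named .pcm is actually a WAV container, which starts with a RIFF/WAVE header.

```kotlin
// Hypothetical helpers (not part of the Android SDK) for sanity-checking a raw PCM file.

/** Duration in seconds of headerless PCM audio with the given layout. */
fun pcmDurationSeconds(byteCount: Long, sampleRate: Int, channels: Int, bytesPerSample: Int): Double {
    require(sampleRate > 0 && channels > 0 && bytesPerSample > 0) { "format values must be positive" }
    val bytesPerSecond = sampleRate.toLong() * channels * bytesPerSample
    return byteCount.toDouble() / bytesPerSecond
}

/** True if the bytes start with a RIFF/WAVE header, meaning the file is a WAV container, not raw PCM. */
fun looksLikeWav(header: ByteArray): Boolean {
    if (header.size < 12) return false
    // A WAV file begins with "RIFF", a 4-byte size field, then "WAVE".
    return header.decodeToString(0, 4) == "RIFF" && header.decodeToString(8, 12) == "WAVE"
}
```

For instance, pcmDurationSeconds(audioPfd.statSize, 16000, 1, 2) should roughly match the known length of the recording. If it is off by a factor of two or more, the sample rate, channel count, or encoding passed in the extras probably does not match the file.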

Thanks in advance.

** I also tried EXTRA_AUDIO_INJECT_SOURCE (API 31) on an Android S emulator, but it does not work either. I passed an Intent that includes the URI of the audio source. Is that the right way?

For example:

    val url = Uri.parse("android.resource://" + getPackageName() + "/" + "raw" + "/" + "test.pcm")
    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH)
    intent.putExtra(RecognizerIntent.EXTRA_AUDIO_INJECT_SOURCE, url)
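
A side note on the URI itself: Android resource names do not include the file extension, so res/raw/test.pcm is addressed as raw/test rather than raw/test.pcm. The plain-Kotlin sketch below builds that string. The helper name is hypothetical, and whether the recognizer service accepts such a URI for audio injection is not confirmed here.

```kotlin
// Hypothetical helper: builds an android.resource:// URI string for a file in res/raw.
// Resource names drop the file extension, so "test.pcm" becomes "test".
fun rawResourceUriString(packageName: String, fileName: String): String {
    val resourceName = fileName.substringBeforeLast('.')
    return "android.resource://$packageName/raw/$resourceName"
}
```

The result can be passed to Uri.parse(...) in place of the hand-concatenated string above.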
