I have created a simple Android app for controlling a relay connected to my Raspberry Pi. It uses buttons, plus basic voice recognition that triggers those buttons, to switch the corresponding relay channel on or off.
At the moment the voice recognition is handled by a RecognizerIntent: I have to press a button in the app to open the Google voice prompt, which listens to my command and toggles the corresponding button that controls the relay.
I want to do the same with continuous voice recognition, so that the app keeps listening for commands without the user having to press a button, allowing hands-free operation.
Here is my existing code, a very simple form of voice recognition that switches the buttons for the various devices connected to the relay on and off:
    public void micclick(View view) {
        if (view.getId() == R.id.mic) {
            promptSpeechInput();
        }
    }

    private void promptSpeechInput() {
        Intent i = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        i.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        i.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());
        i.putExtra(RecognizerIntent.EXTRA_PROMPT, "Speak!");
        try {
            startActivityForResult(i, 100);
        } catch (ActivityNotFoundException a) {
            Toast.makeText(MainActivity.this, "Sorry your device doesn't support", Toast.LENGTH_SHORT).show();
        }
    }

    public void onActivityResult(int requestCode, int resultCode, Intent i) {
        super.onActivityResult(requestCode, resultCode, i);
        String voicetxt;
        switch (requestCode) {
            case 100:
                if (resultCode == RESULT_OK && i != null) {
                    ArrayList<String> result2 = i.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
                    voicetxt = result2.get(0);
                    if (voicetxt.equals("fan on")) {
                        StringBuffer result = new StringBuffer();
                        toggleButton1.setChecked(true);
                        result.append("Fan: ").append(toggleButton1.getText());
                        sc.onRelayNumber = "a";
                        new Thread(sc).start();
                        Toast.makeText(MainActivity.this, result.toString(), Toast.LENGTH_SHORT).show();
                    }
                    if (voicetxt.equals("fan of")) {
                        StringBuffer result = new StringBuffer();
                        toggleButton1.setChecked(false);
                        result.append("Fan: ").append(toggleButton1.getText());
                        sc.onRelayNumber = "a_off";
                        new Thread(sc).start();
                        Toast.makeText(MainActivity.this, result.toString(), Toast.LENGTH_SHORT).show();
                    }
                    if (voicetxt.equals("light on")) {
                        StringBuffer result = new StringBuffer();
                        toggleButton2.setChecked(true);
                        result.append("Light: ").append(toggleButton2.getText());
                        sc.onRelayNumber = "b";
                        new Thread(sc).start();
                        Toast.makeText(MainActivity.this, result.toString(), Toast.LENGTH_SHORT).show();
                    }
                    if (voicetxt.equals("light off")) {
                        StringBuffer result = new StringBuffer();
                        toggleButton2.setChecked(false);
                        result.append("Light: ").append(toggleButton2.getText());
                        sc.onRelayNumber = "b_off";
                        new Thread(sc).start();
                        Toast.makeText(MainActivity.this, result.toString(), Toast.LENGTH_SHORT).show();
                    }
                    if (voicetxt.equals("air conditioner on")) {
                        StringBuffer result = new StringBuffer();
                        toggleButton3.setChecked(true);
                        result.append("AC: ").append(toggleButton3.getText());
                        sc.onRelayNumber = "c";
                        new Thread(sc).start();
                        Toast.makeText(MainActivity.this, result.toString(), Toast.LENGTH_SHORT).show();
                    }
                    if (voicetxt.equals("air conditioner of")) {
                        StringBuffer result = new StringBuffer();
                        toggleButton3.setChecked(false);
                        result.append("AC: ").append(toggleButton3.getText());
                        sc.onRelayNumber = "c_off";
                        new Thread(sc).start();
                        Toast.makeText(MainActivity.this, result.toString(), Toast.LENGTH_SHORT).show();
                    }
                    if (voicetxt.equals("heater on")) {
                        StringBuffer result = new StringBuffer();
                        toggleButton4.setChecked(true);
                        result.append("Heater: ").append(toggleButton4.getText());
                        sc.onRelayNumber = "d";
                        new Thread(sc).start();
                        Toast.makeText(MainActivity.this, result.toString(), Toast.LENGTH_SHORT).show();
                    }
                    if (voicetxt.equals("heater off")) {
                        StringBuffer result = new StringBuffer();
                        toggleButton4.setChecked(false);
                        result.append("Heater: ").append(toggleButton4.getText());
                        sc.onRelayNumber = "d_off";
                        new Thread(sc).start();
                        Toast.makeText(MainActivity.this, result.toString(), Toast.LENGTH_SHORT).show();
                    }
                }
                break;
        }
    }
I want to achieve the same functionality without having to press the button. Please note that I am new to Android app development, so if external libraries are required, please be descriptive about how to use them; I don't think continuous recognition is possible with Google's RecognizerIntent. I suspect I might need a library like CMUSphinx, but I am not sure how to go about it.
There are several things you can do for continuous recognition / dictation mode. You can use the Google speech recognition built into Android itself, but it is not recommended for continuous recognition (as stated at https://developer.android.com/reference/android/speech/SpeechRecognizer.html).
But if you really need it, you can work around this by creating your own class that implements IRecognitionListener (RecognitionListener in native Android). (I wrote this in Xamarin.Android; the syntax is very similar to native Android.)
To call it from the activity:
Don't forget to request permission to use the microphone (android.permission.RECORD_AUDIO)!
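The listener-based workaround boils down to restarting recognition from inside the listener's own callbacks. The following is only a rough sketch of that control flow, written in plain Java so it runs without the Android SDK. `Listener`, `Recognizer`, `RestartingListener` and `ScriptedRecognizer` are hypothetical stand-ins, not Android classes; in a real app their roles would be played by `RecognitionListener` (`onResults`, `onError`) and `SpeechRecognizer` (`startListening`).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

public class ContinuousListeningSketch {

    /** Hypothetical stand-in for the result/error callbacks of a recognition listener. */
    interface Listener {
        void onResults(List<String> hypotheses);
        void onError(int errorCode);
    }

    /** Hypothetical stand-in for the recognizer: one startListening() call covers one utterance. */
    interface Recognizer {
        void setListener(Listener listener);
        void startListening();
    }

    /** Starts a new session after every result or error, so no button press is needed. */
    static class RestartingListener implements Listener {
        private final Recognizer recognizer;
        private final List<String> heard = new ArrayList<>();

        RestartingListener(Recognizer recognizer) {
            this.recognizer = recognizer;
            recognizer.setListener(this);
        }

        @Override
        public void onResults(List<String> hypotheses) {
            if (!hypotheses.isEmpty()) {
                heard.add(hypotheses.get(0)); // keep the top hypothesis
            }
            recognizer.startListening();      // start over immediately
        }

        @Override
        public void onError(int errorCode) {
            recognizer.startListening();      // a failed session also restarts
        }

        List<String> heard() {
            return heard;
        }
    }

    /** Test double that replays a script; an empty string simulates a failed session. */
    static class ScriptedRecognizer implements Recognizer {
        private final Queue<String> script;
        private Listener listener;

        ScriptedRecognizer(List<String> script) {
            this.script = new ArrayDeque<>(script);
        }

        @Override
        public void setListener(Listener listener) {
            this.listener = listener;
        }

        @Override
        public void startListening() {
            String next = script.poll();
            if (next == null) {
                return;                        // script exhausted: the demo ends here
            }
            if (next.isEmpty()) {
                listener.onError(-1);          // placeholder error code
            } else {
                listener.onResults(Collections.singletonList(next));
            }
        }
    }

    /** Runs the loop over a scripted sequence and returns every recognized phrase. */
    static List<String> listenTo(List<String> script) {
        ScriptedRecognizer recognizer = new ScriptedRecognizer(script);
        RestartingListener listener = new RestartingListener(recognizer);
        recognizer.startListening();           // the only explicit start
        return listener.heard();
    }

    public static void main(String[] args) {
        // prints [fan on, light off]
        System.out.println(listenTo(Arrays.asList("fan on", "", "light off")));
    }
}
```

The scripted recognizer calls back synchronously, which is fine for a short demo. A real recognizer calls back asynchronously, so the restart would not nest calls like this.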
Explanation:
- This removes the annoying "click to start recording" prompt.
- It starts recording the moment you call StartListening() and never stops, because I call startover() or StartListening() every time it finishes recording.
- This is a pretty bad workaround: while a recording is being processed, the recorder does not receive any sound input until StartListening() is called again. (There is no workaround for this.)
- Google recognition is not really good for voice commands, since the language model is "[lang] sentences": you can't limit the vocabulary, and Google will always try to produce a "good sentence" as the result.
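Because the recognizer returns free-form sentences, exact `equals` checks on the first result miss easily. One possible mitigation, sketched here under assumptions, is to normalize every returned hypothesis and look it up in a fixed command table. The phrases and relay codes come from the question; `CommandMatcher`, `normalize`, `match`, and the choice to treat a trailing "of" as "off" are hypothetical and only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class CommandMatcher {

    // Spoken phrase -> relay code, using the codes from the question.
    private static final Map<String, String> COMMANDS = new LinkedHashMap<>();
    static {
        COMMANDS.put("fan on", "a");
        COMMANDS.put("fan off", "a_off");
        COMMANDS.put("light on", "b");
        COMMANDS.put("light off", "b_off");
        COMMANDS.put("air conditioner on", "c");
        COMMANDS.put("air conditioner off", "c_off");
        COMMANDS.put("heater on", "d");
        COMMANDS.put("heater off", "d_off");
    }

    /** Lower-cases, trims, collapses whitespace, and treats a trailing "of" as "off". */
    static String normalize(String phrase) {
        String s = phrase.toLowerCase(Locale.ROOT).trim().replaceAll("\\s+", " ");
        if (s.endsWith(" of")) {
            s = s + "f"; // assumption: "of" at the end is a misheard "off"
        }
        return s;
    }

    /** Returns the relay code of the first hypothesis that is a known command, or null. */
    static String match(List<String> hypotheses) {
        for (String hypothesis : hypotheses) {
            String code = COMMANDS.get(normalize(hypothesis));
            if (code != null) {
                return code;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(match(Arrays.asList("Fan of", "fan off")));           // a_off
        System.out.println(match(Arrays.asList("turn the lights", "light on"))); // b
        System.out.println(match(Arrays.asList("hello world")));                 // null
    }
}
```

With something like this, the whole list returned under `RecognizerIntent.EXTRA_RESULTS` could be passed to `match` instead of comparing only its first entry, and unknown phrases are simply ignored.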
For better results and UX, I really suggest you use the Google Cloud API (but it requires a connection and is costly). My second suggestion is CMUSphinx / PocketSphinx: it is open source and works offline, but you have to do everything manually.
PocketSphinx advantages:
- Works offline
- You can train your own acoustic model (phonetics, etc.), so you can tune it to your environment and pronunciation
PocketSphinx disadvantage:
- You have to do everything manually: setting up your acoustic model, dictionary, language model, threshold, etc. (overkill if you want something simple).