I'm making an application that helps people by calling the police when they say "Help". However, I'm currently having a problem with the continuous voice recognition.
I tried to use the SpeechRecognizer in Android Studio, but I don't know how to make it listen for the trigger word, i.e. "Help". (Here is a tutorial explaining exactly what I want, but for some reason it doesn't work for me: https://betterprogramming.pub/implement-continuous-speech-recognition-on-android-1dd2f4b562fd) For reference, here is my current code:
public class MainActivity extends AppCompatActivity {

    private TextToSpeech myTTS;
    private SpeechRecognizer mySpeechRecognizer;
    private AppBarConfiguration appBarConfiguration;
    private ActivityMainBinding binding;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        binding = ActivityMainBinding.inflate(getLayoutInflater());
        setContentView(binding.getRoot());
        setSupportActionBar(binding.toolbar);
        NavController navController = Navigation.findNavController(this, R.id.nav_host_fragment_content_main);
        appBarConfiguration = new AppBarConfiguration.Builder(navController.getGraph()).build();
        NavigationUI.setupActionBarWithNavController(this, navController, appBarConfiguration);
        initializeTextToSpeach();
        initializeSpeechRecognizer();
    }

    private void initializeSpeechRecognizer() {
        if (SpeechRecognizer.isRecognitionAvailable(this)) {
            mySpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(this);
            mySpeechRecognizer.setRecognitionListener(new RecognitionListener() {
                @Override
                public void onReadyForSpeech(Bundle bundle) {
                }

                @Override
                public void onBeginningOfSpeech() {
                }

                @Override
                public void onRmsChanged(float v) {
                }

                @Override
                public void onBufferReceived(byte[] bytes) {
                }

                @Override
                public void onEndOfSpeech() {
                }

                @Override
                public void onError(int i) {
                }

                @Override
                public void onResults(Bundle bundle) {
                    List<String> results = bundle.getStringArrayList(
                            SpeechRecognizer.RESULTS_RECOGNITION
                    );
                    processResult(results.get(0));
                }

                @Override
                public void onPartialResults(Bundle bundle) {
                }

                @Override
                public void onEvent(int i, Bundle bundle) {
                }
            });
        }
    }

    private void processResult(String command) {
        command = command.toLowerCase();
        if (command.indexOf("help") != -1) {
            Uri number = Uri.parse("tel:123");
            Intent intent = new Intent(Intent.ACTION_DIAL, number);
            startActivity(intent);
        }
    }

    private void initializeTextToSpeach() {
        myTTS = new TextToSpeech(this, new TextToSpeech.OnInitListener() {
            @Override
            public void onInit(int i) {
                if (myTTS.getEngines().size() == 0) {
                    Toast.makeText(MainActivity.this, "There is no voice recognizing engine in your device", Toast.LENGTH_LONG).show();
                    finish();
                } else {
                    myTTS.setLanguage(Locale.US);
                    speak("Hello, I'm ready");
                }
            }
        });
    }

    private void speak(String message) {
        if (Build.VERSION.SDK_INT >= 21) {
            myTTS.speak(message, TextToSpeech.QUEUE_FLUSH, null, null);
        } else {
            myTTS.speak(message, TextToSpeech.QUEUE_FLUSH, null);
        }
    }

    @Override
    protected void onPause() {
        super.onPause();
        myTTS.shutdown();
    }
}
The first thing I would like to draw your attention to is the microphone permission: it must be declared in the manifest and also requested from the system at runtime; without this, the microphone will not work. Check that the following line is present in your manifest:
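The snippet seems to have been lost from the post; presumably what was meant is the standard microphone permission declaration:

```xml
<!-- Required for SpeechRecognizer to access the microphone -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```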
And update the onCreate method like this:
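A sketch of what that addition might look like, assuming the AndroidX compat classes are available; `REQUEST_RECORD_AUDIO` is an arbitrary request code introduced here for illustration:

```java
// Arbitrary request code for identifying our permission request (hypothetical name).
private static final int REQUEST_RECORD_AUDIO = 1;

@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    // ... your existing setup code ...

    // On API 23+, RECORD_AUDIO is a dangerous permission and must be
    // requested at runtime in addition to the manifest declaration.
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
            != PackageManager.PERMISSION_GRANTED) {
        ActivityCompat.requestPermissions(this,
                new String[]{Manifest.permission.RECORD_AUDIO},
                REQUEST_RECORD_AUDIO);
    }

    initializeTextToSpeach();
    initializeSpeechRecognizer();
}
```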
And check the result of the permission request:
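A minimal sketch of that check, assuming a request-code constant `REQUEST_RECORD_AUDIO` is defined in the activity:

```java
@Override
public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions,
                                       @NonNull int[] grantResults) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults);
    if (requestCode == REQUEST_RECORD_AUDIO) {
        if (grantResults.length > 0
                && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
            // Microphone access granted: safe to start speech recognition.
        } else {
            Toast.makeText(this, "Microphone permission is required",
                    Toast.LENGTH_LONG).show();
        }
    }
}
```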
Next, it seems that after registering the listener you missed a few steps needed to actually start recognition, namely creating a recognizer intent and passing it to startListening. Update the end of your initializeSpeechRecognizer method along these lines:
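A sketch of the missing part, using the standard RecognizerIntent extras:

```java
private void initializeSpeechRecognizer() {
    if (SpeechRecognizer.isRecognitionAvailable(this)) {
        mySpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(this);
        mySpeechRecognizer.setRecognitionListener(new RecognitionListener() {
            // ... your existing callbacks ...
        });

        // Build the recognizer intent and start listening.
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());
        mySpeechRecognizer.startListening(intent);
    }
}
```

Note that SpeechRecognizer stops after each utterance; to approximate continuous listening for the trigger word, you would call startListening(intent) again from onResults and onError.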
And the last thing I recommend is to log which error occurred in the onError(int i) method, so that you can understand what went wrong:
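For example, something like this (the log tag is an arbitrary choice):

```java
@Override
public void onError(int i) {
    // Log the numeric error code; compare it against the SpeechRecognizer.ERROR_*
    // constants (e.g. ERROR_INSUFFICIENT_PERMISSIONS, ERROR_NO_MATCH) to see
    // what went wrong.
    Log.e("SpeechRecognizer", "Recognition error code: " + i);
}
```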
I also want to share that I once worked on an application with continuous speech recognition. Google's approach to voice recognition did not suit me due to the limitations of running it from a service. The vosk-api library helped me here (https://github.com/alphacep/vosk-api), which is an API for working with the Kaldi speech recognition toolkit. The catch with this approach is that the application must bundle an additional language model of about 50 megabytes, and that covers only one language. This is of course a significant disadvantage, and the recognition accuracy is worse than Google's solution, but in my case it was enough.
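For reference, a rough sketch of what starting vosk looks like on Android, based on the vosk-android demo; the class names and the model path here are assumptions, so check the current vosk-api documentation before relying on them:

```java
// Sketch only: names follow the vosk-android demo (org.vosk.*); verify
// against the current vosk-api documentation.
Model model = new Model(modelDir);                        // modelDir: path to the unpacked language model (~50 MB)
Recognizer recognizer = new Recognizer(model, 16000.0f);  // 16 kHz sample rate
SpeechService speechService = new SpeechService(recognizer, 16000.0f);
speechService.startListening(listener);                   // listener: your RecognitionListener implementation
```

Because vosk runs entirely on-device, it keeps listening from a service without the per-utterance restarts that SpeechRecognizer requires.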