Receive input from microphone from 2 processes at once


I've been working on Java speech recognition using Sphinx4, and I currently have an issue.

I have an app that recognizes microphone input using the LiveSpeechRecognizer class of Sphinx4, which works fine. The issue appeared after I added a class that also listens to the microphone and transforms and visualizes the output.

Separately, both classes work fine. But when they are combined in a single app I get the error:

LineUnavailableException: line with format PCM_SIGNED 44100.0 Hz, 8 bit, mono, 1 bytes/frame, not supported.

I have checked the issue, and it seems to be caused by simultaneous access to the microphone. I had the idea to use StreamSpeechRecognizer instead of the live one, but I failed to retrieve the stream from the microphone input. I tried AudioInputStream for that purpose.
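For reference, a minimal sketch of wrapping the microphone's TargetDataLine in an AudioInputStream — the class name is mine, and the 16 kHz mono format is an assumption based on what Sphinx4's default acoustic models expect:

```java
import javax.sound.sampled.*;

public class MicStream {
    // Sphinx4's default models expect 16 kHz, 16-bit, signed, mono,
    // little-endian PCM (assumption; adjust to your model).
    static AudioFormat sphinxFormat() {
        return new AudioFormat(16000f, 16, 1, true, false);
    }

    public static void main(String[] args) throws LineUnavailableException {
        AudioFormat format = sphinxFormat();
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        if (!AudioSystem.isLineSupported(info)) {
            System.out.println("Microphone line not supported on this machine");
            return;
        }
        TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
        line.open(format);
        line.start();
        // AudioInputStream has a constructor that wraps a TargetDataLine
        // directly; the result can be handed to startRecognition(...).
        AudioInputStream stream = new AudioInputStream(line);
        System.out.println("Frame size: " + stream.getFormat().getFrameSize());
    }
}
```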

Could you please suggest how I can adjust my code so that both the speech recognition and the oscilloscope can use the microphone simultaneously?

Thanks in advance.

UPD:

Here is my attempt to split the microphone input for use in both apps:

....
    // Read one chunk from the microphone line
    byte[] data = new byte[dataCaptureSize];
    line.read(data, 0, data.length);

    // Wrap the captured bytes in an AudioInputStream
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(data, 0, data.length);
    byte[] audioData = out.toByteArray();
    InputStream byteArrayInputStream = new ByteArrayInputStream(audioData);
    AudioInputStream audioInputStream = new AudioInputStream(byteArrayInputStream,
            inputFormat,
            audioData.length / inputFormat.getFrameSize());
....

That is how I convert it to the input stream, which is then passed to the StreamSpeechRecognizer, while the byte array is transformed with a Fast Fourier Transform and passed to the graph. This doesn't work: the graph freezes all the time, so the data displayed is not up to date.

I tried to run recognition in a separate thread, but it didn't improve performance at all.

My code for splitting into threads is below:

    Thread recognitionThread = new Thread(new RecognitionThread(configuration, data));
    recognitionThread.start(); // start() runs the task on a new thread; run() would execute it on the current thread
    recognitionThread.join();  // join() only waits for something if called after start()
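As an aside, run() executes the task on the calling thread, while start() spawns a new one; a minimal, self-contained demonstration (the class and method names are my own):

```java
public class StartVsRun {
    // Returns the name of the thread that actually executed the task.
    static String runOn(boolean useStart) throws InterruptedException {
        final String[] executedOn = new String[1];
        Thread worker = new Thread(
                () -> executedOn[0] = Thread.currentThread().getName(),
                "worker-thread");
        if (useStart) {
            worker.start();   // task executes on "worker-thread"
        } else {
            worker.run();     // plain method call: task executes on the caller
        }
        worker.join();        // returns immediately if the thread never started
        return executedOn[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("start(): task ran on " + runOn(true));
        System.out.println("run():   task ran on " + runOn(false));
    }
}
```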

UPD 2: The input is from the microphone. The AudioInputStream above is passed to the StreamSpeechRecognizer:

    StreamSpeechRecognizer nRecognizer = new StreamSpeechRecognizer(configuration);
    nRecognizer.startRecognition(audioStream);

And the byte array is transformed by FFT and passed to the graph:

    double[] arr = FastFourierTransform.TransformRealPart(data);

    for (int i = 0; i < arr.length; i++) {
        series1.getData().add(new XYChart.Data<>(i * 22050 / arr.length, arr[i]));
    }

There is 1 answer below.

Phil Freihofner:

Here is a plausible approach to consider.

First, write your own microphone reader. (There are tutorials on how to do this.) Then repackage that data as two parallel Lines that the other applications can read.

Another approach would be to check if either application has some sort of "pass through" capability enabled.

EDIT: added to clarify

This Java sound record utility code example opens a TargetDataLine to the microphone, and stores data from it into an array (lines 69, 70). Instead of storing the data in an array, I'm suggesting that you create two SourceDataLine objects and write the data out to each.

    // Capture from the microphone once, then tee each chunk into two streams
    recordBytes = new ByteArrayOutputStream();
    secondStreamBytes = new ByteArrayOutputStream();

    isRunning = true;

    while (isRunning) {
        bytesRead = audioLine.read(buffer, 0, buffer.length);
        recordBytes.write(buffer, 0, bytesRead);        // copy for the first consumer
        secondStreamBytes.write(buffer, 0, bytesRead);  // copy for the second consumer
    }

Hopefully it won't be too difficult to figure out how to configure your two programs to read from the created lines rather than from the microphone's line. I'm unable to provide guidance on how to do that.
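The same tee idea as a self-contained sketch — the class and method names are mine, and a ByteArrayInputStream stands in for the microphone line so it runs anywhere; in the real app the source would be an AudioInputStream wrapping the TargetDataLine:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class AudioTee {
    // Copies everything from the source into both sinks, chunk by chunk,
    // so two consumers each get a full copy of the microphone data.
    public static void tee(InputStream source, OutputStream sinkA, OutputStream sinkB)
            throws IOException {
        byte[] buffer = new byte[512];
        int bytesRead;
        while ((bytesRead = source.read(buffer)) != -1) {
            sinkA.write(buffer, 0, bytesRead);
            sinkB.write(buffer, 0, bytesRead);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the microphone; real code would use the mic's stream.
        byte[] fakeAudio = {1, 2, 3, 4, 5};
        ByteArrayOutputStream forRecognizer = new ByteArrayOutputStream();
        ByteArrayOutputStream forOscilloscope = new ByteArrayOutputStream();
        tee(new ByteArrayInputStream(fakeAudio), forRecognizer, forOscilloscope);
        System.out.println(forRecognizer.size() + " bytes sent to each consumer");
    }
}
```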

EDIT 2: I wish some other people would join in; I'm a little over my head doing anything fancy with streams, and the code you've posted is so minimal that I still don't understand what is happening or how things connect.

FWIW: (1) Is the data you are adding into "series1" the streaming data? If so, can you add a line in that for loop and write the same data to a stream consumed by the other class? (This would be a way of using the microphone data "in series" as opposed to "in parallel.")

(2) Data streams often involve code that blocks, or that runs at varying paces due to the unpredictable way the CPU switches between tasks. So if you do write a "splitter" (as I tried to illustrate by modifying the microphone-reading code I linked earlier), the code may only run as fast as the slower of the two "splits" at any given moment. You may need to incorporate some sort of buffering and use separate threads for the two recipients of the mike data.

I wrote my first buffering code recently, for a situation where a microphone-reading line sends a stream to an audio-mixing function on another thread. I only wrote this a few weeks ago, and it's the first time I've dealt with running a stream across a thread boundary, so I don't know if the idea I came up with is the best way to do this sort of thing. But it does manage to keep the feed from the mike to the mixer steady, with no dropouts and no losses.

The mike reader reads a buffer of data, then adds this byte[] buffer into a ConcurrentLinkedQueue&lt;byte[]&gt;.

From the other thread, the audio-mixing code polls the ConcurrentLinkedQueue for data.

I experimented a bit, and currently have the byte[] buffer at 512 bytes, with the ConcurrentLinkedQueue set to hold up to 12 "buffers" before it starts throwing away the oldest ones (the structure is FIFO). This seems to be enough small buffers to accommodate moments when the microphone-processing code temporarily gets ahead of the mixer.

The ConcurrentLinkedQueue has built-in provisions that allow adding and polling to occur from two threads at the same time without throwing an exception. Whether you need to write extra code to help with the hand-off, and what the best buffer size might be, I can't say. Maybe a much larger buffer, with fewer buffers held in the queue, would be better.
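A rough sketch of that bounded hand-off; note that ConcurrentLinkedQueue itself has no size cap, so discarding the oldest buffers has to be done by hand (all names here are my own):

```java
import java.util.concurrent.ConcurrentLinkedQueue;

public class AudioBuffer {
    private static final int MAX_BUFFERS = 12;   // cap before old data is dropped
    private final ConcurrentLinkedQueue<byte[]> queue = new ConcurrentLinkedQueue<>();

    // Producer side (microphone reader thread): add a chunk, discarding the
    // oldest ones when over capacity, so the reader never blocks.
    public void offer(byte[] chunk) {
        queue.add(chunk);
        while (queue.size() > MAX_BUFFERS) {
            queue.poll();   // drop the oldest buffer (FIFO)
        }
    }

    // Consumer side (mixer/recognizer thread): returns null when empty.
    public byte[] take() {
        return queue.poll();
    }

    public int size() {
        return queue.size();   // note: O(n) on ConcurrentLinkedQueue
    }
}
```

A usage pattern would be: the mike-reading loop calls offer() with each 512-byte chunk, while the mixer thread repeatedly calls take() and skips a cycle when it gets null.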

Maybe someone else will weigh in, or maybe the suggestion will be worth experimenting with and trying out.

Anyway, that's about the best I can do, given my limited experience with this. I hope you are able to work something out.