Encoding microphone data with Opus and decoding it for audio output produces white noise


I want to encode a stream of audio from a microphone using Opus, then decode it and play it on an audio output in streaming mode. The parameters I use: mono microphone audio at 8000 samples per second must be encoded at 1200 bps and then decoded to the same sample rate for audio output. The full source code can be found in the public GitHub repo https://github.com/Dmitry-Nikishov/RdmWalkieTalkie. Below are some comments and the actual problem I see.

Here is the audio format I use:

var audioFormat = new AudioFormat(OpusSettings.OPUS_SAMPLE_RATE/*8000*/,
        16,
        1,
        true,
        false);
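
For reference, the last constructor argument (bigEndian = false) means each 16-bit sample arrives from the audio line with its low byte first. A minimal sketch of reading such bytes as signed 16-bit samples with java.nio follows; the class and method names are hypothetical and not part of the project:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcmBytes {
    // Interprets raw bytes as signed 16-bit little-endian PCM samples.
    public static short[] littleEndianToSamples(byte[] raw) {
        short[] samples = new short[raw.length / 2];
        ByteBuffer.wrap(raw)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asShortBuffer()
                .get(samples);
        return samples;
    }

    public static void main(String[] args) {
        byte[] raw = {0x01, 0x00, (byte) 0xFF, (byte) 0xFF};
        short[] s = littleEndianToSamples(raw);
        System.out.println(s[0] + " " + s[1]); // prints "1 -1"
    }
}
```

Setting the byte order explicitly on the buffer keeps the conversion in one place, so it can be matched against whatever the AudioFormat declares.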

To encode PCM data I have the following class:

package audio.opus;

import com.sun.jna.ptr.PointerByReference;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import tomp2p.opuswrapper.Opus;

import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.nio.ShortBuffer;

public class OpusEncoder {
    private static Logger logger = LogManager.getRootLogger();
    private PointerByReference opusEncoder;

    public OpusEncoder() {
        IntBuffer error = IntBuffer.allocate(1);
        opusEncoder = Opus.INSTANCE.opus_encoder_create(
                OpusSettings.OPUS_SAMPLE_RATE,
                OpusSettings.OPUS_CHANNEL_COUNT,
                Opus.OPUS_APPLICATION_AUDIO,
                error
        );
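        // Absolute get(0) lets the status be read again for the log message;
        // either a non-OK status or a null handle indicates failure.
        if (error.get(0) != Opus.OPUS_OK || opusEncoder == null) {
            logger.debug(
                    String.format("Received error status from opus_encoder_create(...): {%s}", error.get(0))
            );
        }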

        Opus.INSTANCE.opus_encoder_ctl(opusEncoder, Opus.OPUS_SET_BITRATE_REQUEST, OpusSettings.OPUS_BITRATE_IN_BPS);
        Opus.INSTANCE.opus_encoder_ctl(opusEncoder, Opus.OPUS_SET_VBR_REQUEST, 0);
    }

    public ByteBuffer encodeToOpus(ByteBuffer rawAudio)
    {
        ShortBuffer nonEncodedBuffer = ShortBuffer.allocate(rawAudio.remaining() / 2);
        ByteBuffer encoded = ByteBuffer.allocate(4096);
        for (int i = rawAudio.position(); i < rawAudio.limit(); i += 2)
        {
            int firstByte =  (0x000000FF & rawAudio.get(i));      //Promotes to int and handles the fact that it was unsigned.
            int secondByte = (0x000000FF & rawAudio.get(i + 1));

            //Combines the 2 bytes into one 16-bit sample, treating the first byte as the high byte. Opus takes signed 16-bit samples, not bytes.
            short toShort = (short) ((firstByte << 8) | secondByte);

            nonEncodedBuffer.put(toShort);
        }
        ((Buffer) nonEncodedBuffer).flip();

        int result = Opus.INSTANCE.opus_encode(
                opusEncoder,
                nonEncodedBuffer,
                OpusSettings.OPUS_FRAME_SIZE_IN_SAMPLES,
                encoded,
                encoded.capacity()
        );

        if (result <= 0)
        {
            logger.debug(String.format("Received error code from opus_encode(...): {%d}", result));
            return null;
        }

        ((Buffer) encoded).position(0).limit(result);
        return encoded;
    }

    public void shutdown() {
        if (opusEncoder != null) {
            Opus.INSTANCE.opus_encoder_destroy(opusEncoder);
            opusEncoder = null;
        }
    }
}

For each block of PCM audio data, I execute the following code (each 20 ms frame is encoded, converted to byte[], and pushed to a circular buffer for another thread to read, decode, and write to the audio output):

private void audioInputCb( byte[] audioData )
{
    final int numOf20msFrames = audioData.length/OpusSettings.OPUS_FRAME_SIZE_IN_BYTES;

    for (int frameIdx = 0; frameIdx < numOf20msFrames; frameIdx++) {
        final var frameOffset = frameIdx*OpusSettings.OPUS_FRAME_SIZE_IN_BYTES;
        final var frameToBeEncoded = ByteBuffer.wrap(audioData, frameOffset, OpusSettings.OPUS_FRAME_SIZE_IN_BYTES);
        final var encodedFrame = encoder.encodeToOpus(frameToBeEncoded);
        final var encodedArray = Arrays.copyOf(encodedFrame.array(), encodedFrame.limit());
        m_audioBuffer.push(Arrays.asList(ArrayUtils.toObject(encodedArray)));
    }
}
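
The frame sizes implied by the parameters stated in the question can be worked out as below. This is only a sketch: the actual values in OpusSettings are not shown here, so the constants are assumptions taken from the prose (8000 Hz, 20 ms frames, 16-bit mono, 1200 bps), and the encoded size is a nominal figure for constant bitrate, not something the encoder is guaranteed to produce.

```java
public class FrameMath {
    static final int SAMPLE_RATE_HZ = 8000;  // from the question
    static final int FRAME_MS = 20;          // from the question
    static final int BYTES_PER_SAMPLE = 2;   // 16-bit mono
    static final int BITRATE_BPS = 1200;     // from the question

    // Samples in one frame: rate * duration.
    static int frameSizeInSamples() {
        return SAMPLE_RATE_HZ * FRAME_MS / 1000;
    }

    // Raw PCM bytes in one frame.
    static int frameSizeInBytes() {
        return frameSizeInSamples() * BYTES_PER_SAMPLE;
    }

    // Nominal encoded bytes per frame at constant bitrate (assumption).
    static int nominalEncodedFrameBytes() {
        return BITRATE_BPS * FRAME_MS / 1000 / 8;
    }

    public static void main(String[] args) {
        System.out.println(frameSizeInSamples() + " "
                + frameSizeInBytes() + " "
                + nominalEncodedFrameBytes()); // prints "160 320 3"
    }
}
```

Because the reader thread pops fixed-size chunks from the circular buffer, it may be worth logging the length returned by opus_encode and confirming that it matches the size the reader assumes.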

Here is the decoder class I use:

package audio.opus;

import com.sun.jna.ptr.PointerByReference;
import tomp2p.opuswrapper.Opus;

import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.nio.ShortBuffer;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class OpusDecoder {
    private static Logger logger = LogManager.getRootLogger();
    private PointerByReference opusDecoder;

    public OpusDecoder() {
        IntBuffer error = IntBuffer.allocate(1);
        opusDecoder = Opus.INSTANCE.opus_decoder_create(
                OpusSettings.OPUS_SAMPLE_RATE,
                OpusSettings.OPUS_CHANNEL_COUNT,
                error);

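        // Absolute get(0) lets the status be read again for the log message;
        // either a non-OK status or a null handle indicates failure.
        if (error.get(0) != Opus.OPUS_OK || opusDecoder == null) {
            logger.debug(
                    String.format("Received error code from opus_decoder_create(): %s", error.get(0))
            );
        }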
    }

    public short[] decodeFromOpus(ByteBuffer encodedAudio) {
        int length = encodedAudio.remaining();
        int offset = encodedAudio.arrayOffset() + encodedAudio.position();
        byte[] buf = new byte[length];
        byte[] data = encodedAudio.array();
        System.arraycopy(data, offset, buf, 0, length);

        int result;
        ShortBuffer decoded = ShortBuffer.allocate(4096);

        result = Opus.INSTANCE.opus_decode(
                opusDecoder,
                buf,
                buf.length,
                decoded,
                OpusSettings.OPUS_FRAME_SIZE_IN_SAMPLES,
                0
        );

        if (result < 0) {
            logger.debug(String.format("Opus decode -> result < 0 : %d", result));
            return null;
        }

        short[] audio = new short[result];
        decoded.get(audio);
        return audio;
    }

    public void shutdown() {
        if (opusDecoder != null) {
            Opus.INSTANCE.opus_decoder_destroy(opusDecoder);
            opusDecoder = null;
        }
    }
}

The following runnable is executed to play the audio stream (I extract as many frames as I can, decode them, and feed them to the audio output):

Runnable task = () -> {
    while (m_taskRunFlag.get()) {
        if (m_audioBuffer.getSize() >= OpusSettings.OPUS_ENCODED_1_SEC_FRAME_LENGTH_IN_BYTES) {
            final var numOfFrames = m_audioBuffer.getSize()/OpusSettings.OPUS_ENCODED_20_MS_FRAME_LENGTH_IN_BYTES;
            for (int frameId = 0; frameId < numOfFrames; frameId++) {
                final var audioFrame = ArrayUtils.toPrimitive(
                        m_audioBuffer.pop(OpusSettings.OPUS_ENCODED_20_MS_FRAME_LENGTH_IN_BYTES).toArray(Byte[]::new)
                );
                final var decodedFrame = opusDecoder.decodeFromOpus(ByteBuffer.wrap(audioFrame));
                writeDataToAudioOutput(shortToByteArray(decodedFrame));
            }
        } else {
            try {
                Thread.sleep(1_000);
            }catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
};

As a result, I get only white noise. Does anyone have an idea of what could be wrong here? I am confident that the audio devices are handled correctly: if I remove the Opus encoding and decoding, I hear my voice clearly.

1 Answer

Answered by Dmitry

OK, to whom it may concern: the problem was related to byte ordering. After byte-swapping the short samples returned by opus_decode(...) with Short.reverseBytes, I got the right sound!
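
A minimal sketch of what that fix might look like, assuming the decoded samples are held in a short[]. The class and method names are hypothetical, and the project's own shortToByteArray helper is not shown in the question, so the second method is only an illustration of an alternative that writes the output bytes in an explicit order instead of swapping afterwards:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class SampleOrder {
    // Swaps the two bytes of every sample in place with Short.reverseBytes.
    public static void swapBytes(short[] samples) {
        for (int i = 0; i < samples.length; i++) {
            samples[i] = Short.reverseBytes(samples[i]);
        }
    }

    // Alternative: serialize samples low byte first, matching an
    // AudioFormat created with bigEndian = false.
    public static byte[] toLittleEndianBytes(short[] samples) {
        ByteBuffer out = ByteBuffer.allocate(samples.length * 2)
                .order(ByteOrder.LITTLE_ENDIAN);
        out.asShortBuffer().put(samples);
        return out.array();
    }
}
```

Either approach should be applied only once along the path; swapping in both the sample array and the byte serialization would cancel out.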