I know that there are a lot of resources online explaining how to deinterleave PCM data. In the course of my current project I have looked at most of them...but I have no background in audio processing and I have had a very hard time finding a detailed explanation of how exactly this common form of audio is stored.
I do understand that my audio will have two channels and thus the samples will be stored in the format [left][right][left][right]... What I don't understand is what exactly this means. I have also read that each sample is stored in the format [left MSB][left LSB][right MSB][right LSB]. Does this mean the each 16 bit integer actually encodes two 8 bit frames, or is each 16 bit integer its own frame destined for either the left or right channel?
Thank you everyone. Any help is appreciated.
Edit: If you choose to give examples please refer to the following.
Method Context
Specifically what I have to do is convert an interleaved short[] to two float[]'s each representing the left or right channel. I will be implementing this in Java.
public static float[][] deinterleaveAudioData(short[] interleavedData) {
//initialize the channel arrays
float[] left = new float[interleavedData.length / 2];
float[] right = new float[interleavedData.length / 2];
//iterate through the buffer
for (int i = 0; i < interleavedData.length; i++) {
//THIS IS WHERE I DON'T KNOW WHAT TO DO
}
//return the separated left and right channels
return new float[][]{left, right};
}
My Current Implementation
I have tried playing the audio that results from this. It's very close, close enough that you could understand the words of a song, but is still clearly not the correct method.
public static float[][] deinterleaveAudioData(short[] interleavedData) {
//initialize the channel arrays
float[] left = new float[interleavedData.length / 2];
float[] right = new float[interleavedData.length / 2];
//iterate through the buffer
for (int i = 0; i < left.length; i++) {
left[i] = (float) interleavedData[2 * i];
right[i] = (float) interleavedData[2 * i + 1];
}
//return the separated left and right channels
return new float[][]{left, right};
}
Format
If anyone would like more information about the format of the audio the following is everything I have.
- Format is PCM 2 channel interleaved big endian linear int16
- Sample rate is 44100
- Number of shorts per short[] buffer is 2048
- Number of frames per short[] buffer is 1024
- Frames per packet is 1
Actually your are dealing with an almost typical WAVE file at Audio CD quality, that is to say :
I said almost because big-endianness is usually used in AIFF files (Mac world), not in WAVE files (PC world). And I don't know without searching how to deal with endianness in Java, so I will leave this part to you.
About how the samples are stored is quite simple:
Then to feed an audio callback, it is usually required to provide 32-bits floating point, ranging from -1 to +1. And maybe this is where something may be missing in your aglorithm. Dividing your integers by 32768 (2^(16-1)) should make it sound as expected.