I'm capturing system audio using ScreenCaptureKit, and combining it with microphone audio to produce a single stream.
However, I'm having trouble synchronizing the two streams.
Here's what my AVAudioEngine graph looks like:
                                              muted
inputNode --> converterNode --> mixerNode --> mainMixerNode --> outputNode
                                    |              |
playerNode -----------------------> +              +--> tap --> file
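For reference, the wiring behind that diagram looks roughly like this. This is a sketch: the formats, file path, and buffer size are assumptions, errors are force-tried for brevity, and I've installed the tap on mixerNode, upstream of the muted main mixer, so the mute doesn't zero the recorded signal:

```swift
import AVFoundation

let engine = AVAudioEngine()
let converterNode = AVAudioMixerNode()
let mixerNode = AVAudioMixerNode()
let playerNode = AVAudioPlayerNode()

engine.attach(converterNode)
engine.attach(mixerNode)
engine.attach(playerNode)

// Common format both paths are converted to before mixing (assumed 48 kHz stereo).
let mixFormat = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)!

// Mic path: inputNode -> converterNode -> mixerNode. The sample-rate
// conversion happens across the differing connection formats.
engine.connect(engine.inputNode, to: converterNode,
               format: engine.inputNode.outputFormat(forBus: 0))
engine.connect(converterNode, to: mixerNode, format: mixFormat)

// System-audio path: playerNode -> mixerNode.
engine.connect(playerNode, to: mixerNode, format: mixFormat)

// mixerNode -> mainMixerNode -> outputNode, with the main mixer muted so
// nothing reaches the speakers while recording.
engine.connect(mixerNode, to: engine.mainMixerNode, format: mixFormat)
engine.mainMixerNode.outputVolume = 0

// Tap writes the mixed stream to a file.
let file = try! AVAudioFile(forWriting: URL(fileURLWithPath: "/tmp/mix.caf"),
                            settings: mixFormat.settings)
mixerNode.installTap(onBus: 0, bufferSize: 4096, format: mixFormat) { buffer, _ in
    try? file.write(from: buffer)
}

try! engine.start()
playerNode.play()
```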
When ScreenCaptureKit outputs an audio buffer, I convert it to AVAudioPCMBuffer and feed that to my AVPlayerNode:
// SCStreamOutput
func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of outputType: SCStreamOutputType) {
    guard outputType == .audio else { return }  // ignore screen (video) buffers
    guard let pcmBuffer = convertToPCM(sampleBuffer) else { return }
    playerNode.scheduleBuffer(pcmBuffer, completionCallbackType: .dataConsumed, completionHandler: nil)
}
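For context, convertToPCM follows the usual no-copy pattern for wrapping a CMSampleBuffer's audio in an AVAudioPCMBuffer, roughly like this (a sketch; it assumes the stream's audio is in a layout that AVAudioFormat(standardFormatWithSampleRate:channels:) can describe):

```swift
import AVFoundation

// Wrap the sample buffer's audio data in an AVAudioPCMBuffer without copying.
// Note: the returned buffer aliases the CMSampleBuffer's memory, so the
// sample buffer must stay alive as long as the PCM buffer is in use.
func convertToPCM(_ sampleBuffer: CMSampleBuffer) -> AVAudioPCMBuffer? {
    try? sampleBuffer.withAudioBufferList { audioBufferList, _ in
        guard let asbd = sampleBuffer.formatDescription?.audioStreamBasicDescription,
              let format = AVAudioFormat(standardFormatWithSampleRate: asbd.mSampleRate,
                                         channels: asbd.mChannelsPerFrame)
        else { return nil }
        return AVAudioPCMBuffer(pcmFormat: format,
                                bufferListNoCopy: audioBufferList.unsafePointer)
    }
}
```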
The problem is that the two streams drift out of sync, because each requires a processing step before it reaches mixerNode: a buffer-format conversion for the system audio, and a sample-rate conversion for the microphone audio. The offset is audible because the microphone also picks up the system audio, which produces an echo.
How can I synchronize the two audio streams? ScreenCaptureKit has its own synchronizationClock, but it's not clear how this might interface with AVAudioEngine.
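Concretely, this is the kind of bridging I'm picturing but can't confirm is right: rebase each buffer's presentation timestamp from the synchronizationClock onto the mach host clock, and schedule at that host time instead of "as soon as possible" (hypothetical sketch, inside the SCStreamOutput callback above):

```swift
// Hypothetical: rebase the buffer's presentation timestamp from the stream's
// synchronizationClock onto the host clock, then schedule at that host time.
let pts = sampleBuffer.presentationTimeStamp
if let syncClock = stream.synchronizationClock {
    let hostCMTime = CMSyncConvertTime(pts, from: syncClock, to: CMClockGetHostTimeClock())
    let when = AVAudioTime(hostTime: CMClockConvertHostTimeToSystemUnits(hostCMTime))
    playerNode.scheduleBuffer(pcmBuffer, at: when,
                              completionCallbackType: .dataConsumed,
                              completionHandler: nil)
}
```

Is something along these lines the intended way to relate the two clocks, or does AVAudioEngine expect a different reference?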