How to process audio from the mic and play it back in real-time?

I'm working on an iOS app using the AVFoundation framework for real-time audio processing. The app captures audio from the microphone, takes each buffer, passes it to a function that returns a modified buffer and then plays it back.

However, I'm facing a significant delay of around 500 ms between speaking into the microphone and hearing the playback. At first I thought the delay was caused by the processing time, but the same thing happens even if I remove the processing and just play back the original buffer.

Here's my setup:

import SwiftUI
import AVFoundation

class MicTestAudioKitService: MicTestRepository {
    private let engine = AVAudioEngine()
    private let playerNode = AVAudioPlayerNode()
    
    // ------------------
    
    @Injected private var pitchCorrectionService: PitchCorrectionRepository
    
    // ------------------
    
    private var isInitialized = false
    private var pitchCorrectionIntensity: Float = 0.5
    
    // ------------------
    
    private func initialize() {
        guard !isInitialized else { return }
        isInitialized.toggle()
        
        let inputNode = engine.inputNode
        let format = inputNode.inputFormat(forBus: 0)
        
        // Attach and connect the playerNode
        engine.attach(playerNode)
        engine.connect(playerNode, to: engine.mainMixerNode, format: format)
    }
    
    func start() {
        initialize()
        
        let inputNode = engine.inputNode
        let format = inputNode.inputFormat(forBus: 0)
        
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { [weak self] buffer, _ in
            guard let self else { return }

            let processedBuffer = pitchCorrectionService.pitchCorrect(buffer: buffer, intensity: pitchCorrectionIntensity) ?? buffer
            outputAudioBuffer(processedBuffer)
        }
        
        do {
            try engine.start()
        } catch {
            print("Audio Engine failed to start: \(error)")
        }
    }
    
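    func stop() {
        // Remove the tap so that a later start() can install a fresh one
        engine.inputNode.removeTap(onBus: 0)
        playerNode.stop()
        engine.stop()
    }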
    
    func setPitchCorrectionIntensity(_ intensity: Float) {
        pitchCorrectionIntensity = intensity
    }
}

extension MicTestAudioKitService {
    private func outputAudioBuffer(_ buffer: AVAudioPCMBuffer) {
        playerNode.scheduleBuffer(buffer, completionHandler: nil)
        
        if !playerNode.isPlaying {
            playerNode.play()
        }
    }
}

The same thing happens even with wired earphones.

Any ideas?

There is 1 best solution below

This is what I do for a similar application, and it works great. There is still some latency, but I think it's less than 100 ms.

To monitor the input:

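// Assumes `engine` is an AVAudioEngine that is kept alive elsewhere
let mixer = AVAudioMixerNode()
let audioInputNode = engine.inputNode
let inputFormat = audioInputNode.outputFormat(forBus: 0)
engine.attach(mixer)
// Route the microphone straight through the mixer to the output
engine.connect(mixer, to: engine.outputNode, format: nil)
engine.connect(audioInputNode, to: mixer, format: inputFormat)
engine.prepare()
// engine.start() (which can throw) must still be called before audio flows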
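
If the audio has to be processed on the way through, one option is to put an effect node into the same graph instead of tapping and rescheduling buffers. The following is only a minimal sketch, assuming iOS: the function name `makeProcessedMonitoringEngine` and the parameter `pitchCents` are hypothetical, the stock `AVAudioUnitTimePitch` stands in for the custom pitch correction in the question, and the 5 ms I/O buffer duration is just a request that the system may not honor. The effect adds some processing delay of its own, so measure on a device.

```swift
import AVFoundation

/// Hypothetical helper (not from the answer above): builds
/// input -> AVAudioUnitTimePitch -> main mixer and starts the engine.
/// The caller must keep a strong reference to the returned engine.
func makeProcessedMonitoringEngine(pitchCents: Float) throws -> AVAudioEngine {
    // Ask for simultaneous input and output and a small hardware buffer.
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .measurement, options: [])
    try session.setPreferredIOBufferDuration(0.005) // a request, not a guarantee
    try session.setActive(true)

    let engine = AVAudioEngine()
    let input = engine.inputNode
    let format = input.outputFormat(forBus: 0)

    // Stand-in effect; pitch is expressed in cents (100 cents = 1 semitone).
    let timePitch = AVAudioUnitTimePitch()
    timePitch.pitch = pitchCents

    // Connect the nodes directly so no tap or player node sits in the path.
    engine.attach(timePitch)
    engine.connect(input, to: timePitch, format: format)
    engine.connect(timePitch, to: engine.mainMixerNode, format: format)

    engine.prepare()
    try engine.start()
    return engine
}
```

Usage would be something like `let engine = try makeProcessedMonitoringEngine(pitchCents: 200)`, stored in a property. Use headphones, since monitoring the microphone through the speaker will feed back.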

To record the audio (I use a file, you can use a buffer instead):

let format = audioInputNode.outputFormat(forBus: 0)
let documentURL = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
var file: AVAudioFile?
do {
    file = try AVAudioFile(forWriting: documentURL.appendingPathComponent(recordingFileName), settings: format.settings)
} catch {
    print("Could not open file for writing: \(error)")
    handleErrors(theError: "Could not open recording file for writing")
}

// Write each captured buffer to the file; skipped if the file failed to open
audioInputNode.installTap(onBus: 0, bufferSize: 4096, format: format) { buffer, _ in
    try? file?.write(from: buffer)
}
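
To get a feel for how much delay the tap buffer size alone can account for, here is a rough sketch of the arithmetic. The helper `bufferDurationMs` is hypothetical and the 48 kHz sample rate is an assumption (check `format.sampleRate` on your device). Also note that the `bufferSize` passed to `installTap` is only a requested size; the documentation says the implementation may choose another one, so the buffers you actually receive can be longer.

```swift
import Foundation

/// Duration in milliseconds of one buffer of `frames` sample frames
/// at `sampleRate` Hz. Hypothetical helper for illustration only.
func bufferDurationMs(frames: Int, sampleRate: Double) -> Double {
    return Double(frames) / sampleRate * 1000.0
}

// 48 kHz is an assumed sample rate, not a value taken from the posts above.
let small = bufferDurationMs(frames: 1024, sampleRate: 48_000)
let large = bufferDurationMs(frames: 4096, sampleRate: 48_000)

print(String(format: "1024 frames: %.1f ms", small)) // 1024 frames: 21.3 ms
print(String(format: "4096 frames: %.1f ms", large)) // 4096 frames: 85.3 ms
```

A tap hands over audio only once a whole buffer has been captured, so playing tapped buffers through a player node adds at least one buffer duration on top of the hardware latency. That is fine for recording to a file, but for monitoring it is one reason to connect the nodes directly.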