I implemented a simple caller and listener script which works in the following way:
- Listener waits for a call
- Caller initiates the call
- Caller plays a wav file (transmits from a `pj.AudioMediaPlayer` instance into `playback_media`)
- Listener routes `capture_media` into `playback_media`
- Caller routes `capture_media` into a `pj.AudioMediaRecorder`, outputting the looped sound into a wav file
Notes:
- Scripts are written in Python and use a default build of pjsua2 (no changes to the sources or options when building)
- Caller and listener are run on two separate virtual machines
- The reference wav audio is about 30 seconds long; it's a voice recording intended for voice quality testing purposes
- `playback_media` is obtained as `pj.Endpoint.instance().audDevManager().getPlaybackDevMedia()`
- `capture_media` is obtained as `pj.Endpoint.instance().audDevManager().getCaptureDevMedia()`
The idea of the whole setup, in words, is: the caller plays an audio file and the listener loops back the audio to the caller, who finally stores the looped audio into an audio file. All I want is a simple audio loop, with no enhancements or changes to the original sound.
The issue is that the audio recorded on the caller is of horrible quality: the overall volume is lowered, the first few seconds are even quieter than the rest, there is a slight echo throughout the recording, and it fades in and out for seemingly no reason (echo cancellation?). The fading is quick and leaves most of the recording so quiet that it is practically inaudible, with only about 4 stretches of audible sound, and even those are of abysmal quality.
I find the official pjsip/pjsua2 documentation absolutely useless for anything beyond the simplest example. I've tried disabling VAD, making the script single-threaded and multi-threaded, and finally changing the `EpConfig`'s `MediaConfig.quality` variable, but none of this helped.
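For reference, the options mentioned above map onto `MediaConfig` fields roughly as in the sketch below. It assumes the settings are reached through `EpConfig.medConfig` (fields `noVad`, `quality`, `ecTailLen`). The helper name `apply_media_settings` is hypothetical, and setting `ecTailLen` to 0 to switch off echo cancellation is an extra idea that was not among the things tried. A `SimpleNamespace` stands in for `pj.EpConfig()` so the snippet runs without pjsua2 installed.

```python
from types import SimpleNamespace


def apply_media_settings(ep_cfg, quality=10, disable_vad=True, ec_tail_len_ms=0):
    """Set media options on an EpConfig-like object (hypothetical helper)."""
    med = ep_cfg.medConfig
    med.noVad = disable_vad          # True disables voice activity detection
    med.quality = quality            # media quality, 1 (lowest) to 10 (highest)
    med.ecTailLen = ec_tail_len_ms   # assumption: 0 turns echo cancellation off
    return ep_cfg


# Stand-in object; with pjsua2 available this would be pj.EpConfig(),
# passed to Endpoint.libInit() after the settings are applied.
ep_cfg = apply_media_settings(SimpleNamespace(medConfig=SimpleNamespace()))
```

These values have to be set before the endpoint is initialized; changing them afterwards is not expected to affect a running call.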
The question is: What can I do to make sure that the original audio is transferred, looped-back and stored as-is, without any changes to the quality or characteristics of the recorded audio? This is a simple, short call which only loops a wav file, nothing more.
I tried changing some basic options available through the Python `EpConfig` interface.
I tried switching up the playback and capture media devices.
I've solved the audio quality issue by not using the default media devices.
Instead, I replaced the `playback_device` and `capture_device` on both ends with `audio_device` obtained as

Now, the caller can

where `self.recorder` is a `pj.AudioMediaRecorder` writing to the final output audio file and `self.player` is a `pj.AudioMediaPlayer` playing the reference audio file (this audio is sent to the listener, who loops it back to the caller).

Then, the listener can

to loop the audio back to the caller.
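A minimal sketch of the routing described above, under stated assumptions. It assumes `audio_device` is the call's own audio media (for instance, what `call.getAudioMedia(-1)` returns once the media is active in `onCallMediaState`); that origin is a guess, not something the text confirms. The helper names `wire_caller` and `wire_listener` and the `FakeMedia` stand-in are hypothetical and exist only so the snippet runs without pjsua2. With the real library, the same `startTransmit` calls would be made on `pj.AudioMedia` objects.

```python
class FakeMedia:
    """Stand-in for a pj.AudioMedia object; records where it transmits."""

    def __init__(self, name):
        self.name = name
        self.sinks = []

    def startTransmit(self, sink):
        # The real method connects this media to the sink in the conference bridge.
        self.sinks.append(sink.name)


def wire_caller(audio_device, player, recorder):
    """Caller side: send the wav file into the call, record what comes back."""
    player.startTransmit(audio_device)    # reference audio -> remote party
    audio_device.startTransmit(recorder)  # received audio -> output wav file


def wire_listener(audio_device):
    """Listener side: feed the received audio straight back to the caller."""
    audio_device.startTransmit(audio_device)


caller_media = FakeMedia("call")
player = FakeMedia("player")
recorder = FakeMedia("recorder")
wire_caller(caller_media, player, recorder)

listener_media = FakeMedia("call")
wire_listener(listener_media)
```

Routed this way, the audio never passes through a sound device on either virtual machine, which would be consistent with the improvement reported here.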
The final output audio isn't completely perfect regardless of the "quality" setting or VAD, but it is good enough for my purpose. There is only a slight glitch at the start of the recording, and the rest is of proper volume.
It seems that pjsip tries to reduce call latency by changing the playback speed, and the filters introduce some changes to the transmitted audio (plus some noise is present), but again, these don't degrade the quality too much and I can accept the way it works now.