ffmpeg headphone convolution audio filter with non-standard channel layouts


I am successfully using ffmpeg's headphone convolution audio filter (see ffmpeg filters documentation section 8.78) to make binaural versions of surround sound with up to 16 channels, as long as the channel layouts are recognized by ffmpeg.

I have 32 IRs (16 stereo pairs) in one file and can use it for:

Stereo, 5.1, 7.1, and 7.1.4.4 (16 channels).

The order of the pairs in the IR file is:

FL|FR|FC|LFE|BL|BR|SL|SR|FLC|FRC|TFL|TFR|BC|TC|TBL|TBR

So all of those formats are subsets of the IR file, and in order.
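
The pair order above can be written down as a small lookup. This is only a sketch in Python (not part of the batch file); the helper names `ir_channels` and `is_in_order_subset` are made up for illustration. It assumes, as stated above, that pair k occupies channels 2k and 2k+1 of the IR file.

```python
# IR pair order exactly as listed above.
IR_ORDER = "FL|FR|FC|LFE|BL|BR|SL|SR|FLC|FRC|TFL|TFR|BC|TC|TBL|TBR".split("|")


def ir_channels(name):
    """Return the two IR-file channel indices that hold one speaker's stereo pair."""
    k = IR_ORDER.index(name)  # raises ValueError if the file has no IR for `name`
    return 2 * k, 2 * k + 1


def is_in_order_subset(layout):
    """True if every channel in `layout` has an IR and appears in IR-file order."""
    if any(ch not in IR_ORDER for ch in layout):
        return False
    positions = [IR_ORDER.index(ch) for ch in layout]
    # Strictly increasing positions means "a subset, and in order".
    return all(a < b for a, b in zip(positions, positions[1:]))


if __name__ == "__main__":
    print(ir_channels("TFL"))  # (20, 21)
    print(is_in_order_subset(["FL", "FR", "FC", "LFE", "BL", "BR"]))
```

For example, `ir_channels("TFL")` gives `(20, 21)`, which is the pair of channels a pan filter would have to pick out of the 32-channel file for the top front left speaker.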

Surround input files with more than 8 channels have their channels assigned in their headers by previously applied ffmpeg commands, such as:

ffmpeg.exe" -y -i %%x -acodec pcm_s24le  -af "pan=7.1+TFL+TFR+TBL+TBR|FL=c0|FR=c1|FC=c2|LFE=c3|SL=c6|SR=c7|BL=c4|BR=c5|TFL=c8|TFR=c9|TBL=c10|TBR=c11" "%%~nx_12ch_mapped.wav"

for 12 channels and:

ffmpeg.exe" -y -i %%x -acodec pcm_s24le -filter_complex "pan=FL+FR+FC+LFE+BL+BR+SL+SR+FLC+FRC+TFL+TFR+BC+TC+TBL+TBR|FL=c0|FR=c1|FC=c2|LFE=c3|BL=c4|BR=c5|SL=c6|SR=c7|FLC=c8|FRC=c9|TFL=c10|TFR=c11|BC=c12|TC=c13|TBL=c14|TBR=c15" "%%~nx_16ch_mapped.wav"

for 16 channels.

Here is a working convolution command. It works for stereo, 5.1, 7.1, and 16-channel surround input files; only the gain needs changing for different input channel counts:

ffmpeg.exe" -hide_banner -y -i "%%~nx_48KHz.wav" -i "%parent%\My-IR.wav" -filter_complex "[1:0]pan=stereo|c0=c0|c1=c1[fl];[1:0]pan=stereo|c0=c2|c1=c3[fr];[1:0]pan=stereo|c0=c4|c1=c5[fc];[1:0]pan=stereo|c0=c6|c1=c7[lfe];[1:0]pan=stereo|c0=c8|c1=c9[bl];[1:0]pan=stereo|c0=c10|c1=c11[br];[1:0]pan=stereo|c0=c12|c1=c13[sl];[1:0]pan=stereo|c0=c14|c1=c15[sr];[1:0]pan=stereo|c0=c16|c1=c17[flc];[1:0]pan=stereo|c0=c18|c1=c19[frc];[1:0]pan=stereo|c0=c20|c1=c21[tfl];[1:0]pan=stereo|c0=c22|c1=c23[tfr];[1:0]pan=stereo|c0=c24|c1=c25[bc];[1:0]pan=stereo|c0=c26|c1=c27[tc];[1:0]pan=stereo|c0=c28|c1=c29[tbl];[1:0]pan=stereo|c0=c30|c1=c31[tbr];[0:a][fl][fr][fc][lfe][bl][br][sl][sr][flc][frc][tfl][tfr][bc][tc][tbl][tbr]headphone=gain=32:map=FL|FR|FC|LFE|BL|BR|SL|SR|FLC|FRC|TFL|TFR|BC|TC|TBL|TBR[fh]" -map [fh]:a  "%%~nx_h_binaural.flac"

This gives the following output:

Guessed Channel Layout for Input Stream #0.0 : hexadecagonal
Input #0, wav, from 'Black Hole Sun_7.1.4.4_16ch_mapped_48KHz.wav':
  Duration: 00:05:19.19, bitrate: 24576 kb/s
    Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, hexadecagonal, flt, 24576 kb/s
Input #1, wav, from 'D:\Google Drive\16ch Virtual_Surround\\My-IR.wav':
  Duration: 00:00:01.00, bitrate: 49194 kb/s
    Stream #1:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, 32 channels, flt, 49152 kb/s
Stream mapping:
  Stream #0:0 (pcm_f32le) -> headphone:in0
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  headphone -> Stream #0:0 (flac)
Press [q] to stop, [?] for help
[Parsed_pan_0 @ 000001c26da4b180] Pure channel mapping detected: 0 1
[Parsed_pan_1 @ 000001c26da4b680] Pure channel mapping detected: 2 3
[Parsed_pan_2 @ 000001c26da4b780] Pure channel mapping detected: 4 5
[Parsed_pan_3 @ 000001c26da4bc80] Pure channel mapping detected: 6 7
[Parsed_pan_4 @ 000001c26da4ab80] Pure channel mapping detected: 8 9
[Parsed_pan_5 @ 000001c26da4b280] Pure channel mapping detected: 10 11
[Parsed_pan_6 @ 000001c26da4a280] Pure channel mapping detected: 12 13
[Parsed_pan_7 @ 000001c26da4b980] Pure channel mapping detected: 14 15
[Parsed_pan_8 @ 000001c26da4af80] Pure channel mapping detected: 16 17
[Parsed_pan_9 @ 000001c26da4b880] Pure channel mapping detected: 18 19
[Parsed_pan_10 @ 000001c26da4b080] Pure channel mapping detected: 20 21
[Parsed_pan_11 @ 000001c26da4be80] Pure channel mapping detected: 22 23
[Parsed_pan_12 @ 000001c26da4ba80] Pure channel mapping detected: 24 25
[Parsed_pan_13 @ 000001c26da4a180] Pure channel mapping detected: 26 27
[Parsed_pan_14 @ 000001c26da4b380] Pure channel mapping detected: 28 29
[Parsed_pan_15 @ 000001c26da4a580] Pure channel mapping detected: 30 31
[flac @ 000001c26d605440] encoding as 24 bits-per-sample
Output #0, flac, to 'Black Hole Sun_7.1.4.4_16ch_mapped_h_binaural.flac':
  Metadata:
    encoder         : Lavf58.33.100
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s32 (24 bit), 128 kb/s (default)
    Metadata:
      encoder         : Lavc58.59.102 flac
size=   64546kB time=00:05:19.18 bitrate=1656.6kbits/s speed=3.12x
video:0kB audio:64538kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.012541%
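
The long -filter_complex string in the working command follows a regular pattern: one pan per IR pair, then a single headphone filter. As a sketch only, the Python below rebuilds that string from a list of channel names. The function name `headphone_graph` and its parameters are hypothetical, and the script only produces text; it does not run ffmpeg or check whether ffmpeg accepts the resulting graph.

```python
# IR pair order in the 32-channel IR file, as listed earlier in the post.
IR_ORDER = "FL|FR|FC|LFE|BL|BR|SL|SR|FLC|FRC|TFL|TFR|BC|TC|TBL|TBR".split("|")


def headphone_graph(layout, gain, ir_input="1:0"):
    """Build the filtergraph text: one stereo pan per channel, then headphone."""
    parts = []
    for name in layout:
        k = IR_ORDER.index(name)  # position of this speaker's pair in the IR file
        parts.append(
            f"[{ir_input}]pan=stereo|c0=c{2 * k}|c1=c{2 * k + 1}[{name.lower()}]"
        )
    # The headphone filter takes the audio input first, then the IR pairs in map order.
    labels = "".join(f"[{name.lower()}]" for name in layout)
    channel_map = "|".join(layout)
    parts.append(f"[0:a]{labels}headphone=gain={gain}:map={channel_map}[fh]")
    return ";".join(parts)


if __name__ == "__main__":
    # Reproduces the graph used in the 16-channel command above.
    print(headphone_graph(IR_ORDER, gain=32))
```

Called with all 16 names and gain=32, it returns the same text as the graph in the command above. Because each pan indexes the IR file by position in `IR_ORDER`, passing a shorter list of names picks the matching pairs from the same 32-channel file.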

However, for channel layouts that need to skip some of the IRs, such as 4.0, 5.0, and 7.1.4 (12 channels), I can't figure out how to make it work. In the 12-channel case, I even cut the IR file down to 24 channels (12 pairs) and tried this:

ffmpeg.exe" -hide_banner -y -i "%%~nx_48KHz.wav" -i "%parent%\IR-Files\A16_12ch_IR 12_Pairs_A16_Order.wav" -filter_complex "[1:0]pan=stereo|c0=c0|c1=c1[fl];[1:0]pan=stereo|c0=c2|c1=c3[fr];[1:0]pan=stereo|c0=c4|c1=c5[fc];[1:0]pan=stereo|c0=c6|c1=c7[lfe];[1:0]pan=stereo|c0=c8|c1=c9[bl];[1:0]pan=stereo|c0=c10|c1=c11[br];[1:0]pan=stereo|c0=c12|c1=c13[sl];[1:0]pan=stereo|c0=c14|c1=c15[sr];[1:0]pan=stereo|c0=c16|c1=c17[tfl];[1:0]pan=stereo|c0=c18|c1=c19[tfr];[1:0]pan=stereo|c0=c20|c1=c21[tbl];[1:0]pan=stereo|c0=c22|c1=c23[tbr];[0:a][fl][fr][fc][lfe][bl][br][sl][sr][tfl][tfr][tbl][tbr]headphone=gain=10:map=FL|FR|FC|LFE|BL|BR|SL|SR|TFL|TFR|TBL|TBR[fh]" -map [fh]:a  "%%~nx_h_binaural.flac"

But it gives the following output:

Input #0, wav, from 'Mistral Wind m_12ch_mapped norm 0dB_48KHz.wav':
  Duration: 00:07:26.43, bitrate: 18432 kb/s
    Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, 12 channels, flt, 18432 kb/s
Input #1, wav, from 'D:\Google Drive\16ch Virtual_Surround\\IR-Files\A16_12ch_IR 12_Pairs_A16_Order.wav':
  Duration: 00:00:01.00, bitrate: 36866 kb/s
    Stream #1:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, 24 channels, flt, 36864 kb/s
Stream mapping:
  Stream #0:0 (pcm_f32le) -> headphone:in0
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  Stream #1:0 (pcm_f32le) -> pan
  headphone -> Stream #0:0 (flac)
Press [q] to stop, [?] for help
[auto_resampler_0 @ 000002a4ffb4f980] Cannot select channel layout for the link between filters auto_resampler_0 and Parsed_headphone_12.
[auto_resampler_0 @ 000002a4ffb4f980] Unknown channel layouts not supported, try specifying a channel layout using 'aformat=channel_layouts=something'.
Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while processing the decoded data for stream #0:0
Conversion failed!

I can't figure out where to put the suggested aformat=channel_layouts=something.

Suggestions?
