FFmpeg-generated video: audio has 'glitches' when uploaded to YouTube


I've generated a voice-over with Azure AI Speech at 48 kHz and a 96 kbps bit rate, generated a video from some stock footage, and I'm trying to combine all of that with background music. The voice-over is generated per sentence, so that I know how long each sentence is and can include relevant video footage for it.

I'm using FFmpeg through the FFMpegCore NuGet package.

The problem

After the video is complete with background music, I play it on my computer and it's perfect (no audio glitches, the music keeps playing). But once it is uploaded to YouTube, it has 'breaks' in the music in between sentences (basically every time a new voice fragment starts).

Example: https://www.youtube.com/watch?v=ieNvQ2TNq44

The code

All of the footage is combined, mostly with FFMpeg.Join(string output, string[] videos). These video files also contain the voice-overs (one per sentence).

After that I try to add the music like this:

    string outputTimelineWithMusicPath = _workingDir + $@"\{videoTitle}_withmusic.mp4";
    FFMpegArguments
        .FromFileInput(inputVideoPath)
        .AddFileInput(musicPath)
        .OutputToFile(outputTimelineWithMusicPath, true, options => options
            .CopyChannel()
            .WithAudioCodec(AudioCodec.Aac)
            .WithAudioBitrate(AudioQuality.Good)
            .UsingShortest(true)
            .WithCustomArgument("-filter_complex \"[0:a]aformat=fltp:44100:stereo,apad[0a];[1]aformat=fltp:44100:stereo,volume=0.05[1a];[0a][1a]amerge[a]\" -map 0:v -map \"[a]\" -ac 2"))
        .ProcessSynchronously();

I've tried changing the custom argument, but so far without success.

For example, I thought that removing apad from the filter, so that no 'blank spots' are added, might fix the issue. I also tried using amix instead of amerge.
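
To make the variants easier to compare, they can be generated from one place. This is only a sketch: `MusicMixArguments.Build` and its parameters are hypothetical names for illustration, not part of FFMpegCore, and the filter strings are the ones from my snippets, with the volume formatted culture-independently so it never ends up as `0,05`.

```csharp
using System;
using System.Globalization;

public static class MusicMixArguments
{
    // Hypothetical helper: builds the string passed to WithCustomArgument.
    // padVoice toggles apad on the voice track; useAmix switches amerge to amix.
    public static string Build(bool padVoice, bool useAmix,
                               double musicVolume = 0.05, int sampleRate = 44100)
    {
        // Invariant culture keeps the decimal point as '.', which ffmpeg expects.
        string volume = musicVolume.ToString("0.###", CultureInfo.InvariantCulture);

        // Input 0 is the video with the voice-over, input 1 is the music.
        string voice = $"[0:a]aformat=fltp:{sampleRate}:stereo"
                       + (padVoice ? ",apad" : "") + "[0a]";
        string music = $"[1]aformat=fltp:{sampleRate}:stereo,volume={volume}[1a]";

        string combine = useAmix
            ? "[0a][1a]amix=inputs=2[a]"
            : "[0a][1a]amerge[a]";

        return $"-filter_complex \"{voice};{music};{combine}\" -map 0:v -map \"[a]\" -ac 2";
    }
}
```

Calling `MusicMixArguments.Build(padVoice: true, useAmix: false)` gives the apad/amerge argument, and `Build(padVoice: false, useAmix: true)` gives the amix variant without padding; either result goes into `WithCustomArgument(...)`.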

Last try

I first tried to make sure both files had the same sample rate, hoping that would fix the issue. So far, no success:

    string outputVideoVoicePath = _workingDir + $@"\{title}_voiceonly_formatting.mp4";
    string musicReplacePath = _workingDir + $@"\{title}_music_formatted.aac";
    FFMpegArguments
        .FromFileInput(inputVideoPath)
        .OutputToFile(outputVideoVoicePath, true, options => options
            .WithAudioCodec(AudioCodec.Aac)
            .WithAudioBitrate(128)
            .WithAudioSamplingRate(44100)
        )
        .ProcessSynchronously();
    
    FFMpegArguments
        .FromFileInput(music.FilePath)
        .OutputToFile(musicReplacePath, true, options => options
            .WithAudioCodec(AudioCodec.Aac)
            .WithAudioBitrate(256) //also tried 96 (which is original format)
            .WithAudioSamplingRate(44100)
        )
        .ProcessSynchronously();
    
    
    Console.WriteLine("Add music...");
    var videoTitle = Regex.Replace(title, "[^a-zA-Z]+", "");
    string outputTimelineWithMusicPath = _workingDir + $@"\{videoTitle}_withmusic.mp4";
    FFMpegArguments
        .FromFileInput(outputVideoVoicePath)
        .AddFileInput(musicReplacePath)
        .OutputToFile(outputTimelineWithMusicPath, true, options => options
            .CopyChannel()
            .WithAudioCodec(AudioCodec.Aac)
            .WithAudioBitrate(AudioQuality.Good)
            .UsingShortest(true)
            .WithCustomArgument("-filter_complex \"[0:a]aformat=fltp:44100:stereo[0a];[1]aformat=fltp:44100:stereo,volume=0.05[1a];[0a][1a]amix=inputs=2[a]\" -map 0:v -map \"[a]\" -ac 2"))
        .ProcessSynchronously();
    return outputTimelineWithMusicPath;

I'm not much of an expert when it comes to audio/video codecs. I do scale each stock video to 24 fps at 1920x1080, and the music has an original bit rate of 256 kbps at a 44100 Hz sample rate (so I probably don't even have to convert the audio file).
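
One way to check the formats rather than assume them would be to probe each file before mixing. The following is a minimal sketch, assuming FFMpegCore's `FFProbe.Analyse` and a reachable ffprobe binary; the class and method names (`AudioFormatCheck`, `Describe`) are made up for illustration.

```csharp
using System;
using FFMpegCore;

public static class AudioFormatCheck
{
    // Prints the codec, sample rate, channel layout and bit rate of the
    // first audio stream in the given file, plus the container duration.
    public static void Describe(string path)
    {
        var info = FFProbe.Analyse(path);
        var audio = info.PrimaryAudioStream;

        if (audio == null)
        {
            Console.WriteLine($"{path}: no audio stream found");
            return;
        }

        Console.WriteLine(
            $"{path}: {audio.CodecName}, {audio.SampleRateHz} Hz, " +
            $"{audio.ChannelLayout}, {audio.BitRate / 1000} kbps, " +
            $"duration {info.Duration}");
    }
}
```

Running `AudioFormatCheck.Describe(...)` on the joined video, on the music file, and on a single per-sentence clip would show whether the sample rates and channel layouts really match before the music is added.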
