discord.py music bot slowing down for longer audio queries

557 Views Asked by At

So I'm trying to make a music bot with discord.py. Shown below is a minimum working example of the bot with the problematic functions:

import os

import discord
from discord.ext import commands
from discord import player as p

import yt_dlp as youtube_dl

intents = discord.Intents.default()
intents.members = True

bot = commands.Bot(command_prefix=';')

class Music(commands.Cog):
    def __init__(self, bot):
        self.bot = bot
        self.yt-dlp_opts = {
            'format': 'bestaudio/best',
            'outtmpl': '%(extractor)s-%(id)s-%(title)s.%(ext)s',
            'restrictfilenames': True,
            'noplaylist': True,
            'playlistend': 1,
            'nocheckcertificate': True,
            'ignoreerrors': False,
            'logtostderr': False,
            'quiet': True,
            'no_warnings': True,
            'default_search': 'auto',
            'source_address': '0.0.0.0', # bind to ipv4 since ipv6 addresses cause issues sometimes
        }
        self.ffmpeg_opts = {
            'options': '-vn',
            # Source: https://stackoverflow.com/questions/66070749/
            "before_options": "-reconnect 1 -reconnect_streamed 1 -reconnect_delay_max 5",
        }
        self.cur_stream = None
        self.cur_link = None

    @commands.command(aliases=["p"])
    async def play(self, ctx, url):
        yt-dlp = youtube_dl.YoutubeDL(self.ytdl_opts)
        data = yt-dlp.extract_info(url, download=False)
        filename = data['url']  # So far only works with links
        print(filename)
        audio = p.FFmpegPCMAudio(filename, **self.ffmpeg_opts)
        self.cur_stream = audio
        self.cur_link = filename

        # You must be connected to a voice channel first
        await ctx.author.voice.channel.connect()
        ctx.voice_client.play(audio)
        await ctx.send(f"now playing")

    @commands.command(aliases=["ff"])
    async def seek(self, ctx):
        """
        Fast forwards 10 seconds
        """
        ctx.voice_client.pause()
        for _ in range(500):
            self.cur_stream.read()  # 500*20ms of audio = 10000ms = 10s
        ctx.voice_client.resume()

        await ctx.send(f"fast forwarded 10 seconds")

    @commands.command(aliases=["j"])
    async def jump(self, ctx, time):
        """
        Jumps to a time in the song, input in the format of HH:MM:SS
        """
        ctx.voice_client.stop()
        temp_ffempg = {
            'options': '-vn',
            # Keyframe skipping when passed as an input option (fast)
            "before_options": f"-ss {time} -reconnect 1 -reconnect_streamed 1 -reconnect_delay_max 5",
        }
        new_audio = p.FFmpegPCMAudio(self.cur_link, **temp_ffempg)
        self.cur_stream = new_audio
        ctx.voice_client.play(new_audio)
        await ctx.send(f"skipped to {time}")


bot.add_cog(Music(bot))
bot.run(os.environ["BOT_TOKEN"])

My requirements.txt file:

discord.py[voice]==1.7.3
yt-dlp==2021.9.2

To play a song in Discord the following format is used:

;p <link>

Where <link> is any link that yt-dlp supports. Under normal circumstances, the ;p command is used with songs that are relatively short, to which seek() and jump() work extremely quickly to do what they are supposed to do. For example if I execute these sequence of commands in Discord:

;p https://www.youtube.com/watch?v=n8X9_MgEdCg  <- 4 min song

And when the bot starts playing, spam the following:

;ff
;ff
;ff
;ff
;ff

The bot is able to almost instantly seek five 10-second increments of the song. Additionally, I can jump to the three minute mark very quickly with:

;j 00:03:00

From some experimentation, the seek() and jump() functions seem to work quickly for songs that are under 10 minutes. If I try the exact same sequence of commands but with a 15 minute song like https://www.youtube.com/watch?v=Ks9Ck5LfGWE or longer https://www.youtube.com/watch?v=VThrx5MRJXA (10 hours classical music), there is an evident slowdown when running the ;ff command. However, when I include a few seconds of delay between firings of the ;ff command, the seeking is just as fast as previously mentioned. I'm not exactly sure what is going on with yt-dlp/FFmpeg behind the scenes when streaming, but I speculate that there is some sort of internal buffer, and songs that pass a certain length threshold are processed differently.

For longer songs, the seek() command takes longer to get to the desired position, which makes sense since this site specifies that -ss used as an input option loops through keyframes (as there must be more keyframes in longer songs). However, if the following commands are run in Discord:

;p https://www.youtube.com/watch?v=VThrx5MRJXA  <- 10 hour classical music
;j 09:00:00                                     <- jump to 9 hour mark
;j 00:03:00                                     <- jump to 3 minute mark

The first seek command takes around 5 to 10 seconds to perform a successful seek, which isn't bad, but it could be better. The second seek command takes around the same time as the first command, which doesn't make sense to me, because I thought less keyframes were skipped in order to reach the 3 minute mark.

So I'm wondering what's going on, and how to potentially solve the following:

  • What is actually going on with the seek() command? My implementation of seek() uses discord.py's discord.player.FFmpegPCMAudio.read() method, which apparently runs slower if the song's length is longer? Why?
  • Why does input seeking for long YouTube videos take almost the same time no matter where I seek to?
  • How the yt-dlp and FFmpeg commands work behind the scenes to stream a video from YouTube (or any other website that YTDL supports). Does yt-dlp and FFmpeg behave differently for audio streams above a certain length threshold?
  • Potential ways to speed up seek() and jump() for long songs. I recall some well-known discord music bots were able to do this very quickly.
0

There are 0 best solutions below