Slurmd daemon start error: Couldn't find the specified plugin name for cgroup/v2 looking at all files

I came back to my server nodes after a couple of months away, and now the slurmd daemon won't start on any of them. slurmctld is working fine. I have a cgroup.conf file in the Slurm directory. Here is the config file:

    #CgroupAutomount=yes
    ConstrainCores=no
    ConstrainRAMSpace=no

I get the same error whether the plugin is set to cgroup/v2 or the plugin line is commented out and only CgroupAutomount=yes is set.

Here is the error output:

    Couldn't find the specified plugin name for cgroup/v2 looking at all files
    slurmd[587248]: slurmd: error: cannot find cgroup plugin for cgroup/v2
    slurmd[587248]: slurmd: error: cannot create cgroup context for cgroup/v2
    slurmd[587248]: slurmd: error: Unable to initialize cgroup plugin
    slurmd[587248]: slurmd: error: slurmd initialization failed 
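
The "looking at all files" message suggests slurmd searched its plugin directory and found no matching file. A minimal sketch for checking this by hand, assuming plugins are shared objects named `<type>_<name>.so` (so `cgroup/v2` maps to `cgroup_v2.so`); the helper names and the directory passed in are hypothetical, and the real directory is whatever `scontrol show config` reports as `PluginDir`:

```python
from pathlib import Path


def plugin_file(plugin_dir: str, plugin: str) -> Path:
    """Map a plugin name such as 'cgroup/v2' to the file slurmd would look for."""
    # Assumption: plugin files follow the <type>_<name>.so naming pattern.
    return Path(plugin_dir) / (plugin.replace("/", "_") + ".so")


def plugin_installed(plugin_dir: str, plugin: str) -> bool:
    """Return True if the expected shared object exists in plugin_dir."""
    return plugin_file(plugin_dir, plugin).is_file()
```

For example, `plugin_installed("/usr/lib64/slurm", "cgroup/v2")` (the path is only an illustration) returning `False` would point to a Slurm build or package that does not ship the v2 plugin, rather than to a mistake in cgroup.conf.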

I previously had the cgroup plugin set to v1, but was getting this error:

    slurmd[1535]: slurmd: CPU frequency setting not configured for this node
    slurmd[1535]: slurmd: error: unable to open '/sys/fs/cgroup/freezer//tasks' for reading : No such file or directory
    slurmd[1535]: slurmd: error: cgroup namespace 'freezer' not mounted. aborting
    slurmd[1535]: slurmd: error: unable to create freezer cgroup namespace
    slurmd: error: Couldn't load specified plugin name for proctrack/cgroup: Plugin init() callback failed
    slurmd[1535]: slurmd: error: cannot create proctrack context for proctrack/cgroup
    slurmd[1535]: slurmd: error: slurmd initialization failed

So I switched to v2, hence my current error. Any suggestions or help is appreciated.
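
The missing `/sys/fs/cgroup/freezer` path may indicate that the node now mounts the unified (v2) hierarchy, which has no per-controller directories. A rough sketch for checking which layout a node uses, assuming the standard mount point `/sys/fs/cgroup`; the function name `cgroup_mode` is hypothetical:

```python
from pathlib import Path


def cgroup_mode(root: str = "/sys/fs/cgroup") -> str:
    """Guess which cgroup hierarchy is mounted at root."""
    base = Path(root)
    # A cgroup v2 (unified) mount exposes cgroup.controllers at its root.
    if (base / "cgroup.controllers").is_file():
        return "v2"
    # A v1 mount has one directory per controller, for example freezer.
    if (base / "freezer").is_dir():
        # systemd's hybrid layout adds a v2 mount under 'unified' next to v1.
        if (base / "unified" / "cgroup.controllers").is_file():
            return "hybrid"
        return "v1"
    return "unknown"


if __name__ == "__main__":
    print(cgroup_mode())
```

If this reports `v2` while cgroup.conf asks for `cgroup/v1`, freezer errors like the ones above would be expected, since the v1 plugin looks for a controller directory that a v2-only node does not provide.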

Update: I changed the config file to:

    CgroupPlugin=cgroup/v1
    CgroupAutomount=yes
    ConstrainCores=no
    ConstrainRAMSpace=no
    CgroupMountpoint=/sys/fs/cgroup

Now the daemon starts and shows as active; however, there are still some errors related to freezer.
