Druid - Failed to get storage slot due to error [Unable to pick a free slot, this should never happen]


Tasks are failing in Druid with the following error. The tasks are not even scheduled, and the MiddleManager is being restarted.

Failed to get storage slot due to error [Unable to pick a free slot, this should never happen]

There is enough storage available on the host.

Could someone please explain what this error means and when it occurs?


There are 2 best solutions below


This mainly happens when your Historical/MiddleManager configuration does not match the storage actually available on the local server.

Please check the setting below in the MiddleManager's runtime.properties file and reduce its value.

druid.worker.capacity

Also, verify druid.segmentCache.locations in the Historical's runtime.properties, including the maxSize specified there.
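
For reference, a minimal sketch of where these two settings live. The capacity, path, and maxSize values are illustrative assumptions, not recommendations; adjust them to match what the host actually has.

```properties
# MiddleManager runtime.properties
# Illustrative value only: the number of task slots this MiddleManager offers.
druid.worker.capacity=2

# Historical runtime.properties
# Illustrative path and size only: maxSize is in bytes and should not exceed
# the space really available at that path.
druid.segmentCache.locations=[{"path":"/var/druid/segment-cache","maxSize":300000000000}]
```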


A point of clarification: druid.worker.capacity is a configuration setting that tells Druid how many task slots are available in a given MiddleManager. It does not tell Druid how much storage is available.

While it can be set manually, it is calculated automatically on start-up of the MiddleManager process from the number of CPUs on the machine (by default, that number minus one).

Nowadays, if capacity comes up, I just look at:

  • Adding another MiddleManager process entirely.
  • Increasing the number of CPUs on the MiddleManager box.

Re: the original error, do you also see:

Failed to get directory for task [index_kafka_something_somethingsomething_darkside], cannot schedule.

I've seen this crop up when people are running multiple MiddleManagers (or Indexers) on the same hardware, for example when they have tiered ingestion enabled. In that case, the solution was to check druid.indexer.task.baseDir and ensure that each MiddleManager running on the node has its own unique baseDir.
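
As a rough sketch of that fix, assuming two MiddleManagers on one node, each with its own runtime.properties. The directory paths are hypothetical; the only point is that the two values differ.

```properties
# runtime.properties for the first MiddleManager on the node
# (hypothetical path)
druid.indexer.task.baseDir=/var/druid/mm-tier1/task

# runtime.properties for the second MiddleManager on the same node
# (hypothetical path; must not be shared with the first)
druid.indexer.task.baseDir=/var/druid/mm-tier2/task
```

After changing the values, restart each MiddleManager so that it picks up its own directory.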