I have an ASP.NET app (Framework 4.8) that occasionally hits 100% CPU usage for periods of a few milliseconds. Importantly, during these CPU spikes, and right before them, the app does not experience any burst in client RPS; it is actually serving only a couple of client requests when a spike begins.
Viewing a PerfView dump in WPA with the CPU Usage (Sampled) graph, I see that both the peaks of the CPU spikes and their slopes are filled with samples coming from the `Dequeue` and `TrySteal` methods.
Also, system metrics show that during the CPU load the app experiences a burst of in-use worker threads (measured as `ThreadPool.GetAvailableThreads` - `ThreadPool.GetMinThreads`), up to the number I set with `ThreadPool.SetMinThreads`. The machine has 16 cores, so I tested the app with pool floors of 2048 and 512 workers in total: 128 and 32 workers per core respectively.
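For reference, a minimal sketch of how the pool floor and the worker metric can be read (the `PoolMetrics` class and `BusyWorkers` helper are mine, not the app's code; note that the conventional in-use count is the pool maximum minus the available count):

```csharp
using System;
using System.Threading;

public static class PoolMetrics
{
    // Hypothetical helper, not from the app: the conventional way to
    // count in-use workers is the pool maximum minus what is available.
    public static int BusyWorkers()
    {
        ThreadPool.GetMaxThreads(out int maxWorkers, out _);
        ThreadPool.GetAvailableThreads(out int availableWorkers, out _);
        return maxWorkers - availableWorkers;
    }

    public static void Main()
    {
        // One of the floors tested in the question: 512 workers on a 16-core box.
        ThreadPool.SetMinThreads(512, 512);
        Console.WriteLine(PoolMetrics.BusyWorkers());
    }
}
```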
As of now, it looks like the CPU load is caused by a large number of worker threads trying to pick up work items when none are available: the workers waste CPU checking their local queues, then the global thread-pool queue, and then trying to steal work from other threads' local queues.
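To make the cost concrete, here is a simplified, self-contained sketch of that search pattern (not the actual CLR implementation; the real pool uses per-thread deques with LIFO local pops and randomized steal order): with all queues empty, a single search pass by one worker touches every queue once, so the wasted work per empty pass grows linearly with the worker count.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

public static class StealDemo
{
    // N workers, each with a local queue, plus one shared global queue.
    const int Workers = 512;
    static readonly ConcurrentQueue<Action>[] Local =
        Enumerable.Range(0, Workers).Select(_ => new ConcurrentQueue<Action>()).ToArray();
    static readonly ConcurrentQueue<Action> Global = new ConcurrentQueue<Action>();

    // Mimics the Dequeue/TrySteal pattern from the stacks: check the local
    // queue, then the global queue, then scan every other worker's queue.
    // Returns how many queues were touched before giving up or finding work.
    public static int FindWork(int self, out Action work)
    {
        int touched = 1;
        if (Local[self].TryDequeue(out work)) return touched;
        touched++;
        if (Global.TryDequeue(out work)) return touched;
        for (int i = 0; i < Workers; i++)
        {
            if (i == self) continue;
            touched++;
            if (Local[i].TryDequeue(out work)) return touched; // "steal"
        }
        return touched; // empty-handed: every queue was touched once
    }

    public static void Main()
    {
        // With no queued work, one search pass touches Workers + 1 queues.
        Console.WriteLine(StealDemo.FindWork(0, out _)); // prints 513
    }
}
```

Multiply one empty pass by hundreds of simultaneously woken workers, each spinning through such passes before parking, and a brief all-core CPU burst with `Dequeue`/`TrySteal` at the top of the stacks is plausible even with no incoming requests.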
What might cause such bursts in the number of active worker threads? Can 16 CPU cores really be starved by 512 workers looking for work, or is this just a symptom of some other problem?
Attachments illustrate:

1) CPU samples distribution across all app threads' stacks
2) CPU samples distribution for a single, randomly chosen app thread's stack