I am profiling a scikit-learn model:
clf = GridSearchCV(..., n_jobs=-1)
%time clf.fit(X_train, y_train)
...
CPU times: user 2min 35s, sys: 3.07 s, total: 2min 38s
Wall time: 8min 40s
The wall time is significantly larger than the total CPU time.
Does this mean that scikit-learn is not fully utilizing the CPU? I have not explicitly started any programs on my PC other than Jupyter Notebook.
How can I increase the CPU priority of all the processes that scikit-learn has started?
OS: Kubuntu 22.04
A wall time much higher than the CPU time usually indicates an I/O bottleneck, but that should not happen when training a scikit-learn model on data you already have in memory. The next thing I would try is setting n_jobs to the number of physical CPU cores.
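As a rough sketch of that suggestion, the snippet below looks up the physical core count with joblib (which is installed alongside scikit-learn) and passes it as n_jobs. The synthetic dataset, the LogisticRegression estimator, and the parameter grid are placeholders I made up for illustration; they are not from the question and should be swapped for your own data and search space.

```python
from joblib import cpu_count  # joblib ships as a scikit-learn dependency
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Count physical cores only. n_jobs=-1 would use every logical core
# (hyper-threads included), which can oversubscribe the CPU.
n_physical = cpu_count(only_physical_cores=True)

# Hypothetical stand-in data; replace with the real X_train, y_train.
X_train, y_train = make_classification(
    n_samples=200, n_features=10, random_state=0
)

# Hypothetical estimator and grid, chosen only to keep the example small.
clf = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    cv=3,
    n_jobs=n_physical,
)
clf.fit(X_train, y_train)
print(f"n_jobs={n_physical}, best params: {clf.best_params_}")
```

In the notebook you can then rerun `%time clf.fit(X_train, y_train)` with this setting and compare the wall time against the n_jobs=-1 run. Whether it helps depends on the machine, so treat it as an experiment rather than a guaranteed fix.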