Why is it not allowed to use n_jobs with scikit-learn's Support Vector Regressor?


According to the documentation, SVR does not support the n_jobs parameter.

On the other hand, and again according to the docs, other types of regressor do support that parameter.

  1. Why is this?
  2. What is the best method to speed up a regressor in that case?

This is the error traceback I'm getting:

regressor = SVR(kernel='rbf', C=1, n_jobs=-1)
TypeError: __init__() got an unexpected keyword argument 'n_jobs'
Answer by Muhammed Yunus

I've got some suggestions in relation to your second question.

SVR's training time grows faster than linearly with the number of samples, so it becomes slow once you have tens of thousands of them. One option is to subsample your data down to a smaller, representative set using stratified sampling.
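A minimal sketch of that subsampling idea, assuming a continuous target: since stratification needs discrete classes, the target is first cut into quantile bins, and train_test_split then draws a subsample that preserves the bin proportions. The synthetic data, the 10 bins and the subsample size of 2,000 are illustrative assumptions, not values from the question.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic stand-in for a large dataset (assumption: 20,000 samples, 5 features).
rng = np.random.default_rng(0)
X = rng.normal(size=(20_000, 5))
y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=20_000)

# Stratification needs discrete labels, so cut the continuous target
# into quantile bins of roughly equal size.
n_bins = 10
edges = np.quantile(y, np.linspace(0, 1, n_bins + 1)[1:-1])
y_bins = np.digitize(y, edges)  # integer bin index in 0..n_bins-1

# Keep 2,000 samples whose target distribution mirrors the full data.
X_small, _, y_small, _ = train_test_split(
    X, y, train_size=2_000, stratify=y_bins, random_state=0
)

# Fit the original (exact) SVR on the much smaller subsample.
regressor = SVR(kernel='rbf', C=1).fit(X_small, y_small)
```

How small the subsample can be before accuracy suffers depends on the data, so it is worth checking the score on held-out samples for a few sizes.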

Look into scikit-learn-intelex: if it supports your platform, it can speed up training without changes to your model code.
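A hedged sketch of how the patching is typically enabled, assuming the optional package is installed (for example via pip install scikit-learn-intelex). The import is guarded so the snippet falls back to stock scikit-learn when the package is missing. Whether SVR is actually accelerated depends on the installed version, the hardware and the parameters used, so treat any speed-up as something to measure. The synthetic data is an illustrative assumption.

```python
# Assumption: scikit-learn-intelex may or may not be installed.
try:
    from sklearnex import patch_sklearn
    patch_sklearn()  # call this before importing the estimators you want accelerated
    patched = True
except ImportError:
    patched = False  # fall back to stock scikit-learn

import numpy as np
from sklearn.svm import SVR  # imported after patching on purpose

# Small synthetic dataset, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(2_000, 5))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=2_000)

# The model code is unchanged; only the implementation behind it may differ.
regressor = SVR(kernel='rbf', C=1).fit(X, y)
print('patched:', patched)
```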

Below I outline some options that approximate SVR and perform faster.

  2. What is the best method to speed up a regressor in that case?

To speed up support vector regression specifically, you could try an algorithm that approximates it with a linear model. The snippet below uses Nystroem to approximate the feature map of an RBF kernel, and feeds those approximate RBF features into a LinearSVR. LinearSVR scales much better with the number of samples than SVR does, which makes the pipeline faster on large datasets.

You can adjust these primary factors to trade accuracy for speed:

  • n_components= in Nystroem: fewer components run faster but give a coarser approximation of the kernel.
  • Replacing Nystroem with RBFSampler (the latter is generally faster but may be less accurate).
  • tol= in LinearSVR can be increased to trade accuracy for speed. The same applies to the regularisation parameters, such as C=.
  • Replacing LinearSVR with SGDRegressor(loss='epsilon_insensitive'). It might not converge as quickly, but it gives you another set of dials to tune.
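
# RBFSampler is imported as a drop-in alternative to Nystroem
from sklearn.kernel_approximation import Nystroem, RBFSampler
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVR

# Approximate an RBF-kernel SVR: map the inputs to approximate RBF features,
# then fit a linear SVR on those features.
rbf_svr_approx = make_pipeline(
    # n_jobs parallelises the kernel computation; random_state makes the example reproducible
    Nystroem(kernel='rbf', n_components=100, n_jobs=-1, random_state=0),
    LinearSVR(),
)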
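
As a usage sketch, the pipeline above and the two substitutions from the list can be fitted and scored side by side. The synthetic data, the train/test split, max_iter=10_000 and the random_state values are illustrative assumptions. The scores and timings you see will depend on your data, so none are quoted here.

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem, RBFSampler
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVR

# Synthetic non-linear regression problem (assumption, for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(5_000, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1]) + rng.normal(scale=0.05, size=5_000)
X_train, X_test = X[:4_000], X[4_000:]
y_train, y_test = y[:4_000], y[4_000:]

candidates = {
    # The pipeline from the answer, with a fixed seed.
    'Nystroem + LinearSVR': make_pipeline(
        Nystroem(kernel='rbf', n_components=100, random_state=0),
        LinearSVR(max_iter=10_000, random_state=0),
    ),
    # Random Fourier features instead of Nystroem.
    'RBFSampler + LinearSVR': make_pipeline(
        RBFSampler(n_components=100, random_state=0),
        LinearSVR(max_iter=10_000, random_state=0),
    ),
    # Stochastic gradient descent with the SVR loss instead of LinearSVR.
    'Nystroem + SGDRegressor': make_pipeline(
        Nystroem(kernel='rbf', n_components=100, random_state=0),
        SGDRegressor(loss='epsilon_insensitive', random_state=0),
    ),
}

scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # R^2 on held-out data
    print(f'{name}: R^2 = {scores[name]:.3f}')
```

Comparing the held-out scores against an exact SVR fitted on a subsample is a reasonable way to check how much accuracy each approximation gives up.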