I have observed a very strange behavior when tuning SVM parameters with caret. When training a single model without tuning, SVM with radial basis kernel takes more time than SVM with linear kernel, which is expected. However, when tuning SVM with both kernels over the same penalty grid, SVM with linear kernel takes substantially more time than SVM with radial basis kernel. This behavior can be easily reproduced in both Windows and Linux with R 3.2 and caret 6.0-47. Does anyone know why tuning the linear SVM takes so much more time than the radial basis kernel SVM?
SVM linear
user system elapsed
0.51 0.00 0.52
SVM radial
user system elapsed
0.85 0.00 0.84
SVM linear tuning
user system elapsed
129.98 0.02 130.08
SVM radial tuning
user system elapsed
2.44 0.05 2.48
The toy example code is below:
library(data.table)
library(kernlab)
library(caret)
n <- 1000
p <- 10
dat <- data.table(y = as.factor(sample(c('p', 'n'), n, replace = TRUE)))
dat[, (paste0('x', 1:p)) := lapply(1:p, function(x) rnorm(n, 0, 1))]
dat <- as.data.frame(dat)
sigmas <- sigest(as.matrix(dat[, -1]), na.action = na.omit, scaled = TRUE)
sigma <- mean(as.vector(sigmas[-2]))
cat('\nSVM linear\n')
print(system.time(fit1 <- train(y ~ ., data = dat, method = 'svmLinear', tuneLength = 1,
trControl = trainControl(method = 'cv', number = 3))))
cat('\nSVM radial\n')
print(system.time(fit2 <- train(y ~ ., data = dat, method = 'svmRadial', tuneLength = 1,
trControl = trainControl(method = 'cv', number = 3))))
cat('\nSVM linear tuning\n')
print(system.time(fit3 <- train(y ~ ., data = dat, method = 'svmLinear',
tuneGrid = expand.grid(C = 2 ^ seq(-5, 15, 5)),
trControl = trainControl(method = 'cv', number = 3))))
cat('\nSVM radial tuning\n')
print(system.time(fit4 <- train(y ~ ., data = dat, method = 'svmRadial',
tuneGrid = expand.grid(C = 2 ^ seq(-5, 15, 5), sigma = sigma),
trControl = trainControl(method = 'cv', number = 3))))
After taking a look, I don't believe the issue is with caret, but rather with what's going on behind (way behind) the scenes with kernlab. As has been stated elsewhere on Stack Overflow, SVM itself is an intensive algorithm; the time complexity of SVM is O(n*n). Now, this doesn't account for the difference between the SVM calls. What does seem to be happening, though, is that the time is being spent in the call to compiled C code, reached through a very deep stack ending in SVM > .Local > .Call (.Call being a call to compiled C code, and outside my knowledge base). Most of the time, when you see unexpectedly slow times moving from R to C, it's because of how things are passed. Since you're pulling in a matrix, this lends itself further to the assumption of a naming or dimensions issue causing some extra work on the other end. If we look at how this code is profiled, the bottleneck becomes pretty clear.
Apologies about the font size; it's a deep stack, and I think the overall shape tells the story more than the individual functions. Feel free to spam Ctrl + below.
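For anyone who wants to regenerate a profile like the ones described here, a minimal sketch with base R's Rprof might look like the following. It is an assumption on my part that calling ksvm directly with the vanilladot kernel is a fair stand-in for what caret's 'svmLinear' does under the hood; the data size and the value of C are arbitrary choices, and this is not necessarily the tool that produced the profiles in this answer.

```r
library(kernlab)

set.seed(1)
n <- 500   # illustrative size, smaller than the question's n = 1000
p <- 10
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- factor(sample(c("p", "n"), n, replace = TRUE))

# Sample the call stack every 10 ms while a single linear SVM is fitted.
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)
fit <- ksvm(x, y, kernel = "vanilladot", C = 2^5)  # C chosen arbitrarily
Rprof(NULL)

# Functions ranked by total time, including time spent in their callees.
prof <- summaryRprof(prof_file)
print(head(prof$by.total, 10))
```

If most of the total time sits under a single .Call entry, that would be consistent with the bottleneck being in compiled code rather than in the R wrappers.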
nSVM_linear looks like a healthy profile, with lots of friendly R functions. Same deal for nSVM radial.
Now, once we start with 'radial tuning', we start to see a flatter structure, with the try-call stacks starting to skew, but everything still seems to be executing quickly.

Whoa. Completely different structure for linear tuning, with C calls taking over 100 seconds in some cases.

So, that being said, it looks like your bottleneck is in the compiled C code from kernlab. Since the package is connecting to libsvm, which seems to be pretty efficient, I can't imagine there is an actual issue with the code being called. Actually identifying how (a safety-based feature or an input issue from R) and why the issue occurs when moving from one to the other is a job for someone better than I.