I have observed some very strange behavior when tuning SVM parameters with caret. When training a single model without tuning, SVM with a radial basis kernel takes more time than SVM with a linear kernel, which is expected. However, when tuning both kernels over the same penalty grid, the linear SVM takes substantially more time than the radial basis SVM. This behavior is easily reproduced on both Windows and Linux with R 3.2 and caret 6.0-47. Does anyone know why tuning the linear SVM takes so much more time than the radial basis kernel SVM?
SVM linear
user system elapsed
0.51 0.00 0.52
SVM radial
user system elapsed
0.85 0.00 0.84
SVM linear tuning
user system elapsed
129.98 0.02 130.08
SVM radial tuning
user system elapsed
2.44 0.05 2.48
The toy example code is below:
library(data.table)
library(kernlab)
library(caret)
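# Simulate a toy binary classification problem: n observations of p standard-normal predictors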
n <- 1000
p <- 10
dat <- data.table(y = as.factor(sample(c('p', 'n'), n, replace = T)))
dat[, (paste0('x', 1:p)) := lapply(1:p, function(x) rnorm(n, 0, 1))]
dat <- as.data.frame(dat)
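# sigest() returns the 0.1, 0.5 and 0.9 quantiles of candidate sigma values;
# averaging the first and third gives one fixed sigma for the radial kernel
# (mirroring caret's default for svmRadial)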
sigmas <- sigest(as.matrix(dat[, -1]), na.action = na.omit, scaled = TRUE)
sigma <- mean(as.vector(sigmas[-2]))
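# Time a single fit of each kernel (tuneLength = 1), then tuning over the same C grid, all with 3-fold CV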
cat('\nSVM linear\n')
print(system.time(fit1 <- train(y ~ ., data = dat, method = 'svmLinear', tuneLength = 1,
                                trControl = trainControl(method = 'cv', number = 3))))
cat('\nSVM radial\n')
print(system.time(fit2 <- train(y ~ ., data = dat, method = 'svmRadial', tuneLength = 1,
                                trControl = trainControl(method = 'cv', number = 3))))
cat('\nSVM linear tuning\n')
print(system.time(fit3 <- train(y ~ ., data = dat, method = 'svmLinear',
                                tuneGrid = expand.grid(C = 2 ^ seq(-5, 15, 5)),
                                trControl = trainControl(method = 'cv', number = 3))))
cat('\nSVM radial tuning\n')
print(system.time(fit4 <- train(y ~ ., data = dat, method = 'svmRadial',
                                tuneGrid = expand.grid(C = 2 ^ seq(-5, 15, 5), sigma = sigma),
                                trControl = trainControl(method = 'cv', number = 3))))
After taking a look, I don't believe the issue is with caret, but rather with what's going on behind (way behind) the scenes in kernlab. As has been stated elsewhere on Stack Overflow, SVM itself is an intensive algorithm; training is roughly O(n^2) in the number of samples. That alone doesn't account for the difference between the SVM calls, though. What does seem to be happening is that most of the time is spent after the call into compiled C code, reached through a very deep stack ending in SVM > .Local > .Call (.Call being a call to compiled C code, and out of my knowledge base). Most of the time when you see unexpectedly slow times moving from R to C, it's because of how things are passed. Since you're passing in a matrix, this further suggests a naming or dimensions issue causing some extra work on the other end. If we look at how this code is profiled, the bottleneck becomes pretty clear.
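For anyone who wants to collect a profile like this themselves, here is a minimal sketch using base R's Rprof and summaryRprof. It assumes the dat data frame from the question is still in the workspace, and the output file name is just a placeholder.

library(caret)

# Profile only the slow case: tuning the linear SVM over the C grid.
Rprof("svm_linear_tuning.out")   # arbitrary output file name
fit_profiled <- train(y ~ ., data = dat, method = 'svmLinear',
                      tuneGrid = expand.grid(C = 2 ^ seq(-5, 15, 5)),
                      trControl = trainControl(method = 'cv', number = 3))
Rprof(NULL)

# by.total attributes time to every function on the stack (including callees),
# so the deep .Call frames into kernlab's compiled code surface near the top.
head(summaryRprof("svm_linear_tuning.out")$by.total, 20)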
Apologies about the font size in the profile screenshots -- it's a deep stack, and I think the overall shape tells the story more than the individual functions. Feel free to zoom in (Ctrl and +) on the images below.

SVM linear looks like a healthy profile, with lots of friendly R functions. Same deal for SVM radial.

Once we start with radial tuning we see a flatter structure, with the try-call stacks starting to skew, but everything still executes quickly.

Whoa. Completely different structure for linear tuning: C calls taking over 100 seconds in some cases.

So, that being said, it looks like your bottleneck is in the compiled C code from kernlab. Since the package connects to libsvm, which seems to be pretty efficient, I can't imagine there's an actual issue with the code being called. Actually identifying how (a safety-based feature, or an input issue from R) and why the problem occurs when moving from one kernel to the other is a job for someone better than I.
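One rough way to double-check that caret itself isn't adding the overhead is to call kernlab::ksvm directly over the same C grid and compare the two kernels. This is only a sketch, again assuming dat from the question is in the workspace; time_ksvm is just an illustrative helper, and the large-C linear fits are exactly the ones expected to be slow.

library(kernlab)

# Time a single ksvm fit for a given kernel and cost C, bypassing caret entirely.
# (time_ksvm is an ad-hoc helper, not part of kernlab.)
time_ksvm <- function(kern, C) {
  system.time(ksvm(y ~ ., data = dat, kernel = kern, C = C))[["elapsed"]]
}

Cs <- 2 ^ seq(-5, 15, 5)

# If the vanilladot fits blow up at large C while the rbfdot fits stay fast,
# the extra time is being spent inside kernlab/libsvm's optimizer rather than
# in caret's resampling loop.
data.frame(C = Cs,
           linear = sapply(Cs, time_ksvm, kern = 'vanilladot'),
           radial = sapply(Cs, time_ksvm, kern = 'rbfdot'))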