Understanding a linear classification SVM


I'm trying to understand how a linear classification SVM works (as in the one used for HOG people detection). I feel that I'm missing an essential part, and I'm afraid I can't find a clear description to understand it better. I know there are ready-to-use implementations, and in the end I will probably use one of them, but I would like to understand what I'm doing.

As far as I understand, an SVM gets trained with a number of feature vectors and their correct classifications. After training, the SVM is fully defined as a set of hyperplanes (with the number of dimensions being the length of the feature vector), typically a very small number of them. I would expect (naively?) my trained SVM to be something like:

a·x >= b: 0
a·x < b: 1

where x is the feature vector and a·x = b is the hyperplane. Here I run into problems because:

  1. I don't understand how, in the aforementioned paper, they end up with a trained SVM that is 1.7 GB. Mine would be something like 64 bits per float × (length of the feature vector + 1); see the sketch after this list.
  2. Classifying with this SVM is trivial: one dot product and one comparison. Even though I can't find much information on how long classification with an SVM takes, people seem to be looking for fast implementations.
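
A back-of-the-envelope sketch of both points, assuming the 3780-dimensional HOG descriptor of a 64x128 detection window (the dimensionality and the names are my assumptions, not from the paper):

```python
import numpy as np

# Point 1: size of a trained *linear* SVM for a HOG-sized feature vector.
feature_len = 3780                   # assumed HOG descriptor length
model_bytes = 8 * (feature_len + 1)  # 64-bit floats: weight vector + bias
print(model_bytes)                   # 30248 bytes, ~30 KB -- nowhere near 1.7 GB

# Point 2: classification is one dot product and one comparison.
def classify(x, a, b):
    """Decision rule from above: a.x >= b -> class 0, otherwise class 1."""
    return 0 if np.dot(a, x) >= b else 1
```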

I'm sure that at some point I misunderstood what I read, but I would like to know where I went wrong in my thinking. I guess I'm just stuck in the wrong mindset, because the more I read about SVMs, the more I see the above description confirmed, and this just can't be right.

There are 2 answers below.

BEST ANSWER

It looks like in the paper they needed 1.7 GB of RAM to train the classifier. To do that they had to load about 14,000 64x128 RGB image patches, which comes to roughly 1.5 GB when stored as integers.

Once the classifier is computed, you are right: only one weight vector is needed to check on which side of the hyperplane a given sample lies.
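
A rough check of that arithmetic (assuming each channel value is stored as a 4-byte integer, which is my assumption, not stated in the paper):

```python
# 14000 patches of 64x128 pixels, 3 channels, 4 bytes per value.
patches, width, height, channels = 14000, 64, 128, 3
bytes_per_value = 4
total_bytes = patches * width * height * channels * bytes_per_value
print(total_bytes / 1e9)   # ~1.38 GB, in the ballpark of the ~1.5 GB estimate
```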

ANOTHER ANSWER

A linear SVM is a special case of the general soft-margin kernel SVM in which the model can be expressed as a single weight vector w and a bias b, so that classification is done by y = w'*x + b (this is the decision value; you can threshold it at zero, as you did, or choose another value to trade off precision and recall). In the general case, the model is described by its support vectors and their weights, so its size depends on the number of SVs that were found, and may be quite large.
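
To make the difference concrete, here is a minimal sketch of the two equivalent decision functions for a linear kernel (the variable names are illustrative, not tied to any particular library):

```python
import numpy as np

def decision_kernel_form(x, sv, alpha, b):
    # General SVM: weighted sum of kernel evaluations against every SV.
    # sv: (n_sv, d) support vectors, alpha: (n_sv,) signed SV weights, b: bias.
    # With a linear kernel, k(sv_i, x) is just the dot product sv_i . x.
    return np.sum(alpha * (sv @ x)) + b

def decision_linear_form(x, w, b):
    # Collapsed linear model, where w = sum_i alpha_i * sv_i:
    # classification is a single dot product.
    return w @ x + b
```

The kernel form costs O(n_sv * d) per sample while the collapsed form costs O(d), which is why the conversion described below matters for speed.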

Some software libraries, like libsvm, do not have specialized code for linear SVM classification, so the model representation and classification function are inefficient in terms of memory and running time. You can, however, easily convert the model to the linear classifier above by calculating the weighted sum of all the SVs, using the stored SV weights. You can then write your own linear classification function, using whatever means you want to make it run fast (e.g. vectorized code).
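
A sketch of that conversion, using scikit-learn's SVC (a libsvm wrapper) purely for illustration, with made-up synthetic data:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = (X[:, 0] > 0).astype(int)

clf = SVC(kernel="linear").fit(X, y)

# dual_coef_ holds alpha_i * y_i for each support vector, so the weighted
# sum of the SVs collapses the model to a single primal weight vector w.
w = clf.dual_coef_ @ clf.support_vectors_   # shape (1, n_features)
b = clf.intercept_

x = X[0]
fast = w @ x + b                    # one dot product
slow = clf.decision_function([x])   # SV-based evaluation
assert np.allclose(fast, slow)
```

For linear kernels scikit-learn already exposes this collapsed vector as clf.coef_; the manual sum above just shows what a library-agnostic conversion looks like.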