I am trying to build a classifier with SVM light that assigns a document to one of two classes. I have already trained and tested the classifier, and the model file is saved to disk. Now I want to use this model file to classify completely new documents. What should the input file format be? Could it be a plain text file (I don't think that would work), or a plain listing of the features present in the text, without any class label or feature weights (in which case I would have to keep track of the feature indices used in the feature vector during training), or is it some other format?
File format for classification using SVM light
Asked by ritesh

The file format for making predictions is the same as the one used for training and testing, i.e.
<line> .=. <target> <feature>:<value> ... <feature>:<value> # <info>
<target> .=. +1 | -1 | 0 | <float>
<feature> .=. <integer> | "qid"
<value> .=. <float>
<info> .=. <string>
But when making predictions the target is unknown, so you have to use 0 as the target value. This is the only difference. I hope this helps someone.
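As a minimal sketch of how such a prediction file could be produced, assume the features are word counts and that the word-to-index mapping from training was saved. The names `vocab` and `to_svmlight_line` are hypothetical and not part of SVM light. The feature indices must be the ones used during training and are written in increasing order; words that were not seen in training are simply dropped.

```python
from collections import Counter


def to_svmlight_line(tokens, vocab, target=0):
    """Format one document as an SVM-light line.

    tokens: list of words in the document.
    vocab:  dict mapping word -> feature index used at training time (1-based).
    target: placeholder label; 0 when the true class is unknown.
    """
    # Count only words that have a feature index from training.
    counts = Counter(t for t in tokens if t in vocab)
    # SVM-light expects feature indices in increasing order.
    pairs = sorted((vocab[word], n) for word, n in counts.items())
    features = " ".join(f"{idx}:{n}" for idx, n in pairs)
    return f"{target} {features}".rstrip()


# Hypothetical vocabulary saved from training (an assumption for illustration).
vocab = {"cheap": 1, "meeting": 2, "offer": 3}
line = to_svmlight_line("cheap offer cheap unknownword".split(), vocab)
print(line)  # 0 1:2 3:1
```

One such line per new document goes into the example file, which is then passed to the classifier as `svm_classify example_file model_file output_file`.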
Training and testing files must have the same format. Each instance becomes one line of the following form:
<line> .=. <target> <feature>:<value> ... <feature>:<value> # <info>
For example (copied from the SVM^light website):
-1 1:0.43 3:0.12 9284:0.2 # abcdef
You can consult the SVM^light website for more information.
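To check that a file follows this format before handing it to SVM light, a line can be split back into its parts. The sketch below is only an illustration: `parse_svmlight_line` is a hypothetical helper, not part of SVM light, and the sample line is the example given in the SVM^light documentation.

```python
def parse_svmlight_line(line):
    """Split one SVM-light line into (target, features, info)."""
    # Everything after '#' is the free-form info string.
    body, _, info = line.partition("#")
    fields = body.split()
    target = float(fields[0])
    features = {}
    for field in fields[1:]:
        key, _, value = field.partition(":")
        if key == "qid":
            # Query id used in ranking mode; not a numeric feature, skipped here.
            continue
        features[int(key)] = float(value)
    return target, features, info.strip()


target, features, info = parse_svmlight_line("-1 1:0.43 3:0.12 9284:0.2 # abcdef")
print(target, features, info)
```

Here feature 1 has the value 0.43, feature 3 the value 0.12, feature 9284 the value 0.2, and every feature not listed is treated as 0.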