I am trying to do text classification using the Weka library in my Java program, but I have run into a problem.
This is my training data. There are five instances and two classes:
@relation hamspam
@attribute text string
@attribute class {ham,spam}
@data
'good',ham
'very good',ham
'bad',spam
'very bad',spam
'very bad, very bad',spam
This is my testing data. There are three instances:
@relation hamspam
@attribute text string
@attribute class {ham,spam}
@data
'good bad very bad',?
'good good good',?
'good very good',?
And this is my code:
public static void loader() throws FileNotFoundException, IOException, Exception {
    // filter
    StringToWordVector filter = new StringToWordVector();
    Classifier j48tree = new J48();

    // training data
    Instances train = new Instances(new BufferedReader(new FileReader("D:/trainingdata.arff")));
    int lastIndex = train.numAttributes() - 1;
    train.setClassIndex(lastIndex);
    filter.setInputFormat(train);
    train = Filter.useFilter(train, filter);

    // testing data
    Instances test = new Instances(new BufferedReader(new FileReader("D:/testingdata.arff")));
    test.setClassIndex(lastIndex);
    filter.setInputFormat(test);
    test = Filter.useFilter(test, filter);

    j48tree.buildClassifier(train);
    for (int i = 0; i < test.numInstances(); i++) {
        double index = j48tree.classifyInstance(test.instance(i));
        String className = train.attribute(lastIndex).value((int) index);
        System.out.println(className);
    }
}
I am trying to predict the class name and print it, but the class name does not appear. What is wrong with my code?
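One thing I suspect, but have not confirmed: lastIndex is computed before filtering, and as far as I can tell StringToWordVector keeps the attributes it does not convert (here, the class) at the front and appends the word attributes after them. If so, lastIndex no longer points at the class attribute once train has been filtered. The plain-Java sketch below is only a toy model of that reordering, not Weka code. The class name AttributeOrderSketch and the method filteredAttributeNames are made up for illustration, and the alphabetical word order is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class AttributeOrderSketch {

    // Toy model of a string-to-word-vector step (NOT Weka's implementation):
    // attributes that are not converted are kept first, followed by one
    // attribute per distinct word found in the texts.
    public static List<String> filteredAttributeNames(List<String> keptAttributes, List<String> texts) {
        TreeSet<String> words = new TreeSet<>();
        for (String text : texts) {
            for (String token : text.toLowerCase().split("[^a-z]+")) {
                if (!token.isEmpty()) {
                    words.add(token);
                }
            }
        }
        List<String> names = new ArrayList<>(keptAttributes);
        names.addAll(words);
        return names;
    }

    public static void main(String[] args) {
        List<String> texts = Arrays.asList(
                "good", "very good", "bad", "very bad", "very bad, very bad");

        // Before filtering the attributes are [text, class], so the class is at index 1.
        int lastIndex = 2 - 1;

        List<String> after = filteredAttributeNames(Arrays.asList("class"), texts);
        System.out.println("attributes after filtering: " + after);
        System.out.println("attribute at old lastIndex: " + after.get(lastIndex));
        System.out.println("index of class attribute:   " + after.indexOf("class"));
    }
}
```

In this model the old lastIndex lands on a word attribute instead of the class. If the same thing happens in Weka, then train.attribute(lastIndex) is a numeric word attribute with no labels to print, and looking the label up with train.classAttribute().value((int) index) would avoid depending on a stored position.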