Different levels of accuracy for three classes in naive Bayes


I am new to machine learning. I am classifying tweets into three classes (information, neutral, and metaphor) with a naive Bayes classifier using term-frequency features. My data are balanced across the three classes, and each class is split 70%-30% into training and test sets. Yet the per-class accuracy differs widely: class one (information) is high at 92%, class two (neutral) is medium at 53%, and class three (metaphor) is very low at 15%. Can anyone tell me what might be wrong with my approach?
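For context, the per-class accuracies reported above correspond to per-class recall read off a confusion matrix (correct predictions for a class divided by the number of actual examples of that class). A minimal sketch in base R, with made-up counts chosen only to reproduce the reported percentages, not the asker's actual data:

```r
# Hypothetical 3x3 confusion matrix (rows = actual class, cols = predicted class);
# the counts below are illustrative, not real results.
conf <- matrix(c(92,  5,  3,
                 30, 53, 17,
                 60, 25, 15),
               nrow = 3, byrow = TRUE,
               dimnames = list(actual    = c("information", "neutral", "metaphor"),
                               predicted = c("information", "neutral", "metaphor")))

# Per-class accuracy (recall): diagonal over row sums
per_class <- diag(conf) / rowSums(conf)
round(per_class, 2)   # 0.92 0.53 0.15
```

If the per-class numbers were computed some other way, the diagnosis may differ, but a confusion matrix like this is usually the first thing to inspect: it shows which classes metaphor tweets are being misassigned to.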

```r
# Create term-document matrices (transposed so that documents are rows, terms are columns)
tweets.information.matrix <- t(TermDocumentMatrix(tweets.information.corpus, control = list(wordLengths = c(4, Inf))))
tweets.metaphor.matrix    <- t(TermDocumentMatrix(tweets.metaphor.corpus,    control = list(wordLengths = c(4, Inf))))
tweets.neutral.matrix     <- t(TermDocumentMatrix(tweets.neutral.corpus,     control = list(wordLengths = c(4, Inf))))
tweets.test.matrix        <- t(TermDocumentMatrix(tweets.test.corpus,        control = list(wordLengths = c(4, Inf))))
```
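The snippet above relies on the `tm` package. To show the shape of data it produces without that dependency, here is a toy document-term matrix built in base R from hand-tokenized documents (the words are made up for illustration):

```r
# Toy equivalent of a document-term matrix in base R (no tm dependency):
# rows = documents, columns = terms, cells = term frequencies.
docs  <- list(c("economy", "growth", "rate"),
              c("growth", "growth", "market"))
terms <- sort(unique(unlist(docs)))
dtm   <- t(sapply(docs, function(d) table(factor(d, levels = terms))))
colnames(dtm) <- terms
dtm
```

Comparing `ncol()` (vocabulary size) and `sum()` (total token count) of the three per-class matrices is worth doing here: if the metaphor matrix has a much smaller or sparser vocabulary than the information matrix, its estimated term probabilities will be far less reliable, which would explain the accuracy gap.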

and this is how the per-class term probabilities are calculated:

```r
probabilityMatrix <- function(docMatrix)
{
  # Sum up the term frequencies across all documents
  termSums <- cbind(colnames(as.matrix(docMatrix)), as.numeric(colSums(as.matrix(docMatrix))))
  # Add one to every count (Laplace smoothing)
  termSums <- cbind(termSums, as.numeric(termSums[, 2]) + 1)
  # Calculate the probabilities (smoothed relative frequencies)
  termSums <- cbind(termSums, as.numeric(termSums[, 3]) / sum(as.numeric(termSums[, 3])))
  # Calculate the natural log of the probabilities
  termSums <- cbind(termSums, log(as.numeric(termSums[, 4])))
  # Add descriptive names to the columns
  colnames(termSums) <- c("term", "count", "additive", "probability", "lnProbability")
  termSums
}
```
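The function above does four things: column sums, add-one (Laplace) smoothing, normalization, and a log transform. The same computation spelled out on a toy count vector (term names are made up for illustration):

```r
# Add-one (Laplace) smoothing on a toy term-count vector, step by step.
counts   <- c(rain = 2, sun = 0, wind = 1)
smoothed <- counts + 1               # add one to every term count
prob     <- smoothed / sum(smoothed) # smoothed relative frequencies
lnProb   <- log(prob)                # natural-log probabilities
round(prob, 3)                       # rain 0.500, sun 0.167, wind 0.333
```

Note that this only smooths terms that appear in the class's training vocabulary; a test-time term absent from a class's matrix gets no probability at all, and how such unseen terms are handled at classification time can strongly skew results toward the class with the largest vocabulary.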

Results for information are highly accurate, for neutral moderately accurate, and for metaphor very poor.
