Accuracy of lexicon-based sentiment analysis


I'm applying different sentiment analysis techniques to a set of Twitter data I have acquired. They are lexicon-based (VADER Sentiment and SentiWordNet) and as such require no pre-labelled data.

I was wondering whether there is a method (like F-score or ROC/AUC) to measure the accuracy of these classifiers. Most of the methods I know require a labelled target to compare the results against.


There are two answers below.


The short answer is no, I don't think so. (So, I'd be very interested if someone else posts a method.)

With some unsupervised machine learning techniques you can get some measurement of error. E.g. an autoencoder gives you an MSE (representing how accurately the lower-dimensional representation can be reconstructed back to the original higher-dimensional form).

But for sentiment analysis, all I can think of is to run multiple algorithms and measure agreement between them on the same data. Where they all agree on a particular sentiment, you mark it as a more reliable prediction; where they disagree, you mark it as unreliable. (This relies on none of the algorithms sharing the same biases, which is probably unrealistic.)
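The agreement idea above can be sketched in a few lines. This is a minimal, self-contained illustration: the two tiny hand-made lexicons are hypothetical stand-ins for real scorers such as VADER or SentiWordNet, and the scoring rule (sum of word polarities, take the sign) is a deliberate simplification.

```python
# Toy stand-ins for two real lexicon-based scorers (hypothetical word lists).
LEXICON_A = {"love": 1, "great": 1, "hate": -1, "awful": -1}
LEXICON_B = {"love": 1, "good": 1, "hate": -1, "bad": -1}

def score(tweet, lexicon):
    """Sum the polarities of known words; the sign is the predicted label."""
    total = sum(lexicon.get(w, 0) for w in tweet.lower().split())
    return 1 if total > 0 else -1 if total < 0 else 0

def agreement(tweets, lexicons):
    """Mark a tweet reliable only when every scorer predicts the same label."""
    results = []
    for t in tweets:
        labels = [score(t, lex) for lex in lexicons]
        reliable = all(label == labels[0] for label in labels)
        results.append((t, labels[0], reliable))
    return results

tweets = ["I love this", "what an awful day", "I love this bad movie"]
for tweet, label, reliable in agreement(tweets, [LEXICON_A, LEXICON_B]):
    print(tweet, label, "reliable" if reliable else "unreliable")
```

With real classifiers you would replace `score` with calls to each library and keep only the tweets where the predicted labels coincide, treating the rest as low-confidence.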

The usual approach is to label some percentage of your data and assume/hope it is representative of the whole dataset.


What I did for my research was to take a small random sample of the tweets and manually label them as either positive or negative. You can then calculate the normalized scores using VADER or SentiWordNet and compute the confusion matrix for each, which will give you your F-score, etc.
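Once you have a hand-labelled sample, the confusion matrix and F-score can be computed directly. Here is a minimal sketch, assuming binary labels (1 = positive, -1 = negative); the `manual` and `predicted` lists are made-up example data, not real results.

```python
def confusion(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) counts for the given positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

def f_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall (0.0 if undefined)."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical hand labels and the labels a lexicon classifier assigned
# to the same five tweets.
manual = [1, 1, -1, -1, 1]
predicted = [1, -1, -1, -1, 1]
print(confusion(manual, predicted))  # (2, 0, 1, 2)
print(f_score(manual, predicted))    # 0.8
```

In practice you would get `predicted` by thresholding each classifier's normalized score (e.g. VADER's compound score at 0) and run the same computation once per classifier to compare them on identical data.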

This may not be a particularly good test, though, as it depends on the sample of tweets you use. For example, you may find that SentiWordNet classifies more things as negative than VADER, and thus appears to have the higher accuracy if your random sample is mostly negative.