Why does averaging VADER's and TextBlob's sentiment polarity give me a more accurate result?

I have a manually labelled set of ~120K tweets. If I use VADER's compound score, it matches the manual labelling for only ~24% of the records; TextBlob's polarity matches ~35% of the manually labelled records. If I add VADER's compound score and TextBlob's score together and divide by 2, the resulting sentiment matches the manual labelling ~70% of the time. Is there a reason why the average is more accurate, or is it just coincidence?

There is 1 best solution below

I think you're stumbling upon the idea behind ensemble learning. More often than not, combining the predictions of multiple models leads to better results than any single model gives, because the models' errors partly cancel out when they are not strongly correlated. Your implementation could be thought of as an equally weighted soft-voting ensemble: you average the two scores instead of taking a majority vote over labels. For more examples and additional implementations, the scikit-learn VotingClassifier docs are great.
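To make the averaging concrete, here is a minimal sketch of an equally weighted score average, which is what the question describes. The names `ensemble_score` and `to_label` are made up for illustration, and the ±0.05 cut-off is an assumption (it is the threshold commonly used with VADER's compound score, not something stated in the question). The scorers are passed in as plain functions so the sketch runs without either library installed.

```python
from typing import Callable, Optional, Sequence

# A scorer maps a text to a polarity in [-1, 1].
Scorer = Callable[[str], float]


def ensemble_score(
    text: str,
    scorers: Sequence[Scorer],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted mean of the scorers' polarities (equal weights by default)."""
    if not scorers:
        raise ValueError("at least one scorer is required")
    if weights is None:
        weights = [1.0] * len(scorers)
    if len(weights) != len(scorers):
        raise ValueError("weights and scorers must have the same length")
    total = sum(weights)
    return sum(w * s(text) for w, s in zip(weights, scorers)) / total


def to_label(score: float, threshold: float = 0.05) -> str:
    """Map a polarity to a label; the threshold is an assumed cut-off."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


# With the real libraries installed, the scorers could be wired up like this:
#   from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
#   from textblob import TextBlob
#   vader = SentimentIntensityAnalyzer()
#   scorers = [
#       lambda t: vader.polarity_scores(t)["compound"],
#       lambda t: TextBlob(t).sentiment.polarity,
#   ]
#   label = to_label(ensemble_score("some tweet text", scorers))
```

Passing `weights` lets you check whether an unequal weighting matches your manual labels better than the plain average. It is also worth tuning the threshold separately for each single model before comparing, since the cut-off used to turn a score into a label can change the match rate a lot.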