sentimentr - different results for different text partitioning


Using sentimentr to analyse the text:

I haven’t been sad in a long time. I am extremely happy today. It’s a good day.

I first partitioned the text sentence by sentence:

library(sentimentr)

ase1 <- c(
  "I haven't been sad in a long time.",
  "I am extremely happy today.",
  "It's a good day."
)

part1 <- get_sentences(ase1)
sentiment(part1)

   element_id sentence_id word_count sentiment
1:          1           1          8 0.1767767
2:          2           1          5 0.6037384
3:          3           1          4 0.3750000

Then I passed the same text as a single block:

ase2 <- c(
  "I haven’t been sad in a long time. I am extremely happy today. It’s a good day.")

part2 <- get_sentences(ase2)
sentiment(part2)

   element_id sentence_id word_count   sentiment
1:          1           1          9 -0.03333333
2:          1           2          5  0.60373835
3:          1           3          5  0.33541020

The text is the same, yet the word counts and the sentiment scores differ.

Why does this happen?

Best answer

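The text is not exactly the same. In the first example you use the straight apostrophe ('), but in the second you use the typographic (curly) apostrophe (’). These are different characters (U+0027 and U+2019), and text-mining tools treat them differently. Note that with the curly apostrophe the word count rises from 8 to 9 for the first sentence and from 4 to 5 for the third.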
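
One quick way to confirm that the two apostrophes really are different characters is to compare their code points in base R. This is only an illustrative check; the variable names `straight` and `curly` are made up for the sketch.

```r
# Straight apostrophe (as typed in ase1) vs. typographic apostrophe (as in ase2).
straight <- "'"
curly <- "\u2019"

utf8ToInt(straight)  # 39   (U+0027)
utf8ToInt(curly)     # 8217 (U+2019)

# The two spellings are different strings, so any exact string matching
# treats them as different words.
identical("haven't", "haven\u2019t")  # FALSE
```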

With straight apostrophes, the single-block version returns the same results as your first example:

ase2 <- c(
  "I haven't been sad in a long time. I am extremely happy today. It's a good day.")

part2 <- get_sentences(ase2)
sentiment(part2)
   element_id sentence_id word_count sentiment
1:          1           1          8 0.1767767
2:          1           2          5 0.6037384
3:          1           3          4 0.3750000
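
If the input comes from a source that produces typographic quotes (a word processor or a web page, for example), one option is to normalise them before calling get_sentences(). Below is a minimal sketch using base R's gsub(); `normalise_quotes` is a hypothetical helper name, not part of sentimentr.

```r
# Hypothetical helper: map typographic single quotes (U+2018, U+2019)
# to the ASCII apostrophe (U+0027).
normalise_quotes <- function(x) {
  gsub("[\u2018\u2019]", "'", x)
}

# The single-block text from the question, with curly apostrophes.
ase2 <- "I haven\u2019t been sad in a long time. I am extremely happy today. It\u2019s a good day."

ase2_clean <- normalise_quotes(ase2)
ase2_clean
# "I haven't been sad in a long time. I am extremely happy today. It's a good day."

# With sentimentr loaded, the cleaned text can then be scored as before:
# sentiment(get_sentences(ase2_clean))
```

Cleaning the text once, up front, keeps the rest of the pipeline unchanged and makes the results independent of where the text was copied from.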