Change value of words in bing lexicon

I'm analyzing a survey in RStudio, using the Bing sentiment lexicon from the tidytext package.

Some words don't have the right meaning for my survey, specifically 'tender' is coded as positive, but my respondents mean 'tender' as a negative (pain). I know how to remove a word from the bing tibble, and add a new one, but how can I simply change the meaning of the word?

For example:

structure(list(word = c("pain", "tender", "sensitive", "headaches", 
"like", "anxiety"), sentiment = c("negative", "positive", "positive", 
"negative", "positive", "negative"), n = c(351L, 305L, 279L, 
220L, 200L, 196L)), row.names = c(NA, 6L), class = "data.frame")

I want it to look like:

structure(list(word = c("pain", "tender", "sensitive", "headaches", 
"like", "anxiety"), sentiment = c("negative", "negative", "positive", 
"negative", "positive", "negative"), n = c(351L, 305L, 279L, 
220L, 200L, 196L)), row.names = c(NA, 6L), class = "data.frame")

Thank you!

There are 2 best solutions below

Phil On BEST ANSWER

Running the line

df$sentiment <- ifelse(df$word == "tender", "negative", df$sentiment)

changes the sentiment column to "negative" in every row where the word column is "tender". All other rows are left as they are.

If there are other words whose sentiment you also want to change to negative, you can do:

df$sentiment <- ifelse(df$word %in% c("tender", "anotherword", "etc"), "negative", df$sentiment)
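
If different words need different target sentiments, a named lookup vector is one option in base R. The following is only a minimal sketch, assuming the data frame from the question; the `overrides` vector and its entries are illustrative and not part of the original answer.

```r
# Example data from the question
df <- data.frame(
  word = c("pain", "tender", "sensitive", "headaches", "like", "anxiety"),
  sentiment = c("negative", "positive", "positive",
                "negative", "positive", "negative"),
  n = c(351L, 305L, 279L, 220L, 200L, 196L),
  stringsAsFactors = FALSE
)

# Hypothetical overrides: names are the words, values are the sentiment to assign
overrides <- c(tender = "negative", sensitive = "negative")

# Find the rows whose word has an override, then replace only those rows
hit <- df$word %in% names(overrides)
df$sentiment[hit] <- unname(overrides[df$word[hit]])

df
```

Words that are not listed in `overrides` keep their original sentiment, so the vector can grow as more survey-specific meanings turn up.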

JBGruber On

The way to do this kind of recoding in the tidyverse (on which tidytext builds) is usually:

library(tidyverse)
  
df %>% 
  mutate(sentiment = case_when(
    word == "tender" ~ "negative",
    TRUE ~ sentiment # means leave if none of the conditions are met
  ))
#>        word sentiment   n
#> 1      pain  negative 351
#> 2    tender  negative 305
#> 3 sensitive  positive 279
#> 4 headaches  negative 220
#> 5      like  positive 200
#> 6   anxiety  negative 196

case_when follows the same logic as ifelse, but you can evaluate as many conditions as you want, which makes it well suited to recoding several values at once. The left side of each ~ is a condition, and the right side is the value to use when that condition is met. Conditions are checked in order, so the first one that matches wins. You can set a default as shown in the last line inside case_when.
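
The same recoding can also be applied to the lexicon itself, before it is joined to the survey text, so that every later count uses the corrected meanings. This is a rough sketch rather than a definitive recipe: `lexicon` is a small stand-in with the same `word` and `sentiment` columns that `tidytext::get_sentiments("bing")` returns, while `tokens`, `custom_lexicon`, and the choice to treat "sensitive" as negative are assumptions made purely for illustration.

```r
library(dplyr)

# Small stand-in for the lexicon; with tidytext installed this could be
# tidytext::get_sentiments("bing"), which has the same two columns
lexicon <- tibble(
  word = c("pain", "tender", "sensitive", "like"),
  sentiment = c("negative", "positive", "positive", "positive")
)

# Recode the lexicon once, before joining it to any survey text
custom_lexicon <- lexicon %>%
  mutate(sentiment = case_when(
    word %in% c("tender", "sensitive") ~ "negative", # hypothetical overrides
    TRUE ~ sentiment                                 # keep every other word as is
  ))

# Hypothetical tokenized responses, one word per row
tokens <- tibble(word = c("tender", "like", "tender", "pain"))

# Count words by their corrected sentiment
counts <- tokens %>%
  inner_join(custom_lexicon, by = "word") %>%
  count(word, sentiment, sort = TRUE)

counts
```

Recoding the lexicon rather than the summary table means the fix does not have to be repeated each time the counts are recomputed.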