Tokenization of the data
library(dplyr)
library(tidytext)

tidy_text <- data %>%
  unnest_tokens(word, q_content)
Removal of stop words
data("stop_words")
stop_words
tidy_text <- tidy_text %>%
  anti_join(stop_words, by = "word")
tidy_text %>% count(word, sort = TRUE)
Output showing the 10 most frequent words:
   word        n
 1 im      13012
 2 dont    11197
 3 feel     9168
 4 time     6697
 5 life     4464
 6 ive      4403
 7 people   4233
 8 told     4150
 9 friends  4045
10 love     3281
As explained by @Maurits Evers, the words in your data and the words in
stop_words
do not exactly match: tokenization stripped the apostrophes, so a token like "dont" in your data never matches "don't" in the lexicon. You may remove the apostrophes (') from the words in stop_words before joining. Try:
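For instance, a minimal sketch of that fix (assuming stringr is available; str_remove_all and the stop_words_clean name are illustrative, not necessarily the original answer's exact code):

library(stringr)

# Drop apostrophes so lexicon entries like "don't" become "dont",
# matching the tokens produced by unnest_tokens()
stop_words_clean <- stop_words %>%
  mutate(word = str_remove_all(word, "'"))

tidy_text <- tidy_text %>%
  anti_join(stop_words_clean, by = "word")

After this join, tokens such as im, dont and ive should be filtered out along with the other stop words.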