R - Text Analysis - Print specific texts that contain a bi-gram

Text analysis with R.

My dataset is 2000 comments from 2000 different surveys. I have created bi-grams. I checked word frequencies, then ran a word cluster analysis with hclust(), then looked at word associations with findAssocs(), for example findAssocs(bigram_dtm, "long time", 0.2).

In that output I see that "long time" has an association of 0.66 with "felt waiting".

I have searched online without success so far. Questions: Is there any way to print the comments in which these two bi-grams occur together? Is there any way to print the comments that contain "long time"?

Thanks,

There are 1 best solutions below

G5W On

I think what you are looking for is grep(). It returns the indices of the comments that match a pattern, and you can use those indices to pull out the comments themselves.

# Example comments to search
Comments <- c("I haven't seen you in a long time.",
    "There is no U in TEAM, but it does contain ME.",
    "In extreme cases, read the documentation.",
    "A big computer, a complex algorithm and a long time does not equal science.",
    "Use the source, Luke!")

# Indices of the comments that contain "long time"
grep("long time", Comments)
# [1] 1 4

# The matching comments themselves
Comments[grep("long time", Comments)]
# [1] "I haven't seen you in a long time."
# [2] "A big computer, a complex algorithm and a long time does not equal science."

(Some comments borrowed from fortune() in the fortunes package.)
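
For the first question (comments where both bi-grams occur together), one possible approach is to use grepl() instead of grep(), since it returns a TRUE/FALSE value per comment and two results can be combined with &. The following is only a sketch: the `comments` vector and the variable names are made up for illustration, and it assumes the bi-grams appear literally in the text being searched. If the bi-grams were built after removing stop words or stemming, a bi-gram such as "felt waiting" may not occur verbatim in the raw comments, so it may be necessary to search the preprocessed text instead.

```r
# Hypothetical example data; replace with your own character vector of comments.
comments <- c("I felt waiting a long time was unfair.",
              "It took a long time to get an answer.",
              "Felt waiting was fine.",
              "No complaints here.")

# Lower-case a copy so that matching does not depend on capitalisation.
lower <- tolower(comments)

# grepl() returns one logical value per comment.
# fixed = TRUE treats the pattern as a literal string, not a regular expression.
has_long_time    <- grepl("long time", lower, fixed = TRUE)
has_felt_waiting <- grepl("felt waiting", lower, fixed = TRUE)

# Comments that contain both bi-grams
both <- comments[has_long_time & has_felt_waiting]
print(both)

# Row numbers of those comments, useful for looking them up in the original survey data
print(which(has_long_time & has_felt_waiting))
```

Replacing & with | would instead give the comments that contain either bi-gram.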