I am planning to do text analysis in R, similar to sentiment analysis, but with my own custom dictionary following a "trade" versus "law" logic.
I have all the required words for the dictionary in an Excel file. It looks like this:
> % 1 Trade 2 Law % business 1 exchange 1 industry 1 rule 2
> settlement 2 umpire 2 court 2 tribunal 2 lawsuit 2 bench 2
> courthouse 2 courtroom 2
What steps do I need to take to transform this into an R-suitable format and apply it to my text corpus?
Thank you for your help!
Create a data.frame with two columns (the word and its category) and store it somewhere, either as an .rds file, a database table, or in Excel, so that you can load it whenever you need it.
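As a minimal sketch of that step, the dictionary from the question could be typed in by hand and saved as an .rds file. The object name `trade_law_dict`, the column names `word` and `sector`, and the temporary file path are assumptions made for illustration; with the real Excel file you would read the words in with a package of your choice instead of typing them.

```r
# Hypothetical dictionary: words taken from the question, 1 = trade, 2 = law.
trade_law_dict <- data.frame(
  word = c("business", "exchange", "industry",
           "rule", "settlement", "umpire", "court", "tribunal",
           "lawsuit", "bench", "courthouse", "courtroom"),
  sector = c(rep(1L, 3), rep(2L, 9)),
  stringsAsFactors = FALSE
)

# Store it once; the path is a placeholder, use any location you like.
dict_path <- file.path(tempdir(), "trade_law_dict.rds")
saveRDS(trade_law_dict, dict_path)

# Load it again whenever it is needed.
dict_reloaded <- readRDS(dict_path)
```

Keeping the dictionary in a plain two-column layout means the same object works with joins in base R, dplyr, or tidytext.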
Once you have the data in a data.frame, you can use joins or dictionary lookups to match it to the words in your text corpus. In the scoring data.frame I used 1 and 2 to represent the sectors (trade and law), but you can use words as well.
See the example using tidytext, but read up on sentiment analysis and use whichever package suits your needs.
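To illustrate the join idea without any packages, here is a rough base-R sketch. The two sample texts, the shortened dictionary, and all object names are made up for illustration, and the tokenizer is deliberately naive (lowercase, split on anything that is not a letter a-z).

```r
# Hypothetical mini dictionary: 1 = trade, 2 = law.
dict <- data.frame(
  word = c("business", "industry", "court", "tribunal"),
  sector = c(1L, 1L, 2L, 2L),
  stringsAsFactors = FALSE
)

# Made-up example documents.
texts <- c(
  doc1 = "The court ruled that the industry must change its business model.",
  doc2 = "The tribunal and the court disagreed."
)

# Naive tokenization: one vector of lowercase words per document.
tokens <- lapply(unname(texts), function(x) {
  words <- strsplit(tolower(x), "[^a-z]+")[[1]]
  words[nzchar(words)]
})

# One row per token, keeping track of the document it came from.
tokens_df <- data.frame(
  doc = rep(names(texts), lengths(tokens)),
  word = unlist(tokens),
  stringsAsFactors = FALSE
)

# Inner join: only tokens that appear in the dictionary survive.
matched <- merge(tokens_df, dict, by = "word")

# Count dictionary hits per document and sector.
counts <- table(matched$doc, matched$sector)
```

Here `counts` is a document-by-sector table of hits, which can then be turned into whatever score you need, for example the share of law words per document.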
Data: