I'm trying to do some sentiment analysis but am unfortunately stuck at the very beginning: I can't even import the file.
The data is located here: http://snap.stanford.edu/data/web-FineFoods.html
It is a 353MB .txt file and looks like this:
product/productId: B001E4KFG0
review/userId: A3SGXH7AUHU8GW
review/profileName: delmartian
review/helpfulness: 1/1
review/score: 5.0
review/time: 1303862400
review/summary: Good Quality Dog Food
review/text: I have bought several of the Vitality canned dog food products and have
found them all to be of good quality. The product looks more like a stew than a
processed meat and it smells better. My Labrador is finicky and she appreciates this
product better than most.
My attempts have all thrown this data into a single column and I'm unsure how I should go about sorting these out correctly in order to process them into tidytext.
I would be happy with one column per field, using the labels at the start of each line as the headers.
Appreciate any direction.
Here's one way to do it with dplyr and tidyr -
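A minimal sketch of that kind of approach, under a few assumptions: each field sits on its own line in the raw file (the wrapped review/text in the question being a display artifact), every record starts with a product/productId line, and records are separated by blank lines. The sample lines are typed inline so the snippet runs on its own; the second record is invented for illustration, and the file name in the comment is hypothetical.

```r
library(dplyr)
library(tidyr)

# For the real data you would read the file instead, e.g.
#   lines <- readLines("finefoods.txt")   # hypothetical local file name
lines <- c(
  "product/productId: B001E4KFG0",
  "review/userId: A3SGXH7AUHU8GW",
  "review/profileName: delmartian",
  "review/helpfulness: 1/1",
  "review/score: 5.0",
  "review/time: 1303862400",
  "review/summary: Good Quality Dog Food",
  "review/text: I have bought several of the Vitality canned dog food products.",
  "",
  # Invented second record, only here to show how records are told apart
  "product/productId: B000000000",
  "review/userId: A0000000000000",
  "review/profileName: example user",
  "review/helpfulness: 0/0",
  "review/score: 1.0",
  "review/time: 1300000000",
  "review/summary: Made-up summary",
  "review/text: Made-up text: it even contains a colon."
)

reviews <- tibble(line = lines) %>%
  # Drop the blank separator lines
  filter(line != "") %>%
  # Split at the first ": " only, so colons inside the text are kept
  separate(line, into = c("key", "value"), sep = ": ",
           extra = "merge", fill = "right") %>%
  mutate(
    # Strip the "product/" or "review/" prefix to get clean column names
    key = sub("^(product|review)/", "", key),
    # A new record starts at every productId line
    review_id = cumsum(key == "productId")
  ) %>%
  # One row per record, one column per field
  pivot_wider(id_cols = review_id, names_from = key, values_from = value) %>%
  mutate(score = as.numeric(score))

print(reviews)
```

This gives one row per review with columns such as productId, userId, score, summary and text. From there the text column can go straight into tidytext::unnest_tokens(). If the review text really is split across several lines in your copy of the file, the continuation lines would need to be pasted back onto the preceding review/text line before the separate() step.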