I am trying to find the top 5 keywords in reviews for every product category. I have the following code:
# Group by category and count keyword frequencies
keyword_counts <- filtered_data %>%
  group_by(category, keyword) %>%
  summarise(n = n()) %>%
  arrange(desc(n))

# Find the top 5 keywords in each category
top_keywords_by_category <- keyword_counts %>%
  group_by(category) %>%
  top_n(5, wt = n) %>%
  ungroup() # Ungroup the data

# Print the table
print(top_keywords_by_category)
which produces this output:
category keyword n
<chr> <chr> <int>
1 Computers&Accessories|Accessories&Peripherals|Cables&Accessori… product 354
2 Computers&Accessories|Accessories&Peripherals|Cables&Accessori… cable 277
3 Computers&Accessories|Accessories&Peripherals|Cables&Accessori… chargi… 200
4 Computers&Accessories|Accessories&Peripherals|Cables&Accessori… quality 179
5 Computers&Accessories|Accessories&Peripherals|Cables&Accessori… nice 147
6 Electronics|WearableTechnology|SmartWatches watch 129
7 Electronics|Mobiles&Accessories|Smartphones&BasicMobiles|Smart… phone 127
8 Electronics|HomeTheater,TV&Video|Televisions|SmartTelevisions tv 117
9 Electronics|WearableTechnology|SmartWatches product 102
10 Electronics|HomeTheater,TV&Video|Televisions|SmartTelevisions product 80
What I want instead is a result like this:
Category Computers&Accessories
Keyword n
1 Product 354
2 Cable 277
3 Chargi... 200
4 Quality 179
5 Nice 147
While this data is uninteresting, it should show you how to use tidyr::separate_rows. From here, you can do your top-5 filtering and pivoting:
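A minimal sketch of that step, under some assumptions: the toy tibble below is hypothetical and only stands in for filtered_data (one row per keyword occurrence), and keeping just the first "|"-delimited level of the category is a guess based on the desired output in the question. slice_max is used here in place of top_n, which dplyr has superseded.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for filtered_data
toy <- tibble(
  category = c(rep("Computers&Accessories|Cables", 4),
               rep("Electronics|Televisions", 3)),
  keyword  = c("cable", "cable", "nice", "product", "tv", "tv", "remote")
)

top_keywords <- toy %>%
  # Assumption: only the top-level category (text before the first "|") matters
  mutate(category = sub("\\|.*$", "", category)) %>%
  # Count keyword frequencies per category
  count(category, keyword, name = "n") %>%
  group_by(category) %>%
  # Keep at most 5 rows per category, highest counts first
  slice_max(n, n = 5, with_ties = FALSE) %>%
  mutate(rank = row_number()) %>%
  ungroup()

# One small keyword/n table per category, close to the layout in the question
by_category <- split(select(top_keywords, keyword, n), top_keywords$category)

# Alternatively, pivot so that each category becomes a column of ranked keywords
wide <- top_keywords %>%
  select(category, rank, keyword) %>%
  pivot_wider(names_from = category, values_from = keyword)

print(by_category)
print(wide)
```

With with_ties = FALSE each category is capped at exactly five rows; drop that argument if you would rather keep every keyword tied for fifth place, which is how top_n behaves.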