Displaying top 5 keywords for every category using group_by function

Question

55 Views Asked by kartik trivedi At 17 August 2025 at 08:05

I am trying to find the top 5 keywords in reviews for every product category, using the following code:

# Group by category and count keyword frequencies
keyword_counts <- filtered_data %>%
  group_by(category, keyword) %>%
  summarise(n = n()) %>%
  arrange(desc(n))

# Find the top 5 keywords in each category
top_keywords_by_category <- keyword_counts %>%
  group_by(category) %>%
  top_n(5, wt = n) %>%
  ungroup()  # Ungroup the data

# Print the table
print(top_keywords_by_category)

which produces this output:

   category                                                        keyword     n
   <chr>                                                           <chr>   <int>
 1 Computers&Accessories|Accessories&Peripherals|Cables&Accessori… product   354
 2 Computers&Accessories|Accessories&Peripherals|Cables&Accessori… cable     277
 3 Computers&Accessories|Accessories&Peripherals|Cables&Accessori… chargi…   200
 4 Computers&Accessories|Accessories&Peripherals|Cables&Accessori… quality   179
 5 Computers&Accessories|Accessories&Peripherals|Cables&Accessori… nice      147
 6 Electronics|WearableTechnology|SmartWatches                     watch     129
 7 Electronics|Mobiles&Accessories|Smartphones&BasicMobiles|Smart… phone     127
 8 Electronics|HomeTheater,TV&Video|Televisions|SmartTelevisions   tv        117
 9 Electronics|WearableTechnology|SmartWatches                     product   102
10 Electronics|HomeTheater,TV&Video|Televisions|SmartTelevisions   product    80

whereas I want one table per category, like this:

Category Computers&Accessories
Keyword             n
1 Product          354
2 Cable            277
3 Chargi...        200
4 Quality          179
5 Nice             147

Original Q&A

There is 1 best solution below

**r2evans** · Answer 1

While this data is uninteresting, it should show you how to use tidyr::separate_rows to split the pipe-delimited category column into one row per category level.

quux <- structure(list(category = c("Computers&Accessories|Accessories&Peripherals|Cables&Accessori…", "Computers&Accessories|Accessories&Peripherals|Cables&Accessori…", "Computers&Accessories|Accessories&Peripherals|Cables&Accessori…", "Computers&Accessories|Accessories&Peripherals|Cables&Accessori…", "Computers&Accessories|Accessories&Peripherals|Cables&Accessori…", "Electronics|WearableTechnology|SmartWatches", "Electronics|Mobiles&Accessories|Smartphones&BasicMobiles|Smart…", "Electronics|HomeTheater,TV&Video|Televisions|SmartTelevisions",  "Electronics|WearableTechnology|SmartWatches", "Electronics|HomeTheater,TV&Video|Televisions|SmartTelevisions"),
                       keyword = c("product", "cable", "chargi…", "quality", "nice", "watch", "phone", "tv", "product", "product"), 
                       n = c(354L, 277L, 200L, 179L, 147L, 129L, 127L, 117L, 102L, 80L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))

library(dplyr)
quux %>%
  tidyr::separate_rows(category, sep = "\\|") %>%
  count(category, keyword) %>%
  arrange(desc(n))
# # A tibble: 32 × 3
#    category                keyword     n
#    <chr>                   <chr>   <int>
#  1 Electronics             product     2
#  2 Accessories&Peripherals cable       1
#  3 Accessories&Peripherals chargi…     1
#  4 Accessories&Peripherals nice        1
#  5 Accessories&Peripherals product     1
#  6 Accessories&Peripherals quality     1
#  7 Cables&Accessori…       cable       1
#  8 Cables&Accessori…       chargi…     1
#  9 Cables&Accessori…       nice        1
# 10 Cables&Accessori…       product     1
# # ℹ 22 more rows
# # ℹ Use `print(n = ...)` to see more rows
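
Note that `count(category, keyword)` counts rows, so the existing `n` column in `quux` is not carried over (which is why every count above is 1 or 2). If the already-computed frequencies should be summed instead, one option is to pass them as weights via `wt`. A minimal sketch on a small made-up data frame (`toy` and its values are hypothetical, not the asker's data):

```r
library(dplyr)
library(tidyr)

# Hypothetical pre-counted data: one row per (category path, keyword)
toy <- data.frame(
  category = c("A|B", "A|B", "A|C"),
  keyword  = c("x", "y", "x"),
  n        = c(10L, 4L, 3L)
)

weighted <- toy %>%
  separate_rows(category, sep = "\\|") %>%  # one row per category level
  count(category, keyword, wt = n) %>%      # sum the existing counts instead of counting rows
  arrange(desc(n))

print(weighted)
```

With `wt = n`, each keyword's frequency is added to every level of its category path, so a keyword that appears under several sub-categories accumulates at the shared top level.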

From here, you can do your top-5 filtering and pivoting:

quux %>%
  tidyr::separate_rows(category, sep = "\\|") %>%
  count(category, keyword) %>%
  slice_max(n = 5, order_by = n, with_ties = FALSE) %>%
  tidyr::pivot_wider(names_from = category, values_from = n, values_fill = list(n = 0))
# # A tibble: 4 × 3
#   keyword Electronics `Accessories&Peripherals`
#   <chr>         <int>                     <int>
# 1 product           2                         1
# 2 cable             0                         1
# 3 chargi…           0                         1
# 4 nice              0                         1
