How to extract the rows (transactions) of the data that match each rule generated by Apriori in R?


I've run the Apriori algorithm on the "Groceries" dataset that ships with the "arules" package in R.

I already have the desired output as a data frame showing the Apriori rules with their support, confidence, lift and count values: [screenshot of the rules data frame]

Is there a way to extract, into a separate data frame, all the rows of the main "Groceries" dataset that correspond to each rule?

For instance, Rule no. 1 has a count of 68 rows, as shown in the last column of the output data frame above. How can I pull those 68 rows out of the main dataset into a new data frame?

Is there a built-in function for this, or another handy way of doing it?

The following is the code for getting the Apriori Rules output shown above:

## Example: Identifying Frequently-Purchased Groceries ----
## Step 2: Exploring and preparing the data ----

# load the grocery data into a sparse matrix
library(arules)
data("Groceries")
summary(Groceries)

# look at the first five transactions
inspect(Groceries[1:5])

# examine the frequency of items
itemFrequency(Groceries[, 1:3])

# plot the frequency of items
itemFrequencyPlot(Groceries, support = 0.1)
itemFrequencyPlot(Groceries, topN = 20)

# a visualization of the sparse matrix for the first five transactions
image(Groceries[1:5])

# visualization of a random sample of 100 transactions
image(sample(Groceries, 100))

## Step 3: Training a model on the data ----
library(arules)

# default settings result in zero rules learned
apriori(Groceries)

# set better support and confidence levels to learn more rules
groceryrules <- apriori(Groceries,
                        parameter = list(support = 0.006,
                                         confidence = 0.25, minlen = 2))
groceryrules

## Step 4: Evaluating model performance ----
# summary of grocery association rules
summary(groceryrules)

# look at the first three rules
inspect(groceryrules[1:3])

## Step 5: Improving model performance ----

# sorting grocery rules by lift
inspect(sort(groceryrules, by = "lift")[1:5])

# finding subsets of rules containing any berry items
berryrules <- subset(groceryrules, items %in% "berries")
inspect(berryrules)

# writing the rules to a CSV file
write(groceryrules, file = "groceryrules.csv",
      sep = ",", quote = TRUE, row.names = FALSE)

# converting the rule set to a data frame
groceryrules_df <- as(groceryrules, "data.frame")
str(groceryrules_df)
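
One possible way to get the rows behind a single rule is sketched below. It assumes that "the rows corresponding to a rule" means the transactions containing every item on both sides of the rule, which is how the count column is defined. The names rule1, rule_items, sel, matching and matching_df are made up for illustration. The sketch relies on items(), LIST() and %ain% from arules and should be read as a starting point, not a definitive answer.

```r
library(arules)
data("Groceries")

# same parameters as in the script above; verbose = FALSE only silences the log
groceryrules <- apriori(Groceries,
                        parameter = list(support = 0.006,
                                         confidence = 0.25, minlen = 2),
                        control = list(verbose = FALSE))

# pick one rule; items() returns its lhs and rhs items combined
rule1 <- groceryrules[1]
rule_items <- LIST(items(rule1))[[1]]

# TRUE for every transaction that contains all of the rule's items
sel <- Groceries %ain% rule_items
matching <- Groceries[sel]

# one row per matching transaction: its position in Groceries and its items
matching_df <- data.frame(row = which(sel), items = labels(matching))
str(matching_df)
```

The number of rows in matching_df should equal quality(rule1)$count for the same rule. To cover every rule, the same steps could be wrapped in lapply() over seq_along(groceryrules). arules also provides supportingTransactions(), which returns the supporting transaction IDs for a whole rule set at once and may be a more direct route.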
