I've run the Apriori algorithm on the "Groceries" dataset that ships with the "arules" package in R.
I already have the desired output: a data frame showing the Apriori rules with their corresponding support, confidence and lift values.
But is there any way to extract, into a separate data frame, all the rows of the main "Groceries" dataset that correspond to each rule?
For instance, rule no. 1 has a count of 68 rows, as reported in the last column of the output data frame. How can I pull those 68 rows out of the main dataset into a new data frame?
Is there a built-in function for this, or some other handy way of doing it?
The following is the code for getting the Apriori Rules output shown above:
## Example: Identifying Frequently-Purchased Groceries ----
## Step 2: Exploring and preparing the data ----
# load the grocery data into a sparse matrix
library(arules)
data("Groceries")
summary(Groceries)
# look at the first five transactions
inspect(Groceries[1:5])
# examine the frequency of items
itemFrequency(Groceries[, 1:3])
# plot the frequency of items
itemFrequencyPlot(Groceries, support = 0.1)
itemFrequencyPlot(Groceries, topN = 20)
# a visualization of the sparse matrix for the first five transactions
image(Groceries[1:5])
# visualization of a random sample of 100 transactions
image(sample(Groceries, 100))
## Step 3: Training a model on the data ----
library(arules)
# default settings result in zero rules learned
apriori(Groceries)
# set better support and confidence levels to learn more rules
groceryrules <- apriori(Groceries,
                        parameter = list(support = 0.006,
                                         confidence = 0.25,
                                         minlen = 2))
groceryrules
## Step 4: Evaluating model performance ----
# summary of grocery association rules
summary(groceryrules)
# look at the first three rules
inspect(groceryrules[1:3])
## Step 5: Improving model performance ----
# sorting grocery rules by lift
inspect(sort(groceryrules, by = "lift")[1:5])
# finding subsets of rules containing any berry items
berryrules <- subset(groceryrules, items %in% "berries")
inspect(berryrules)
# writing the rules to a CSV file
write(groceryrules, file = "groceryrules.csv",
      sep = ",", quote = TRUE, row.names = FALSE)
# converting the rule set to a data frame
groceryrules_df <- as(groceryrules, "data.frame")
str(groceryrules_df)
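One possible way to do what I'm after, as a minimal sketch rather than a definitive answer: take the item labels of a single rule (LHS and RHS together) and keep only the transactions that contain all of them, using the `%ain%` ("all in") operator from arules. The helper name `rule_transactions` and the column names `row` and `items` are hypothetical, chosen just for illustration. The sketch also assumes an arules version that reports a `count` quality measure, as in the output described above.

```r
library(arules)
data("Groceries")

# Same thresholds as in the script above; verbose = FALSE only silences the log.
groceryrules <- apriori(Groceries,
                        parameter = list(support = 0.006,
                                         confidence = 0.25,
                                         minlen = 2),
                        control = list(verbose = FALSE))

# Hypothetical helper (not part of arules): returns, as a data frame, the
# transactions that contain every item (LHS and RHS) of a single rule.
rule_transactions <- function(rule, trans) {
  rule_items <- unlist(LIST(items(rule)))  # item labels of the rule
  hits <- trans %ain% rule_items           # TRUE where all items are present
  data.frame(row   = which(hits),          # position in the original dataset
             items = labels(trans[hits]),  # basket contents as "{a,b,...}"
             stringsAsFactors = FALSE)
}

rule1_df <- rule_transactions(groceryrules[1], Groceries)
head(rule1_df)

# Sanity check: the number of rows should equal the rule's count
nrow(rule1_df) == quality(groceryrules)$count[1]
```

The `row` column can be used to index back into the original data, for example `Groceries[rule1_df$row]`. arules also provides `supportingTransactions()`, which returns a `tidLists` object mapping each rule to the transactions that support it; that may be the more convenient route when all rules are needed at once.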