How to keep the transaction numbers related to association rules created with mlxtend in Python?


After I run the apriori function in mlxtend, the transaction numbers associated with the resulting frequent itemsets are not kept. They are dropped.

How can I keep the transaction numbers?

import pandas as pd
from mlxtend.frequent_patterns import apriori

df = pd.read_csv('association_rule_items_fullmech.csv')

basket = (df.groupby(['transaction_doc', 'text'])['transaction_doc'].sum().unstack().reset_index().fillna(0).set_index('transaction_doc'))


# one-hot encoding
def encode_units(x):
    if x <= 0:
        return 0
    elif x >= 1:
        return 1

basket_sets = basket.applymap(encode_units)
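As a side note, summing `transaction_doc` within each group appears to give 0 for a document whose id is 0, so that document's items would be encoded as absent. A minimal sketch of an alternative, assuming the CSV has one row per (transaction_doc, text) pair, is to count occurrences with `pd.crosstab`. The toy `df` below is made up for illustration and stands in for the real file:

```python
import pandas as pd

# Hypothetical stand-in for the CSV: one row per (transaction_doc, text) pair.
df = pd.DataFrame({
    "transaction_doc": [0, 0, 1, 1, 2],
    "text": ["string1", "string3", "string1", "string2", "string3"],
})

# Count how often each text occurs in each document, then reduce the counts
# to booleans. transaction_doc stays as the index of the result.
basket_sets = pd.crosstab(df["transaction_doc"], df["text"]).astype(bool)

print(basket_sets)
```

The result keeps `transaction_doc` as its index, which is what later makes it possible to look up the documents behind each itemset.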

The basket_sets dataframe essentially looks like this (a small arbitrary example with the same structure):

transaction_doc   "string1"   "string2"   "string3"
0                 1           0           1
1                 1           1           0
2                 0           0           1

I then apply the apriori function:

frequent_itemsets = apriori(basket_sets, min_support=0.001, use_colnames=True)

However, after the apriori function runs, transaction_doc, which indicates which document each text comes from, disappears from the index. I get a reset index column alongside the frequent itemsets. I want to be able to retain the transaction_doc column after apriori is applied.
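One possible way to get at this, sketched under assumptions: a frequent itemset summarizes many transactions, so it has no single transaction_doc of its own, but the documents that contain it can be looked up afterwards from basket_sets. In the sketch below, the toy `basket_sets` and the hand-written `frequent_itemsets` frame (with the `support` and `itemsets` columns that apriori returns when `use_colnames=True`) are stand-ins for the real data, and `transactions_containing` is a hypothetical helper, not part of mlxtend:

```python
import pandas as pd

# Toy stand-in for basket_sets: the index is transaction_doc, columns are items.
basket_sets = pd.DataFrame(
    {"string1": [1, 1, 0], "string2": [0, 1, 0], "string3": [1, 0, 1]},
    index=pd.Index([0, 1, 2], name="transaction_doc"),
)

# Hand-written stand-in for the apriori output; in the real pipeline this
# frame would come from apriori(basket_sets, ..., use_colnames=True).
frequent_itemsets = pd.DataFrame({
    "support": [2 / 3, 1 / 3, 2 / 3, 1 / 3, 1 / 3],
    "itemsets": [
        frozenset({"string1"}),
        frozenset({"string2"}),
        frozenset({"string3"}),
        frozenset({"string1", "string2"}),
        frozenset({"string1", "string3"}),
    ],
})


def transactions_containing(itemset, baskets):
    """Return the index labels of rows in which every item of `itemset` is set."""
    mask = baskets[list(itemset)].astype(bool).all(axis=1)
    return baskets.index[mask].tolist()


# Attach the list of supporting transaction_doc values to every itemset.
frequent_itemsets["transaction_docs"] = frequent_itemsets["itemsets"].apply(
    lambda items: transactions_containing(items, basket_sets)
)

print(frequent_itemsets)
```

This scans basket_sets once per itemset, which may be slow with a very low min_support and many itemsets; it is meant as an illustration of the lookup rather than a tuned implementation.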
