I want to apply the Apriori algorithm to a retail dataset (market basket data from a retail store). The data looks like this:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
30 31 32
33 34 35
36 37 38 39 40 41 42 43 44 45 46
38 39 47 48
38 39 48 49 50 51 52 53 54 55 56 57 58
32 41 59 60 61 62
3 39 48
To use the Apriori algorithm, I need to turn this data (a Python list of lists) into a NumPy array with one column per item, where each row marks the items in that transaction with 1 and all other items with 0.
Column names: 0 1 2 3 4 5 6 7 8 9 10 ...
Dataset:
0 1 2 3 4 5 6 7 8 9 10 .........30 31 32 33 34 35....
1 1 1 1 1 1 1 1 1 1 1...........0 0 0 0 0 0...
0 0 0 0 0 0 0 0 0 0 0...........1 1 1 0 0 0..
and so on..
For this I am trying to use TransactionEncoder:
import pandas as pd
dataset = pd.read_csv('retail.dat', header=None)
from mlxtend.preprocessing import TransactionEncoder
transactionEncoder = TransactionEncoder()
dataset = transactionEncoder.fit(dataset).transform(dataset)
dataset.astype('int')
print(dataset)
But I get this error:
TypeError: 'int' object is not iterable
I also want to attach the column names 0 1 2 ... to the newly formed dataset, but print(transactionEncoder.columns_) does not give valid columns. What could be the issue, and what is the correct way to apply TransactionEncoder to this dataset?
IIUC, you can stack the dataframe and try crosstab.
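A minimal sketch of the stack + crosstab idea, assuming retail.dat holds one whitespace-separated transaction per line as in the question. The in-memory sample (a few rows copied from the question) and the names transactions, long and onehot are only for illustration. The TypeError most likely comes from passing a DataFrame to TransactionEncoder: it expects a list of lists, and iterating over a DataFrame yields its integer column labels rather than the transactions.

```python
import io
import pandas as pd

# A few rows from the question, standing in for the real 'retail.dat'.
sample = io.StringIO(
    "30 31 32\n"
    "33 34 35\n"
    "38 39 47 48\n"
    "3 39 48\n"
)

# Each line is one transaction: a whitespace-separated list of item ids.
transactions = [[int(tok) for tok in line.split()]
                for line in sample if line.strip()]

# Ragged rows are padded with NaN in the DataFrame; stack() turns it into
# one entry per (transaction, position), and dropna() removes the padding.
long = (
    pd.DataFrame(transactions)
    .stack()
    .dropna()
    .astype(int)
    .rename("item")
    .rename_axis(["transaction", "position"])
    .reset_index()
)

# Cross-tabulate transaction number against item id to get the 0/1 matrix.
onehot = pd.crosstab(long["transaction"], long["item"])
onehot = onehot.clip(upper=1)  # in case an item repeats within one basket

print(onehot)
```

The columns of onehot are the item ids themselves, so no separate step is needed to attach column names. For the real file, the StringIO sample can be replaced with open('retail.dat').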