I am trying to build a recommendation system in Python with a very small dataset. In this case, I have more users than items (the SD column):
    SD                   user        value
29  Hutsa kafeinagabea   jolaizolaa     32
23  Hurradun Capuccinoa  ibaraia       117
37  Kafe Hutsa           ivelez          1
21  Hurradun Capuccinoa  aalonso         7
67  Txokolatea           dreguera        2
18  Ebakia kafeinagabea  dreguera        1
44  Kafe Luzea           ezugasti       17
17  Ebakia               xesnaola       67
71  Txokolatea           ivelez         88
40  Kafe Hutsa           miturbe         1
I build the sparse matrix from my dataset:
import implicit
from scipy import sparse

# Encode items (SD) and users as integer category codes
grouped_df['SD'] = grouped_df['SD'].astype("category")
grouped_df['user'] = grouped_df['user'].astype("category")
grouped_df['SD_id'] = grouped_df['SD'].cat.codes
grouped_df['user_id'] = grouped_df['user'].cat.codes
# Rows are items (SD_id), columns are users (user_id)
sparse_content_person = sparse.csr_matrix((grouped_df['value'].astype(float), (grouped_df['SD_id'], grouped_df['user_id'])))
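To see which axis of the matrix holds items and which holds users, the same construction can be run on a handful of rows and the shape compared with the number of categories. This is only a sketch: `toy_df` and `item_user` are hypothetical names, and the rows are copied from the sample above.

```python
import pandas as pd
from scipy import sparse

# A few rows from the sample above; toy_df is a hypothetical stand-in for grouped_df.
toy_df = pd.DataFrame({
    "SD": ["Txokolatea", "Ebakia", "Kafe Hutsa", "Txokolatea", "Kafe Hutsa"],
    "user": ["dreguera", "xesnaola", "ivelez", "ivelez", "miturbe"],
    "value": [2, 67, 1, 88, 1],
})
toy_df["SD"] = toy_df["SD"].astype("category")
toy_df["user"] = toy_df["user"].astype("category")
toy_df["SD_id"] = toy_df["SD"].cat.codes
toy_df["user_id"] = toy_df["user"].cat.codes

# Same layout as sparse_content_person: row index = SD_id, column index = user_id.
item_user = sparse.csr_matrix(
    (
        toy_df["value"].astype(float).to_numpy(),
        (toy_df["SD_id"].to_numpy(), toy_df["user_id"].to_numpy()),
    )
)

n_items = toy_df["SD"].cat.categories.size
n_users = toy_df["user"].cat.categories.size
print(item_user.shape, (n_items, n_users))
```

With this layout the first dimension equals the number of distinct items and the second the number of distinct users, so any index larger than the first dimension cannot refer to a row of this matrix.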
Then I train my model:
model = implicit.als.AlternatingLeastSquares(factors=10, regularization=0.1, iterations=50)
alpha = 1
data = (sparse_content_person * alpha).astype('double')
model.fit(data)
Then I try to get the items most similar to the item with code/index 11:
model.similar_items(11)
and I get the following result:
(array([11, 2, 19, 9, 3, 23, 20, 6, 13, 24], dtype=int32),
array([1. , 1. , 0.9999995 , 0.8310417 , 0.8063449 ,
0.7573321 , 0.6772214 , 0.6740392 , 0.4807193 , 0.47637755],
dtype=float32))
These indices don't exist as item codes, since I only have 16 items. I could look up the product index from the sparse_content_person matrix, but I don't know whether that is correct. I am following a tutorial in which they get the information directly from the dataframe (tutorial link: https://towardsdatascience.com/building-a-collaborative-filtering-recommender-system-with-clickstream-data-dffc86c8c65).
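One way to check what a returned index refers to is to map it back through the category lookups, since `cat.categories` lists the labels in code order. The sketch below uses a few rows from the sample; `toy_df`, `sd_labels`, `user_labels`, and `lookup` are hypothetical names. If an index is out of range for the SD categories but valid for the user categories, the model may be indexing the other axis of the matrix. That would happen if the installed `implicit` version expects a users × items matrix in `fit`, which is an assumption to verify against the documentation for that version.

```python
import pandas as pd

# A few rows from the sample above; toy_df is a hypothetical stand-in for grouped_df.
toy_df = pd.DataFrame({
    "SD": ["Txokolatea", "Ebakia", "Kafe Hutsa", "Txokolatea", "Kafe Hutsa"],
    "user": ["dreguera", "xesnaola", "ivelez", "ivelez", "miturbe"],
})
sd_cat = toy_df["SD"].astype("category")
user_cat = toy_df["user"].astype("category")

# cat.categories is ordered by code, so enumerate() yields code -> label.
sd_labels = dict(enumerate(sd_cat.cat.categories))
user_labels = dict(enumerate(user_cat.cat.categories))


def lookup(code, labels):
    """Return the label for an integer code, or None if the code is out of range."""
    return labels.get(int(code))


print(lookup(2, sd_labels))    # a valid item code
print(lookup(3, sd_labels))    # None: there are only 3 items in this toy frame
print(lookup(3, user_labels))  # the same index is a valid user code
```

Applied to the real data, running the indices returned by `similar_items` through both lookups shows whether they line up with the 16 item codes or with the user codes.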