The scenario is like this:
I am trying to make a recommender using apache mahaout and i have some sample preference(user,item,preference value) data for generating the similarity matrix and determining item-item similarities. But the actual preference data is much larger than the sample preference data. The list of item IDs that are present in the actual preference data are all present in the sample preference data as well. But the User ids in sample data are much lesser than the actual data.
Now, when i try to run the recommender on the actual data, it keeps giving me error that user id does not exist because it was not present in the sample data. How can i inject new user ids and their preferences in the recommender of mahout so that it can generate recommendations for any user on the fly based on item-item similarity? Or if there is any other way possible to generated recommendations for a new user, then please suggest.
Thanks.
If you think your sample data is complete for computing the item-item similarities, why don't you precompute them and use
Collection<GenericItemSimilarity.ItemItemSimilarity> corrMatrix = new ArrayList<GenericItemSimilarity.ItemItemSimilarity>();
to store your precomputed similarities. Then from this you can create yourItemSimilarity
like this:ItemSimilarity similarity = new GenericItemSimilarity(correlationMatrix);
I think it is not good idea for using sample of your data for computing item-item similarities based on the preference values, because you might be missing a lot of useful data. If you think that computing it on the fly is slow, you can always precomputed it and store it in a database, and load it when needed.
If you are still getting this error, than you probably use your sample data model in the recommendation class, or you use
UserSimilarity
to compute the item similarities.If you want to add new user you can either use Mahout's
FileDataModel
and update the file periodically by including new users (I think you can create new file with some suffix, I am not sure). You can find more about this in the book Mahout in Action. The in-memoryDataModel
implementations are immutable. You can extend them by implementing the methodssetPreference()
andremovePreference()
.EDIT: I have an implementation for
MutableDataModel
that extends theAbstractDataModel
. I can share it with you if you want.