Amazon Personalize: remove "viewed" items from recommendations


We have a Reddit-style app where users browse an endless feed of images. In each session, users can view anywhere from 500 to 1,000 images.

We're using Personalize and we need to filter out any items that users have already viewed. Unfortunately, Personalize only uses the last 100 user interactions, so after a while the same items get recommended over and over again. This is especially noticeable when a user has a longer session and browses 500+ images in one go: most of the recommended items end up being the same.

From the docs:

Amazon Personalize considers up to 100 of the most recent interactions per user per event type. This is an adjustable quota.

What's the best way to handle this type of use case? Is there some architecture pattern we can follow for this scenario? Seems like a common social media scenario, but the current restriction on 100 events per filter is pretty limiting.

Should we ask for a quota increase? We're not sure how much more expensive this would be, and the docs are not clear on the maximum number of interactions we could request. Is it 500, 5,000, 50,000+ or even more? Does the AWS bill increase exponentially the more historical interactions filters can use?

Any guidance here would be super helpful! Thanks!

We've tried using just regular filters but it breaks down after 100+ viewed items per session.

We've also thought about bucketing view events, e.g. view_1-100, view_100-200, view_200-300, and being clever about how we store events in each bucket, but this seems hacky and inefficient.
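To make the bucketing idea concrete, here is a minimal sketch of one way it could work. The function name, the `view_bucket_N` naming scheme, and the default of 5 buckets are all hypothetical, not anything Personalize prescribes. It maps a user's running view count to an event type name and cycles through the buckets, so the application would still have to track that counter per user.

```python
def view_bucket_event_type(view_number: int,
                           bucket_size: int = 100,
                           num_buckets: int = 5) -> str:
    """Map a user's running view count (1-based) to a bucketed event type.

    Hypothetical scheme: views 1-100 go to view_bucket_0, 101-200 to
    view_bucket_1, and so on, wrapping around after num_buckets buckets.
    """
    if view_number < 1:
        raise ValueError("view_number is 1-based and must be >= 1")
    # Reuse bucket names cyclically. Because only the most recent
    # interactions per event type are considered, older views in a
    # reused bucket drop out of the filter's window.
    bucket = ((view_number - 1) // bucket_size) % num_buckets
    return f"view_bucket_{bucket}"
```

With these defaults, the 250th view would be recorded with event type `view_bucket_2`, and the 501st would wrap back around to `view_bucket_0`.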


There are 2 best solutions below

2
James J

Requesting a quota increase is the easiest and cleanest solution. You should be able to accommodate the 500-1000 image scenario within the upper limit with a single view event type. There is no additional cost for filtering, so that should not be a consideration.

Bucketing view events by event type will only get you so far, as the hard limit on event types is 10 (assuming you don't have any interaction metadata fields). Although a bit hacky, it could be used to increase your effective maximum. Assuming view was your only event type, your filter could use a wildcard:

EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("*")

Either way, requesting a quota increase will give you more interactions to filter.
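As a rough illustration, a filter like this could be registered with the boto3 `personalize` client's `create_filter` call. This is a sketch under assumptions: the helper name, the default filter name, and the ARN are placeholders, and the wildcard is written in quotes (`"*"`), which is how the filter expression syntax is understood to expect it.

```python
# Filter expression excluding items the user has interacted with,
# regardless of event type. The quoted wildcard is an assumption about
# the expected syntax.
VIEWED_FILTER_EXPRESSION = 'EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("*")'


def create_viewed_items_filter(personalize_client,
                               dataset_group_arn: str,
                               name: str = "exclude-viewed-items") -> str:
    """Create the exclusion filter and return its ARN.

    personalize_client is expected to be boto3.client("personalize");
    the filter name is a hypothetical placeholder.
    """
    response = personalize_client.create_filter(
        name=name,
        datasetGroupArn=dataset_group_arn,
        filterExpression=VIEWED_FILTER_EXPRESSION,
    )
    return response["filterArn"]


# Usage (requires AWS credentials; the ARN below is a placeholder):
#   import boto3
#   client = boto3.client("personalize")
#   filter_arn = create_viewed_items_filter(client, "arn:aws:personalize:...")
```

The returned filter ARN is then passed as `filterArn` when requesting recommendations.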

0
Benedikt Lueth

We are also thinking about using Amazon Personalize for our social media feed. My strategy would be to request only, say, 20 posts per recommendation call. By streaming the view events to Personalize in real time, the next 20 posts would then already be filtered accordingly. This way, as I see it, you avoid the problem that with 500 requested posts only the last 100 interactions are taken into account, and you get pagination at the same time.
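A minimal sketch of that loop, assuming boto3's `personalize-runtime` and `personalize-events` clients are passed in. The helper names, the "view" event type, and the ARNs and tracking ID are placeholders. View events are sent in batches of 10 on the assumption that `put_events` accepts at most 10 events per call.

```python
from datetime import datetime, timezone


def fetch_page(runtime_client, campaign_arn, user_id, filter_arn, page_size=20):
    """Request one small page of recommendations with the filter applied."""
    response = runtime_client.get_recommendations(
        campaignArn=campaign_arn,
        userId=user_id,
        numResults=page_size,
        filterArn=filter_arn,
    )
    return [item["itemId"] for item in response["itemList"]]


def record_views(events_client, tracking_id, user_id, session_id,
                 item_ids, event_type="view", batch_size=10):
    """Stream view events so the next page request can exclude these items."""
    for start in range(0, len(item_ids), batch_size):
        batch = item_ids[start:start + batch_size]
        events_client.put_events(
            trackingId=tracking_id,
            userId=user_id,
            sessionId=session_id,
            eventList=[
                {
                    "eventType": event_type,
                    "itemId": item_id,
                    "sentAt": datetime.now(timezone.utc),
                }
                for item_id in batch
            ],
        )


# Usage (requires AWS credentials; identifiers are placeholders):
#   import boto3
#   runtime = boto3.client("personalize-runtime")
#   events = boto3.client("personalize-events")
#   page = fetch_page(runtime, campaign_arn, "user-1", filter_arn)
#   record_views(events, tracking_id, "user-1", "session-1", page)
```

In practice the view events would more likely be sent as the user actually scrolls past each post, rather than for the whole page at once.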