I'm trying to set up learning to rank with LightGBM. I have the following dataset of user interactions per query:
import pandas as pd

df = pd.DataFrame({'QueryID': [1, 1, 1, 2, 2, 2],
                   'ItemID': [1, 2, 3, 1, 2, 3],
                   'Position': [1, 2, 3, 1, 2, 3],
                   'Interaction': ['CLICK', 'VIEW', 'BOOK', 'BOOK', 'CLICK', 'VIEW']})
How do I properly set up this dataset for training? The docs mention using Dataset.set_group(), but it's not clear how.
Before you can set the groups, you need to create a score variable (the dependent variable, i.e. a relevance label for each row) and then generate a train file and a test file. On top of that, you need two group files, one for train and one for test. Each group file lists how many rows belong to each qid (i.e. QueryID), in the same order in which the queries appear in the data, so all rows of a query must be contiguous.
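As a rough illustration of the steps above, here is a minimal sketch that builds the label and the group sizes in memory instead of writing separate files. The relevance mapping (VIEW=0, CLICK=1, BOOK=2), the helper names build_rank_inputs and to_lgb_dataset, and the use of Position as the only feature are assumptions made for this example, not something given in the question; real data would need proper query/item features.

```python
import pandas as pd

# Assumed relevance grades; this mapping is illustrative, not from the question.
RELEVANCE = {"VIEW": 0, "CLICK": 1, "BOOK": 2}


def build_rank_inputs(df):
    """Return (features, labels, group sizes) for a ranking dataset."""
    # Rows of the same query must be contiguous; a stable sort keeps
    # the original row order inside each query.
    df = df.sort_values("QueryID", kind="stable").reset_index(drop=True)
    # Score variable: map each interaction to an integer relevance grade.
    labels = df["Interaction"].map(RELEVANCE)
    # One entry per query: how many consecutive rows belong to it.
    group = df.groupby("QueryID", sort=False).size().tolist()
    # Position is only a stand-in for real features (assumption).
    features = df[["Position"]]
    return features, labels, group


def to_lgb_dataset(features, labels, group):
    """Wrap the prepared inputs in a LightGBM Dataset (hypothetical helper)."""
    # Imported here so the data preparation above runs without LightGBM.
    import lightgbm as lgb

    # Passing group= here has the same effect as calling set_group() later.
    return lgb.Dataset(features, label=labels, group=group)


df = pd.DataFrame({'QueryID': [1, 1, 1, 2, 2, 2],
                   'ItemID': [1, 2, 3, 1, 2, 3],
                   'Position': [1, 2, 3, 1, 2, 3],
                   'Interaction': ['CLICK', 'VIEW', 'BOOK', 'BOOK', 'CLICK', 'VIEW']})

X, y, group = build_rank_inputs(df)
print(group)  # [3, 3]
```

The group list sums to the number of rows, which is what LightGBM expects. When splitting into train and test, split by QueryID so that no query is divided between the two sets, and compute the group sizes separately for each. The Dataset returned by to_lgb_dataset can then be passed to lgb.train with a ranking objective such as "lambdarank".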
See this article for a worked example: https://medium.com/@tacucumides/learning-to-rank-with-lightgbm-code-example-in-python-843bd7b44574