How to evaluate a content-based recommendation system


I have created a content-based recommender that recommends 10 similar products based on their descriptions. Now I want to evaluate its accuracy and efficiency. Everything worked well until I tried to evaluate the accuracy. The formulas I found on Google evaluate accuracy from rating values, comparing predicted and actual ratings (for example, RMSE). I did not convert the similarity scores into ratings (on a scale from 1 to 5), so I couldn't apply any of them.

I have used cosine similarity and a TF-IDF vectorizer. When I tried Surprise for cross-validation, a "no raw rating" error occurred. I need some metric to evaluate the recommender's accuracy and efficiency.
Code for TF-IDF:

from sklearn.feature_extraction.text import TfidfVectorizer

# Remove English stop words
tfidf = TfidfVectorizer(stop_words='english')

# Replace NaN descriptions with an empty string
df1['product_desc'] = df1['product_desc'].fillna('')

# Construct the TF-IDF matrix (one row per product)
tfidf_matrix = tfidf.fit_transform(df1['product_desc'])
tfidf_matrix.shape

And cosine similarity:

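# Cosine similarity: TF-IDF rows are L2-normalised by default,
# so the plain dot product from linear_kernel equals the cosine
import pandas as pd
from sklearn.metrics.pairwise import linear_kernel
cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)

# Reverse map from product name to row index
indices = pd.Series(df1.index, index=df1['product_name']).drop_duplicates()

def get_recommendation(title, cosine_sim=cosine_sim):
    idx = indices[title]
    sim_score = list(enumerate(cosine_sim[idx]))
    sim_score = sorted(sim_score, key=lambda x: x[1], reverse=True)
    # Skip position 0 (the product itself) and keep the 10 most similar
    sim_score = sim_score[1:11]
    p_indices = [i[0] for i in sim_score]
    # Return the recommended product names, not the query index
    return df1['product_name'].iloc[p_indices]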
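
For the efficiency part, one simple option is to measure the average wall-clock time per recommendation call. The following is only a minimal sketch: `mean_latency` and `dummy_recommend` are hypothetical names introduced for illustration, and the dummy function stands in for `get_recommendation` so the snippet runs on its own.

```python
import time

def mean_latency(recommend, queries, repeats=3):
    """Average wall-clock seconds per call of `recommend` (hypothetical helper)."""
    start = time.perf_counter()
    for _ in range(repeats):
        for query in queries:
            recommend(query)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(queries))

# Stand-in recommender (assumption) so the sketch is self-contained;
# in practice the real recommendation function would be passed instead.
def dummy_recommend(title):
    return [title] * 10

latency = mean_latency(dummy_recommend, ["a", "b", "c"])
print(f"{latency * 1e3:.4f} ms per call")
```

With the real recommender, the call would presumably look like `mean_latency(get_recommendation, list_of_product_names)`, ideally over a sample of product names large enough to average out noise.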

So I need some formula to evaluate a content-based recommender.
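
Without ratings, one common workaround is to score the ranked list itself, for example with precision@k against a proxy for relevance, plus catalog coverage. The sketch below assumes a recommendation counts as relevant when it shares a category with the query product. That category label is an assumption (no such column appears in the code above), and the product IDs and recommendation lists are toy data, not real results.

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended items that are in `relevant`."""
    if k <= 0:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k

def catalog_coverage(all_recommendations, catalog_size):
    """Share of the catalog that appears in at least one recommendation list."""
    distinct = {item for recs in all_recommendations for item in recs}
    return len(distinct) / catalog_size

# Toy data (hypothetical): product -> category, and query -> ranked recommendations
category = {"p1": "shoes", "p2": "shoes", "p3": "shoes", "p4": "bags", "p5": "bags"}
recs = {"p1": ["p2", "p4", "p3"], "p4": ["p5", "p1", "p2"]}

scores = []
for query, recommended in recs.items():
    # Proxy relevance: other products in the same category as the query
    relevant = {p for p, c in category.items() if c == category[query] and p != query}
    scores.append(precision_at_k(recommended, relevant, k=3))

mean_precision = sum(scores) / len(scores)
coverage = catalog_coverage(recs.values(), len(category))
print(mean_precision, coverage)
```

Proxy labels only approximate what users would actually find relevant, so if click or purchase logs are available, they would likely be a better source for the `relevant` sets.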
