Best Solution for Recommendation


I am trying to find an appropriate function to obtain an accurate similarity between two persons based on their favourites.

For instance, persons are connected to tags, and each person's desire for each tag is kept as a numeric value on the relationship to the tag node. I want to recommend similar persons to each person.

I have found two solutions:

  1. Cosine similarity

There is a cos() function in Neo4j, but it accepts only a single numeric input (it is the trigonometric cosine), while for this formula I need to pass whole vectors, such as:

for "a": a=[10, 20, 45] each number indicates person`s desire to each tag. for "b": b=[20, 50, 70]

  2. Pearson correlation

While searching the web and your documentation, I found: http://neo4j.com/docs/stable/cypher-cookbook-similarity-calc.html#cookbook-calculate-similarities-by-complex-calculations

[screenshot of the similarity query from that page]

My question is: what is the logic behind this formula? What is the difference between r and H?

At first glance, it seems to me that H1 and H2 are always equal to one, unless I should consider the rest of the graph.

Thank you in advance for any help.

3 Answers

Accepted Answer

I think the purpose of H1 and H2 is to normalize the results of the times property (the number of times the user ate the food) across food types. You can experiment with this example in this Neo4j console.
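As a minimal sketch of that normalization idea, assuming the same kind of (:Person)-[:ATE {times}]->(:Food) model as the cookbook example, dividing times by H turns raw counts into per-person proportions:

    // H = how many food types 'me' ate; times / H becomes a comparable proportion
    MATCH (me:Person {name: 'me'})-[r:ATE]->(food:Food)
    WITH count(food) AS H, collect({name: food.name, times: r.times}) AS eaten
    UNWIND eaten AS e
    RETURN e.name AS food, e.times / toFloat(H) AS proportion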

Since you mention other similarity measures, you might be interested in this GraphGist: Similarity Measures For Collaborative Filtering With Cypher. It has some simple examples of calculating Pearson correlation and Jaccard similarity using Cypher.
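Along the lines of that GraphGist, Pearson correlation over your tag weights could be computed in a single pass with the standard computational formula. This is only a sketch, assuming the hypothetical (:Person)-[:LIKES {weight}]->(:Tag) model from the question:

    // Pearson correlation between the tag weights of two persons,
    // using the single-pass formula over their shared tags
    MATCH (p1:Person {name: 'a'})-[x:LIKES]->(t:Tag)<-[y:LIKES]-(p2:Person {name: 'b'})
    WITH count(t) AS n,
         sum(x.weight) AS sx, sum(y.weight) AS sy,
         sum(x.weight * y.weight) AS sxy,
         sum(x.weight ^ 2) AS sx2, sum(y.weight ^ 2) AS sy2
    RETURN (n * sxy - sx * sy) /
           (sqrt(n * sx2 - sx ^ 2) * sqrt(n * sy2 - sy ^ 2)) AS pearson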

Answer

This example is a little hard to understand because H1 and H2 are both 1. A better example would show each person eating different types of food, so you'd be able to see the value of H change. If "me" also ate "vegetables", "pizza", and "hotdogs", their H would be 4.

Answer

I can't help you with Neo4j; I just want to point out that cosine similarity and Pearson's correlation coefficient are essentially the same thing. If you decode the different notations, you'll find that the only difference is that Pearson zero-centers the vectors first. So you can define Pearson as follows:

Pearsons(a, b) = Cosine(a - mean(a), b - mean(b))
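You can check that identity directly on the vectors from the question; the following is just a demonstration of the arithmetic in Cypher, not a query you would run against the graph:

    // Pearson(a, b) computed as Cosine(a - mean(a), b - mean(b))
    WITH [10, 20, 45] AS a, [20, 50, 70] AS b
    WITH a, b,
         reduce(s = 0.0, v IN a | s + v) / size(a) AS meanA,
         reduce(s = 0.0, v IN b | s + v) / size(b) AS meanB
    WITH [v IN a | v - meanA] AS ac, [v IN b | v - meanB] AS bc
    RETURN reduce(s = 0.0, i IN range(0, size(ac) - 1) | s + ac[i] * bc[i]) /
           (sqrt(reduce(s = 0.0, v IN ac | s + v * v)) *
            sqrt(reduce(s = 0.0, v IN bc | s + v * v))) AS pearson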