I am trying to solve a question with Cypher in Neo4J where I need to find the BEER_STYLE that has the best combination of smell and looks. The relations are as follows:
(:REVIEW)<-[:REVIEWED]-(:BEER)-[:HAS_STYLE]->(:BEER_STYLE)
One beer sometimes has more than one review and the Reviews nodes have the two properties of interest: look and smell. Sometimes these 2 can be NaN and I have to find the style that has the best combination of those 2 (which I thought it could be the average of averages).
I tried to code a general idea and asked ChatGPT to polish until it works, but I'm not very confident and wanted to ask here. This is the code I got, can someone help/correct me to improve it?
MATCH (s:STYLE)<-[h:HAS_STYLE]-(b:BEERS)-[rev:REVIEWED]->(r:REVIEWS)
WITH s,
CASE WHEN r.look <> 'NaN' THEN toFloat(r.look) ELSE null END AS look,
CASE WHEN r.smell <> 'NaN' THEN toFloat(r.smell) ELSE null END AS smell
WITH s, AVG(look) AS avg_look, AVG(smell) AS avg_smell
WITH s, avg_look, avg_smell, ((avg_look + avg_smell) / 2) AS total
RETURN s AS STYLE, MAX(total) AS Final_Class
ORDER BY Final_Class DESC
LIMIT 2
Preprocess to convert strings to floats
Assuming that the
lookandsmellvalues do not need to be stored as strings, you should first change them to floats to avoid having to do conversions every time you use those properties:Notes:
NULLis the default value forCASE, so it does not need to be specified in anELSE.NULLto a property, the property is removed.Simplified query
After the above one-time change, your use case can be simplified to:
Notes:
Your original query did not need to call the aggregating function
MAXon thetotalvalue for eachs, since there would only be a singletotalpers. This is because the aggregating functionAVGin the following clause would create only a singleavg_lookandavg_smellpers: