I am trying to calculate a relevance score using reviews from a JSON file. Every time I run my code, the only output is "indirect". What am I doing wrong?
My code is below:
import joblib, requests, json, sklearn.metrics, sklearn.model_selection, sklearn.tree, time, math, textblob
import warnings
warnings.filterwarnings("ignore")
response = requests.get("https://appliance_reviews.json")
if response:
    data = json.loads(response.text)
unique = []
word = []
for line in data:
    #print(line)
    review = line["Review"]
    blob = textblob.TextBlob(review)
    for word in blob.words:
        if word.lower() not in unique:
            unique.append(word.lower())
for word in unique:
    a = 0
    b = 0
    c = 0
    d = 0
    for line in data:
        review = line["Review"]
        safety = line["Safety hazard"]
        if word in review.lower() and safety == 1:
            a += 1
        if word in review.lower() and safety == 0:
            b += 1
        if word in review.lower() and safety == 1:
            c += 1
        if word in review.lower() and safety == 0:
            d += 1
    try:
        rel_score = (math.sqrt(a + b + c + d) * ((a + d) - (c * b))) / math.sqrt((a + b) * (c + d))
    except:
        rel_score = 0
    if rel_score >= 4000:
        score.append(word)
print(word)
`word` will just be the last entry in `unique` at the time you print it on the last line of the code given, regardless of its score. You've just exited a `for` loop where `word` was the iterating variable.

Are you sure that you didn't want to print `score`, which seems to be intended to accumulate high-scoring words from `unique`?

Also, I think your scoring is broken. For example, as coded, `a` and `c` are always equal, as are `b` and `d`. And because `word in review.lower()` is a substring test, "carpet" would affect the scores of "car", "pet" and indeed "carp".

As Prune mentions in the comments, your bland choice of variable names makes the purpose of the code difficult to understand.
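To make the points above concrete, here is a minimal sketch of how the counting might look. It rests on several assumptions of mine, not on anything confirmed in the question: that `a`, `b`, `c`, `d` are meant to be the four cells of a word-present/word-absent by hazard/no-hazard table, and that the score was meant to use `a * d - c * b` rather than `(a + d) - (c * b)`. The sample records, the helper names (`tokenize`, `contingency_counts`, `relevance_score`), the regex tokenizer standing in for TextBlob, and the threshold of 2 are all hypothetical and only there so the example runs on its own.

```python
import math
import re

# Hypothetical sample records standing in for the JSON data in the question.
sample_reviews = [
    {"Review": "The carpet caught fire", "Safety hazard": 1},
    {"Review": "My car is fine", "Safety hazard": 0},
    {"Review": "Fire came out of the car", "Safety hazard": 1},
    {"Review": "Lovely carpet, no problems", "Safety hazard": 0},
]


def tokenize(text):
    """Return the set of lowercase words in text (simple stand-in for TextBlob)."""
    return set(re.findall(r"[a-z']+", text.lower()))


def contingency_counts(word, records):
    """Count reviews by (word present?, hazard flagged?) using whole-word matches."""
    present_hazard = present_safe = absent_hazard = absent_safe = 0
    for record in records:
        present = word in tokenize(record["Review"])  # whole word, not substring
        hazard = record["Safety hazard"] == 1
        if present and hazard:
            present_hazard += 1   # "a" in the question
        elif present and not hazard:
            present_safe += 1     # "b"
        elif not present and hazard:
            absent_hazard += 1    # "c" (assumed meaning)
        else:
            absent_safe += 1      # "d" (assumed meaning)
    return present_hazard, present_safe, absent_hazard, absent_safe


def relevance_score(a, b, c, d):
    """Assumed formula: sqrt(N) * (a*d - c*b) / sqrt((a+b) * (c+d))."""
    denominator = math.sqrt((a + b) * (c + d))
    if denominator == 0:
        return 0.0
    return math.sqrt(a + b + c + d) * (a * d - c * b) / denominator


vocabulary = sorted(set().union(*(tokenize(r["Review"]) for r in sample_reviews)))

high_scoring_words = []
for word in vocabulary:
    value = relevance_score(*contingency_counts(word, sample_reviews))
    if value >= 2:  # illustrative threshold, not the 4000 from the question
        high_scoring_words.append(word)

# Print the accumulated list, not the loop variable.
print(high_scoring_words)
```

Two things differ from the posted code: each review lands in exactly one of the four cells, so `a` and `c` can no longer be equal by construction, and "carpet" no longer counts as a match for "car" because the test is against a set of whole words.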