I am building a Rasa chatbot that recommends movies from a dataset of 10,000 movies. The NLU data defines 4 entities, and there is one slot for each of them. The bot asks the user for a movie genre, and the user is expected to answer with something like "action, fantasy, ...". The actions file preprocesses this input. The bot should then ask about the actors the user wants to see in the movie, then the director, and finally the keywords, so that each entity is saved in its own slot. After that, the bot should collect these slots into a list, add that list to the dataset as a new row, use cosine similarity to compare the new row with every row in the dataset, sort the movies by their similarity score, and print the 10 most similar movies. The problem is that the code runs fine in a Jupyter notebook, but when I put it in the Rasa actions file it does not work.
This is the dataset I am using in the code: https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset
I hope someone can help me quickly. Thank you all.
# This is the code that runs smoothly in the Jupyter notebook
# and that I want to run in the Rasa actions file
import pandas as pd
import numpy as np
from ast import literal_eval
metadata = pd.read_csv("/Users/Mohamad Najeeb/Downloads/database_for_movies/movies_metadata.csv")
ratings = pd.read_csv("/Users/Mohamad Najeeb/Downloads/database_for_movies/ratings.csv")
credits = pd.read_csv("/Users/Mohamad Najeeb/Downloads/database_for_movies/credits.csv")
keywords = pd.read_csv("/Users/Mohamad Najeeb/Downloads/database_for_movies/keywords.csv")
metadata = metadata.iloc[0:10000,:]
keywords['id'] = keywords['id'].astype('int')
credits['id'] = credits['id'].astype('int')
metadata['id'] = metadata['id'].astype('int')
metadata = metadata.merge(credits, on='id')
metadata = metadata.merge(keywords, on='id')
metadata.shape
features = ['cast', 'crew', 'keywords', 'genres']
for feature in features:
    metadata[feature] = metadata[feature].apply(literal_eval)

def get_director(x):
    for i in x:
        if i['job'] == 'Director':
            return i['name']
    return np.nan

def get_list(x):
    if isinstance(x, list):
        names = [i['name'] for i in x]
        if len(names) > 3:
            names = names[:3]
        return names
    return []

metadata['director'] = metadata['crew'].apply(get_director)
features = ['cast', 'keywords', 'genres']
for feature in features:
    metadata[feature] = metadata[feature].apply(get_list)

def clean_data(x):
    if isinstance(x, list):
        return [str.lower(i.replace(" ", "")) for i in x]
    else:
        if isinstance(x, str):
            return str.lower(x.replace(" ", ""))
        else:
            return ''

features = ['cast', 'keywords', 'director', 'genres']
for feature in features:
    metadata[feature] = metadata[feature].apply(clean_data)

def create_soup(x):
    return ' '.join(x['keywords']) + ' ' + ' '.join(x['cast']) + ' ' + x['director'] + ' ' + ' '.join(x['genres'])

metadata['soup'] = metadata.apply(create_soup, axis=1)
f= ['title', 'soup', 'cast', 'director', 'keywords', 'genres']
def get_genres():
    genres = input("What Movie Genre are you interested in (if multiple, please separate them with a comma)? [Type 'skip' to skip this question] ")
    genres = " ".join(["".join(n.split()) for n in genres.lower().split(',')])
    return genres

def get_actors():
    actors = input("Who are some actors within the genre that you love (if multiple, please separate them with a comma)? [Type 'skip' to skip this question] ")
    actors = " ".join(["".join(n.split()) for n in actors.lower().split(',')])
    return actors

def get_directors():
    directors = input("Who are some directors within the genre that you love (if multiple, please separate them with a comma)? [Type 'skip' to skip this question] ")
    directors = " ".join(["".join(n.split()) for n in directors.lower().split(',')])
    return directors

def get_keywords():
    keywords = input("What are some of the keywords that describe the movie you want to watch, like elements of the plot, whether or not it is about friendship, etc? (if multiple, please separate them with a comma)? [Type 'skip' to skip this question] ")
    keywords = " ".join(["".join(n.split()) for n in keywords.lower().split(',')])
    return keywords

def get_searchTerms():
    searchTerms = []
    genres = get_genres()
    if genres != 'skip':
        searchTerms.append(genres)
    actors = get_actors()
    if actors != 'skip':
        searchTerms.append(actors)
    directors = get_directors()
    if directors != 'skip':
        searchTerms.append(directors)
    keywords = get_keywords()
    if keywords != 'skip':
        searchTerms.append(keywords)
    return searchTerms
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity
def make_recommendation(metadata=metadata):
    new_row = metadata.iloc[-1,:].copy()
    searchTerms = get_searchTerms()
    new_row.iloc[-1] = " ".join(searchTerms)
    metadata = metadata.append(new_row)
    count = CountVectorizer(stop_words='english')
    count_matrix = count.fit_transform(metadata['soup'])
    cosine_sim2 = cosine_similarity(count_matrix, count_matrix)
    sim_scores = list(enumerate(cosine_sim2[-1,:]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    ranked_titles = []
    for i in range(1, 11):
        indx = sim_scores[i][0]
        ranked_titles.append([metadata['title'].iloc[indx], metadata['imdb_id'].iloc[indx]])
    return ranked_titles
make_recommendation()
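
The ranking step can also be written as a function that receives the search terms as an argument instead of calling `input()`, which is closer to what a Rasa action needs. The following is only a minimal sketch under assumptions: `rank_by_soup` and the three sample soups are made up for illustration, and it builds the count matrix from the soups plus the query rather than appending a row to the DataFrame (`DataFrame.append` was removed in pandas 2.0, so this avoids depending on it).

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def rank_by_soup(soups, titles, search_terms, top_n=10):
    """Return up to top_n titles whose soup is most similar to the search terms."""
    query = " ".join(search_terms)
    vectorizer = CountVectorizer(stop_words="english")
    # Fit on the movie soups plus the query so they share one vocabulary.
    matrix = vectorizer.fit_transform(list(soups) + [query])
    # Compare the last row (the query) against every movie row.
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [titles[i] for i in order[:top_n]]


# Toy data standing in for metadata['soup'] and metadata['title'].
soups = [
    "spy paris liamneeson action",
    "friendship dog family comedy",
    "revenge prison denzelwashington thriller",
]
titles = ["Movie A", "Movie B", "Movie C"]
print(rank_by_soup(soups, titles, ["action", "liamneeson", "spy"], top_n=1))
```

Because the query is never added to the list of movies, the result does not need to skip index 0 the way `range(1, 11)` does in the notebook version.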
# Here is my attempt to run it in the Rasa actions file,
# but it does not work as I expected
from typing import Any, Text, Dict, List
from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher
import pandas as pd
import numpy as np
from ast import literal_eval
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity
metadata = pd.read_csv("data/movies_metadata.csv")
ratings = pd.read_csv("data/ratings.csv")
credits = pd.read_csv("data/credits.csv")
keywords = pd.read_csv("data/keywords.csv")
metadata = metadata.iloc[0:10000,:]
keywords['id'] = keywords['id'].astype('int')
credits['id'] = credits['id'].astype('int')
metadata['id'] = metadata['id'].astype('int')
metadata = metadata.merge(credits, on='id')
metadata = metadata.merge(keywords, on='id')
features = ['cast', 'crew', 'keywords', 'genres']
for feature in features:
    metadata[feature] = metadata[feature].apply(literal_eval)

def get_director(x):
    for i in x:
        if i['job'] == 'Director':
            return i['name']
    return np.nan

def get_list(x):
    if isinstance(x, list):
        names = [i['name'] for i in x]
        if len(names) > 3:
            names = names[:3]
        return names
    return []

metadata['director'] = metadata['crew'].apply(get_director)
features = ['cast', 'keywords', 'genres']
for feature in features:
    metadata[feature] = metadata[feature].apply(get_list)

def clean_data(x):
    if isinstance(x, list):
        return [str.lower(i.replace(" ", "")) for i in x]
    else:
        if isinstance(x, str):
            return str.lower(x.replace(" ", ""))
        else:
            return ''

features = ['cast', 'keywords', 'director', 'genres']
for feature in features:
    metadata[feature] = metadata[feature].apply(clean_data)

def create_soup(x):
    return ' '.join(x['keywords']) + ' ' + ' '.join(x['cast']) + ' ' + x['director'] + ' ' + ' '.join(x['genres'])

metadata['soup'] = metadata.apply(create_soup, axis=1)
class ActionMakeRecommendation(Action):
    def name(self) -> Text:
        return "action_make_recommendation"

    def run(self,
            dispatcher: CollectingDispatcher,
            tracker: Tracker,
            domain: Dict[Text, Any]
            ) -> List[Dict[Text, Any]]:
        searchtTerms= []
        #getting the features
        genres = " ".join(["".join(n.split()) for n in tracker.get_slot('genres').lower().split(',')])
        searchTerms.append(genres)
        actors = " ".join(["".join(n.split()) for n in tracker.get_slot('actors').lower().split(',')])
        searchTerms.append(actors)
        directors = " ".join(["".join(n.split()) for n in tracker.get_slot('directors').lower().split(',')])
        searchTerms.append(directors)
        keywords = " ".join(["".join(n.split()) for n in tracker.get_slot('keywords').lower().split(',')])
        searchTerms.append(keywords)
        new_row = metadata.iloc[-1,:].copy()
        searchTerms = get_searchTerms()
        new_row.iloc[-1] = " ".join(searchTerms)
        metadata = metadata.append(new_row)
        count = CountVectorizer(stop_words='english')
        count_matrix = count.fit_transform(metadata['soup'])
        cosine_sim2 = cosine_similarity(count_matrix, count_matrix)
        sim_scores = list(enumerate(cosine_sim2[-1,:]))
        sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
        ranked_titles = []
        for i in range(1, 11):
            indx = sim_scores[i][0]
            ranked_titles.append([metadata['title'].iloc[indx], metadata['imdb_id'].iloc[indx]])
        dispatcher.utter_message(ranked_title)
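
One difference from the notebook is that `tracker.get_slot(...)` returns `None` for a slot that has not been filled, so calling `.lower()` on it directly would fail, whereas `input()` always returns a string. A minimal sketch of a helper that tolerates this is shown below. The name `slot_to_terms` is hypothetical, and the handling of list values is an assumption in case a slot ends up holding several entity values.

```python
def slot_to_terms(value):
    """Turn a slot value (None, a string, or a list of strings) into one search string."""
    if value is None:
        # Unfilled slot: contribute nothing to the search terms.
        return ""
    if isinstance(value, str):
        value = value.split(",")
    # Lowercase each item and strip all whitespace, mirroring clean_data().
    cleaned = ["".join(str(item).lower().split()) for item in value]
    return " ".join(term for term in cleaned if term)


print(slot_to_terms("Liam Neeson, John Turturro"))   # liamneeson johnturturro
print(slot_to_terms(["Action", "Science Fiction"]))  # action sciencefiction
print(repr(slot_to_terms(None)))                     # ''
```

Inside `run`, each slot could then be passed through this helper, for example `slot_to_terms(tracker.get_slot('genres'))`, and empty results skipped before joining the search terms.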
#this is the domain file
version: "3.1"
intents:
- greet
- goodbye
- affirm
- deny
- mood_great
- mood_unhappy
- bot_challenge
- recommend_movies
- genres:
    use_entities: true
- actors:
    use_entities: true
- directors:
    use_entities: true
- keywords:
    use_entities: true
entities:
- genre
- actor
- director
- keyword
actions:
- action_make_recommendation
slots:
  genres:
    type: text
    mappings:
    - type: from_entity
      entity: genre
  actors:
    type: text
    mappings:
    - type: from_entity
      entity: actor
  directors:
    type: text
    mappings:
    - type: from_entity
      entity: director
  keywords:
    type: text
    mappings:
    - type: from_entity
      entity: keyword
responses:
  utter_greet:
  - text: "Hey! How are you?"
  utter_cheer_up:
  - text: "Here is something to cheer you up:"
    image: "https://i.imgur.com/nGF1K8f.jpg"
  utter_did_that_help:
  - text: "Did that help you?"
  utter_happy:
  - text: "Great, carry on!"
  utter_goodbye:
  - text: "Bye"
  utter_iamabot:
  - text: "I am a bot, powered by Rasa."
  utter_genres_question:
  - text: "What Movie Genre are you interested in??"
  utter_actors_question:
  - text: "Who are some actors within the genre that you love?"
  utter_directors_question:
  - text: "Who are some directors within the genre that you love?"
  utter_keywords_question:
  - text: "what keywords do you want to see in the movie?"
session_config:
  session_expiration_time: 0.0
  carry_over_slots_to_new_session: true
#this is the nlu file
version: "3.1"
nlu:
- intent: greet
  examples: |
    - hey
    - hello
    - hi
    - hello there
    - good morning
    - good evening
    - moin
    - hey there
    - let's go
    - hey dude
    - goodmorning
    - goodevening
    - good afternoon
- intent: goodbye
  examples: |
    - cu
    - good by
    - cee you later
    - good night
    - bye
    - goodbye
    - have a nice day
    - see you around
    - bye bye
    - see you later
- intent: affirm
  examples: |
    - yes
    - y
    - indeed
    - of course
    - that sounds good
    - correct
    - okay
    - ok
- intent: deny
  examples: |
    - no
    - n
    - never
    - I don't think so
    - don't like that
    - no way
    - not really
- intent: mood_great
  examples: |
    - perfect
    - great
    - amazing
    - feeling like a king
    - wonderful
    - I am feeling very good
    - I am great
    - I am amazing
    - I am going to save the world
    - super stoked
    - extremely good
    - so so perfect
    - so good
    - so perfect
- intent: mood_unhappy
  examples: |
    - my day was horrible
    - I am sad
    - I don't feel very well
    - I am disappointed
    - super sad
    - I'm so sad
    - sad
    - very sad
    - unhappy
    - not good
    - not very good
    - extremly sad
    - so saad
    - so sad
- intent: bot_challenge
  examples: |
    - are you a bot?
    - are you a human?
    - am I talking to a bot?
    - am I talking to a human?
- intent: recommend_movies
  examples: |
    - recommend me a movie
    - i want to watch a movie
    - can you recommend me a movie
- intent: genres
  examples: |
    - [action](genre)
    - [thriller](genre) , [fantacy](genre)
    - [comedy](genre) , [scince fiction](genre) , [family](genre)
    - [adventure](genre) , [romance](genre) , [animation](genre) , [mystery](genre)
- lookup: genre
  examples: |
    - drama
    - crime
    - documentary
    - history
    - war
- intent: actors
  examples: |
    - [liamneeson](actor) , [johnturturro](actor) , [liamneeson](actor)
    - [jessicalange](actor)
    - [johnhurt](actor) [denzelwashington](actor)
    - [rheaperlman](actor)
- lookup: actor
  examples: |
    - john travolta
    - gene hackman
    - rene russo
    - sylvester stallone
    - mary steenburgen
    - george dzundza
    - christine cavanaugh
    - danny mann
    - susan sarandon
    - sean penn
- intent: directors
  examples: |
    - [kevin smith](director) , [frank marshall](director) , [carl franklin](director)
    - [david anspaugh](director) [iain softley](director)
    - [tom dicillo](director)
- lookup: director
  examples: |
    - carl franklin
    - john mctiernan
    - iain softley
    - christopher ashley
    - larry clark
    - tom dicillo
    - bryan spicer
- intent: keywords
  examples: |
    - [psycho path](keyword) [teenage girl](keywords) [kitchen](keyword) [world war ii](keywords)
    - [pilot](keyword) , [kingdom](keyword) , [chef](keyword)
    - [paris](keyword) [spy](keyword)
    - [revenge](keyword)
- lookup: keyword
  examples: |
    - stand-upcomedy
    - new york
    - sport
    - cops
    - gangster
    - detective
    - undercover
    - security camera
    - gold
    - birthday
    - mountain
    - airplane
    - hotel
    - fbi
    - law
    - bestfriend
    - village
    - magic
    - hostage
    - suicide
    - politics
    - prison
    - media
    - paris
    - cat
    - underdog
    - bride
    - taxi
    - engineer
    - hell
    - gentleman
I expected that code which runs in the Jupyter notebook would also run in the Rasa actions file.