I have two columns in a Pandas DataFrame, A and B, each of which contains strings of terms. My objective is to find, for each entry in column A, the entry in column B that is most similar to it. I am already using TF-IDF to do this, but sometimes there are synonyms which do not obviously match, e.g. "money" and "currency".
How can I find matches which also include synonyms?
I'm not sure how TF-IDF would be of use here if you are working with individual word pairs, since it only gives credit for terms that match exactly.
Anyway, there are two obvious solutions to this.
The first option is to use a traditional knowledge base. I would recommend WordNet for this use case, as it's widely considered a standard lexical resource.
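To make the knowledge-base idea concrete, here is a minimal sketch of synonym expansion before matching. Everything in it is an assumption for illustration: `TOY_SYNONYMS` is a hand-made stand-in for a real lookup (with NLTK you could build such a mapping from `wordnet.synsets(word)` and each synset's `lemma_names()`), and Jaccard overlap is swapped in for TF-IDF just to keep the example dependency-free.

```python
# Hypothetical synonym table standing in for a WordNet lookup.
TOY_SYNONYMS = {
    "money": {"currency", "cash"},
    "currency": {"money"},
    "cash": {"money"},
}

def expand(terms, synonyms):
    """Return the terms plus every synonym listed for them."""
    expanded = set()
    for term in terms:
        expanded.add(term)
        expanded.update(synonyms.get(term, ()))
    return expanded

def jaccard(a, b):
    """Overlap of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def score(text_a, text_b, synonyms):
    """Similarity of two term strings after synonym expansion."""
    a = expand(text_a.lower().split(), synonyms)
    b = expand(text_b.lower().split(), synonyms)
    return jaccard(a, b)

def best_match(query, candidates, synonyms):
    """Pick the candidate whose expanded terms overlap most with the query."""
    return max(candidates, key=lambda c: score(query, c, synonyms))
```

With an empty synonym table, "money transfer" and "currency exchange" share no terms and score zero; with the toy table they overlap on "money" and "currency". The same expanded term strings could instead be fed back into your existing TF-IDF pipeline.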
The second option would be to use a word embedding model such as Word2Vec (or a variant like GloVe). I would say this is the easiest solution if you use a model that is already trained, like the Google News one. Look into Gensim's implementation to load the model and compute similarities.
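As a rough sketch of the embedding approach, the code below averages the word vectors of each string and compares the averages by cosine similarity. The vectors in `TOY_VECTORS` are made up for illustration; in practice they would come from a pretrained model, for example one loaded with Gensim's `KeyedVectors.load_word2vec_format(path, binary=True)`, whose `n_similarity` method performs a similar mean-vector comparison. The function names here are hypothetical.

```python
import math

# Made-up 3-dimensional vectors; real Word2Vec vectors have hundreds of dimensions.
TOY_VECTORS = {
    "money": [0.9, 0.1, 0.0],
    "currency": [0.8, 0.2, 0.1],
    "exchange": [0.6, 0.3, 0.2],
    "transfer": [0.5, 0.4, 0.1],
    "weather": [0.0, 0.1, 0.9],
    "report": [0.1, 0.5, 0.6],
}

def mean_vector(text, vectors):
    """Average the vectors of the known words in text, or None if none are known."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return None
    return [sum(column) / len(known) for column in zip(*known)]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar_entry(query, candidates, vectors):
    """Return the candidate whose mean vector is closest to the query's."""
    q = mean_vector(query, vectors)
    if q is None:
        return None
    scored = []
    for candidate in candidates:
        v = mean_vector(candidate, vectors)
        if v is not None:
            scored.append((cosine(q, v), candidate))
    return max(scored, key=lambda pair: pair[0])[1] if scored else None
```

To apply this to the two columns, you could call something like `df["A"].apply(lambda a: most_similar_entry(a, df["B"].tolist(), vectors))`, assuming `df` is your DataFrame and `vectors` supports `in` and indexing by word.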