What is the best way to model document similarity between different string parameters?

I have a problem of predicting solutions to problems faced by users.

The problem setting is like this:

We have a database of problems and solutions. For each problem we have three parameters to represent it.

  1. JobName (String - Name of the Job)
  2. JobId (Integer - Id of the Job)
  3. RootCause (String - Cause of that problem).

Each problem has a corresponding solution, added by the user who faced that problem. The solution has a single parameter:

  1. Solution (String - Solution entered by user for that problem)

We want to use this database to predict solutions for new problems (a problem is a set of JobName, JobId, RootCause).

Our initial idea is to identify the problems (sets of JobName, JobId, RootCause) most similar to the query problem and return the solution of the closest one. But in this setting we have no obvious way to measure the training error, as we would in a house price prediction problem.

In general, how do we approach this problem, and what kind of machine learning models do we need to use?

There is 1 answer below

It seems you want to build a kind of recommendation system: given the cause of a problem, suggest a list of recommended solutions. One possible approach is to use word2vec to vectorise RootCause and then find similar problems using vector similarity.
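As a rough illustration of the retrieval step, here is a minimal sketch in plain Python. It assumes word vectors are already available: `TOY_VECTORS` is a hand-made, hypothetical stand-in for a trained word2vec model, and `PROBLEMS`, `embed`, `cosine`, and `most_similar_solution` are illustrative names, not part of any library. Each RootCause is embedded as the average of its word vectors, and the stored problem with the highest cosine similarity to the query supplies the suggested solution.

```python
import math

# Hypothetical toy word vectors standing in for a trained word2vec model.
# The numbers are made up purely for illustration.
TOY_VECTORS = {
    "disk": [1.0, 0.0, 0.0],
    "full": [0.9, 0.1, 0.0],
    "space": [0.8, 0.2, 0.0],
    "network": [0.0, 1.0, 0.0],
    "timeout": [0.1, 0.9, 0.0],
    "connection": [0.0, 0.8, 0.2],
}

# Hypothetical database rows using the fields from the question.
PROBLEMS = [
    {"JobName": "nightly_backup", "JobId": 1,
     "RootCause": "disk full", "Solution": "Free up disk space"},
    {"JobName": "sync_orders", "JobId": 2,
     "RootCause": "network timeout",
     "Solution": "Retry after checking the connection"},
]


def embed(text, vectors):
    """Average the vectors of the known words; None if no word is known."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar_solution(query_root_cause, problems, vectors):
    """Return (solution, score) of the stored problem closest to the query."""
    query_vec = embed(query_root_cause, vectors)
    best_solution, best_score = None, 0.0
    if query_vec is None:
        return best_solution, best_score
    for problem in problems:
        vec = embed(problem["RootCause"], vectors)
        if vec is None:
            continue  # skip rows with no known words
        score = cosine(query_vec, vec)
        if best_solution is None or score > best_score:
            best_solution, best_score = problem["Solution"], score
    return best_solution, best_score


solution, score = most_similar_solution("connection timeout", PROBLEMS, TOY_VECTORS)
```

With real data, the toy dictionary would be replaced by vectors from a model trained on your RootCause texts, and returning the top few matches rather than a single one would give the "list of recommended solutions" behaviour.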