I am building a recommender system which does Multi Criteria based ranking of car alternatives. I just need to do ranking of the alternatives in a meaningful way. I have ways of asking user questions via a form.
Each car will be judged on the following criteria: price, size, electric/non-electric, distance, etc. As you can see, it's a mix of various data types, including ordinal, cardinal (count) and quantitative data.
My questions are as follows:
1. Which technique should I use to combine all the criteria into a single score that I can rank by? I looked at the normalized weighted sum model, but I have a hard time assigning weights to ordinal (ranked) data. I tried using the SMARTER approach for assigning numerical weights to ordinal data, but I'm not sure if it is appropriate. Please help!
2. Once I have figured out the best ranking method, what if the best-ranked alternative isn't good enough on an absolute scale? How do I check for that, so that I can enlarge the alternative set further?
3. Since the criteria mentioned above (price, etc.) are all in different units, is there a good method to normalize mixed data types belonging to different scales? Does it even make sense to do so, given that the data belongs to many different types?
Any help on these problems will be greatly appreciated! Thank you!
Good question.
I would recommend using AHP to assign the weight of each criterion and TOPSIS to score and rank the alternatives.
Most MCDM (Multi-Criteria Decision Making) algorithms do, indeed, include normalisation methods.
Let's analyse your case:
Your criteria are: price, size, electric/non-electric and distance.
Price, size and distance can be represented as integer/float numbers, while for the qualitative criterion you have some options...
You should use boolean logic if your decision space consists of cars that are either fully electric or fully non-electric, with nothing in between. Use fuzzy logic if there are different degrees to which a car is electric (for instance, if you have a hybrid car). Use intuitionistic fuzzy logic if you also want to consider the degree to which a certain car is NOT electric. Use neutrosophic logic if you have incomplete information, say, some cars whose type you cannot tell.
To simplify, and since you only have two categories, I would stick to boolean logic in your particular case. I'm assuming the electric category is preferred over non-electric.
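As a rough illustration of the encoding options above, here is a minimal Python sketch. The labels and the membership degrees (for example 0.5 for a hybrid) are assumptions made up for the example, not values from the question.

```python
# Hypothetical membership degrees for "how electric is this car";
# the 0.5 for hybrids is an illustrative assumption.
ELECTRIC_DEGREE = {"petrol": 0.0, "hybrid": 0.5, "electric": 1.0}


def encode_boolean(is_electric):
    """Boolean logic: 1.0 for fully electric, 0.0 for fully non-electric."""
    return 1.0 if is_electric else 0.0


def encode_fuzzy(label):
    """Fuzzy logic: a degree in [0, 1] looked up from the table above."""
    return ELECTRIC_DEGREE[label]


# Boolean encoding for a decision space with only two categories.
boolean_column = [encode_boolean(flag) for flag in (True, False)]

# Fuzzy encoding once hybrids are part of the decision space.
fuzzy_column = [encode_fuzzy(label) for label in ("electric", "petrol", "hybrid")]

print(boolean_column)
print(fuzzy_column)
```

Either column can then go into the decision matrix as an ordinary numeric benefit criterion.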
Let's go through the TOPSIS algorithm [4]...
From your example, the decision matrix would look something like this:
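Now you have to compute the normalised decision matrix. To do that, you first have to compute the performance value of each criterion.
The formula, writing $x_{ij}$ for the value of car $i$ on criterion $j$ and $m$ for the number of cars, is:
$$p_j = \sqrt{\sum_{i=1}^{m} x_{ij}^2}$$
That means that, for each criterion, you square each car's value, sum the squares and then take the square root of the sum.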
So...
Once you have the performance value of each criterion, you can normalise: simply divide each value in the decision matrix by the performance value of its criterion.
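The normalisation step can be sketched in plain Python as follows. The decision matrix is hypothetical (the numbers are invented for illustration), with rows as cars and columns as price, size, electric and distance.

```python
import math

# Hypothetical decision matrix: one row per car,
# columns = price, size, electric (1/0), distance.
matrix = [
    [20000.0, 4.0, 1.0, 300.0],
    [15000.0, 5.0, 0.0, 600.0],
    [30000.0, 5.0, 1.0, 450.0],
]


def vector_normalise(matrix):
    """Divide every value by the square root of its column's sum of squares."""
    n_criteria = len(matrix[0])
    # Performance value per criterion. A column of all zeros would give 0
    # here and break the division below, so such a criterion should be dropped.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)
    ]
    return [[row[j] / norms[j] for j in range(n_criteria)] for row in matrix]


normalised = vector_normalise(matrix)
for row in normalised:
    print([round(value, 3) for value in row])
```

After this step every column has unit length, so criteria measured in different units become comparable.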
Now you have to compute the weighted normalised decision matrix by multiplying each normalised value by the weight of its criterion. (I'm assuming you have already assigned weights; if you haven't, you can check the AHP algorithm [5].)
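Here is a minimal sketch of the weighting step. For the weights it uses the row geometric mean method, which is a common approximation of the AHP priority vector, not the full eigenvector computation. Both the pairwise comparison matrix and the normalised matrix are hypothetical values chosen for illustration.

```python
import math

criteria = ["price", "size", "electric", "distance"]

# Hypothetical AHP pairwise comparison matrix: pairwise[i][j] says how much
# more important criterion i is than criterion j (reciprocal by construction).
pairwise = [
    [1.0, 2.0, 4.0, 4.0 / 3.0],
    [0.5, 1.0, 2.0, 2.0 / 3.0],
    [0.25, 0.5, 1.0, 1.0 / 3.0],
    [0.75, 1.5, 3.0, 1.0],
]


def ahp_weights(pairwise):
    """Approximate AHP weights: normalised geometric mean of each row."""
    geo_means = [math.prod(row) ** (1.0 / len(row)) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


weights = ahp_weights(pairwise)

# Hypothetical, roughly normalised decision matrix (rows = cars).
normalised = [
    [0.5, 0.4, 0.7, 0.3],
    [0.4, 0.6, 0.0, 0.7],
    [0.8, 0.7, 0.7, 0.6],
]

# Weighted normalised matrix: each column is scaled by its criterion weight.
weighted = [[w * x for w, x in zip(weights, row)] for row in normalised]

print(dict(zip(criteria, (round(w, 3) for w in weights))))
```

If you already have weights from another method, you can skip `ahp_weights` and plug them in directly, as long as they sum to 1.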
TOPSIS algorithm is based on the idea that the most desirable alternative is the one that has the closest geometric distance to the ideal solution and the largest geometric distance to the anti-ideal solution.
We need to keep in mind that some criteria are benefits and others are costs. So, for instance, we might want to maximise size and type but minimise price and distance.
Based on that, let's compute the ideal and anti-ideal solution:
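A minimal sketch of this step, assuming size and type are benefits while price and distance are costs. The first row reuses the car1 values from the distance example in this answer; the other two rows are invented so that there is something to compare against.

```python
# Weighted normalised matrix, columns = price, size, electric, distance.
# Row 1 mirrors the car1 values used in the distance example;
# rows 2 and 3 are hypothetical.
weighted = [
    [0.07, 0.04, 0.30, 0.20],
    [0.03, 0.09, 0.00, 0.07],
    [0.05, 0.06, 0.30, 0.12],
]

# True = benefit (higher is better), False = cost (lower is better).
is_benefit = [False, True, True, False]


def ideal_points(weighted, is_benefit):
    """Return (ideal, anti_ideal): best and worst value of every criterion."""
    columns = list(zip(*weighted))
    ideal = [max(c) if b else min(c) for c, b in zip(columns, is_benefit)]
    anti_ideal = [min(c) if b else max(c) for c, b in zip(columns, is_benefit)]
    return ideal, anti_ideal


ideal, anti_ideal = ideal_points(weighted, is_benefit)
print("ideal:     ", ideal)
print("anti-ideal:", anti_ideal)
```

Note that the ideal solution is usually not one of the real cars; it is a virtual alternative built from the best value of each criterion.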
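Afterwards, for each car you have to calculate the Euclidean distance to the ideal and anti-ideal solutions.
The formula, writing $v_{ij}$ for the weighted normalised value of car $i$ on criterion $j$ and $v_j^{+}$, $v_j^{-}$ for the ideal and anti-ideal values of criterion $j$, is:
$$d_i^{+} = \sqrt{\sum_{j} \left(v_{ij} - v_j^{+}\right)^2}, \qquad d_i^{-} = \sqrt{\sum_{j} \left(v_{ij} - v_j^{-}\right)^2}$$
For instance, the distance between car1 and the ideal solution would be: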
((0.07-0.03)**2 + (0.04-0.09)**2 + (0.3-0.3)**2 + (0.20-0.07)**2) ** 0.5
In Python you can do that with the SciPy library [6].
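As a dependency-free alternative, the same distance can be sketched with `math.dist` from the standard library (Python 3.8+); `scipy.spatial.distance.euclidean` should give the same result if you prefer SciPy. The numbers are the car1 and ideal-solution values from the expression above.

```python
import math

# Weighted normalised values of car1 and the ideal solution,
# taken from the worked expression above.
car1 = [0.07, 0.04, 0.30, 0.20]
ideal = [0.03, 0.09, 0.30, 0.07]

# Euclidean distance: square root of the sum of squared differences.
distance_to_ideal = math.dist(car1, ideal)

print(round(distance_to_ideal, 4))
```

The distance to the anti-ideal solution is computed the same way, with the anti-ideal vector in place of `ideal`.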
Once you have calculated the distances to the ideal and anti-ideal solutions for each car alternative, you have to compute its performance score, which is basically a ratio.
So, for each car alternative: distance to i- / (distance to i- + distance to i+), where i+ is the ideal solution and i- is the anti-ideal solution. The score lies between 0 and 1, and the closer it is to 1, the better the alternative.
Once you have the performance score of each car alternative, sort the alternatives in descending order of score to get their respective ranks.
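Putting the last steps together, here is a minimal end-to-end sketch of the scoring and ranking. The weighted matrix is hypothetical apart from the car1 row, which reuses the values from the distance example, and the benefit/cost split is the one assumed earlier in this answer.

```python
import math

# Weighted normalised matrix, columns = price, size, electric, distance.
# car1 reuses the values from the distance example; car2 and car3 are invented.
weighted = {
    "car1": [0.07, 0.04, 0.30, 0.20],
    "car2": [0.03, 0.09, 0.00, 0.07],
    "car3": [0.05, 0.06, 0.30, 0.12],
}
is_benefit = [False, True, True, False]  # size and type up, price and distance down

columns = list(zip(*weighted.values()))
ideal = [max(c) if b else min(c) for c, b in zip(columns, is_benefit)]
anti_ideal = [min(c) if b else max(c) for c, b in zip(columns, is_benefit)]


def closeness(values):
    """Performance score: d- / (d- + d+), between 0 (worst) and 1 (best)."""
    d_plus = math.dist(values, ideal)
    d_minus = math.dist(values, anti_ideal)
    # The sum is zero only if all alternatives are identical.
    return d_minus / (d_minus + d_plus)


scores = {name: closeness(values) for name, values in weighted.items()}

# Sort by descending score to obtain the final ranking.
ranking = sorted(scores, key=scores.get, reverse=True)

for rank, name in enumerate(ranking, start=1):
    print(rank, name, round(scores[name], 3))
```

Because the score is bounded between 0 and 1, it also gives a rough sense of how close the top alternative is to the ideal one, although it is always relative to the set of cars being compared.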
Resources:
REFERENCES: