I'm using Mahout with the Pearson correlation algorithm to find similar users based on their preferences for several items. The problem I'm running into is that Mahout and/or Pearson ignores users who select the same preference value for every item. Does anyone know whether there is a way to configure Mahout to NOT ignore users who select the same preference value for every item?
Apache Mahout + Pearson Correlation Ignores Users With Same Preference For Every Item
Asked by SGT Grumpy Pants
It is not a question of configuration. The Pearson correlation is undefined in this case, so no similarity can be computed between such a user and any other user with this metric.
Essentially, Pearson is the ratio of the covariance of the two preference series to the product of their standard deviations. When every value in one or both series is the same, that series' standard deviation is 0, as is the covariance, so the correlation is 0/0.
(This and a few other Pearson gotchas are covered in Chapter 4 of Mahout in Action; I'm the author of that part of the book and of the code.)
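To see the 0/0 case concretely, here is a minimal plain-Java sketch of the Pearson formula described above. It is not Mahout's implementation; the class name `PearsonSketch`, the method `pearson`, and the sample ratings are made up for illustration. It only shows that a user whose ratings are all the same yields NaN, which is why no similarity is available for that user.

```java
// Illustrative sketch only; this is NOT Mahout's code.
// PearsonSketch and pearson(...) are hypothetical names used for this example.
final class PearsonSketch {

    /** Returns the Pearson correlation of x and y, or NaN when it is undefined (0/0). */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("series must be non-empty and of equal length");
        }
        int n = x.length;

        // Mean of each preference series.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of products of deviations: covariance and variances up to a common factor.
        double sumXY = 0.0;
        double sumXX = 0.0;
        double sumYY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sumXY += dx * dy;
            sumXX += dx * dx;
            sumYY += dy * dy;
        }

        // If one series is constant, its deviations are all 0 (exactly so for the
        // integer-valued ratings below), so numerator and denominator are both 0.
        // In Java, 0.0 / 0.0 evaluates to NaN.
        return sumXY / Math.sqrt(sumXX * sumYY);
    }

    public static void main(String[] args) {
        double[] constantUser = {3, 3, 3, 3}; // same preference for every item
        double[] variedUser = {1, 2, 4, 5};
        double[] shiftedUser = {2, 3, 5, 6};  // variedUser plus 1 on every item

        System.out.println(pearson(constantUser, variedUser)); // prints NaN
        System.out.println(pearson(variedUser, shiftedUser));  // prints 1.0
    }
}
```

The second call is there for contrast: two users whose ratings differ only by a constant offset correlate perfectly, while the constant-rating user has no defined correlation with anyone.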