K-means clustering in sklearn, number of clusters is known in advance (it is 2). There are multiple features. Feature values are initially without any weight assigned, i.e. they are treated equally weighted. However, task is to assign custom weights to each feature, in order to get best possible clusters separation. How to determine optimum sample weights (sample_weight) for each feature, in order to get best possible separation of the two clusters? If this is not possible for k-means, or for sklearn, I am interested in any alternative clustering solution, the point is that I need method of automatic determination of appropriate weights for multivariate features, in order to maximize clusters separation.
Sklearn k-means clustering (weighted), determining optimum sample weight for each feature?
832 Views Asked by zlatko At
2
There are 2 best solutions below
0

In meantime, I have implemented following: clustering by each component separately, then calculating silhouette score, calinski harabasz score, dunn score and inverse davies bouldin score for each component (feature) separately. Then scaling those scores to same magnitude, then PCA them to 1 feature. This produced weights for each component. It seems this approach produces reasonable results. I suppose better approach would be full factorial experiment (DOE), but it seems that this simple approach produces satisfactory results as well.
Related Questions in MACHINE-LEARNING
- When dealing with databases, does adding a different table when we can use a simple hash a good thing?
- How to not load all database records in my TListbox in Firemonkey Delphi XE8
- microsoft odbc driver manager data source name not found and no default driver specified
- Cloud Connection with Java Window application
- Automatic background scan if user edit column?
- Jmeter JDBC Connection Configuration Parametrization of Database URL for accessing SQL Database
- How to grant privileges to current user
- MySQL: Insert a new row at a specific primary key, or alternately, bump all subsequent rows down?
- Inserting and returning autoidentity in SQLite3
- Architecture: Multiple Mongo databases+connections vs multiple collections with Express
Related Questions in SCIKIT-LEARN
- When dealing with databases, does adding a different table when we can use a simple hash a good thing?
- How to not load all database records in my TListbox in Firemonkey Delphi XE8
- microsoft odbc driver manager data source name not found and no default driver specified
- Cloud Connection with Java Window application
- Automatic background scan if user edit column?
- Jmeter JDBC Connection Configuration Parametrization of Database URL for accessing SQL Database
- How to grant privileges to current user
- MySQL: Insert a new row at a specific primary key, or alternately, bump all subsequent rows down?
- Inserting and returning autoidentity in SQLite3
- Architecture: Multiple Mongo databases+connections vs multiple collections with Express
Related Questions in CLUSTER-ANALYSIS
- When dealing with databases, does adding a different table when we can use a simple hash a good thing?
- How to not load all database records in my TListbox in Firemonkey Delphi XE8
- microsoft odbc driver manager data source name not found and no default driver specified
- Cloud Connection with Java Window application
- Automatic background scan if user edit column?
- Jmeter JDBC Connection Configuration Parametrization of Database URL for accessing SQL Database
- How to grant privileges to current user
- MySQL: Insert a new row at a specific primary key, or alternately, bump all subsequent rows down?
- Inserting and returning autoidentity in SQLite3
- Architecture: Multiple Mongo databases+connections vs multiple collections with Express
Related Questions in UNSUPERVISED-LEARNING
- When dealing with databases, does adding a different table when we can use a simple hash a good thing?
- How to not load all database records in my TListbox in Firemonkey Delphi XE8
- microsoft odbc driver manager data source name not found and no default driver specified
- Cloud Connection with Java Window application
- Automatic background scan if user edit column?
- Jmeter JDBC Connection Configuration Parametrization of Database URL for accessing SQL Database
- How to grant privileges to current user
- MySQL: Insert a new row at a specific primary key, or alternately, bump all subsequent rows down?
- Inserting and returning autoidentity in SQLite3
- Architecture: Multiple Mongo databases+connections vs multiple collections with Express
Related Questions in FEATURE-CLUSTERING
- When dealing with databases, does adding a different table when we can use a simple hash a good thing?
- How to not load all database records in my TListbox in Firemonkey Delphi XE8
- microsoft odbc driver manager data source name not found and no default driver specified
- Cloud Connection with Java Window application
- Automatic background scan if user edit column?
- Jmeter JDBC Connection Configuration Parametrization of Database URL for accessing SQL Database
- How to grant privileges to current user
- MySQL: Insert a new row at a specific primary key, or alternately, bump all subsequent rows down?
- Inserting and returning autoidentity in SQLite3
- Architecture: Multiple Mongo databases+connections vs multiple collections with Express
Trending Questions
- UIImageView Frame Doesn't Reflect Constraints
- Is it possible to use adb commands to click on a view by finding its ID?
- How to create a new web character symbol recognizable by html/javascript?
- Why isn't my CSS3 animation smooth in Google Chrome (but very smooth on other browsers)?
- Heap Gives Page Fault
- Connect ffmpeg to Visual Studio 2008
- Both Object- and ValueAnimator jumps when Duration is set above API LvL 24
- How to avoid default initialization of objects in std::vector?
- second argument of the command line arguments in a format other than char** argv or char* argv[]
- How to improve efficiency of algorithm which generates next lexicographic permutation?
- Navigating to the another actvity app getting crash in android
- How to read the particular message format in android and store in sqlite database?
- Resetting inventory status after order is cancelled
- Efficiently compute powers of X in SSE/AVX
- Insert into an external database using ajax and php : POST 500 (Internal Server Error)
Popular # Hahtags
Popular Questions
- How do I undo the most recent local commits in Git?
- How can I remove a specific item from an array in JavaScript?
- How do I delete a Git branch locally and remotely?
- Find all files containing a specific text (string) on Linux?
- How do I revert a Git repository to a previous commit?
- How do I create an HTML button that acts like a link?
- How do I check out a remote Git branch?
- How do I force "git pull" to overwrite local files?
- How do I list all files of a directory?
- How to check whether a string contains a substring in JavaScript?
- How do I redirect to another webpage?
- How can I iterate over rows in a Pandas DataFrame?
- How do I convert a String to an int in Java?
- Does Python have a string 'contains' substring method?
- How do I check if a string contains a specific word?
What I understand from sklearn docs, sample_weight is used to give weights for each observations (samples), not features.
If you want to give weight to your features, you can refer to this post: How can I change feature's weight for K-Means clustering?