I want to cluster my documents using BERT embeddings from Sentence Transformers, specifically bert-base-nli-mean-tokens, and then cluster those embeddings with K-Means. My problem is this: can I run K-Means clustering with cosine distance?
Is there a solution, with code, for this problem?
Yes, you can use K-Means clustering with BERT embeddings obtained from a Sentence Transformers model such as bert-base-nli-mean-tokens. However, the standard K-Means implementation in libraries like scikit-learn uses Euclidean distance, not cosine distance. To cluster embeddings using cosine distance, you have a few options.
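One such option is spherical K-Means, sketched below as an illustration rather than a definitive implementation. The embeddings are L2-normalized, each point is assigned to the centroid with the highest cosine similarity, and each centroid is re-normalized after every update. This works because for unit vectors the squared Euclidean distance equals 2 - 2·cos(a, b), so the two measures rank neighbors identically. The helper names (`l2_normalize`, `spherical_kmeans`), the farthest-first initialization, and the synthetic stand-in embeddings are assumptions made for this example. In practice the embeddings would come from something like `SentenceTransformer("bert-base-nli-mean-tokens").encode(docs)`.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale every row to unit length (eps guards against zero vectors)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def spherical_kmeans(embeddings, n_clusters, n_iter=100, seed=0):
    """Hypothetical helper: K-Means on the unit sphere using cosine similarity."""
    rng = np.random.default_rng(seed)
    x = l2_normalize(np.asarray(embeddings, dtype=float))

    # Farthest-first initialization (an illustrative choice, not k-means++):
    # start from a random point, then repeatedly add the point that is
    # least similar to all centroids chosen so far.
    chosen = [x[rng.integers(len(x))]]
    for _ in range(1, n_clusters):
        max_sim = np.max(x @ np.array(chosen).T, axis=1)
        chosen.append(x[np.argmin(max_sim)])
    centroids = np.array(chosen)

    labels = np.full(len(x), -1)
    for _ in range(n_iter):
        # For unit vectors the dot product is the cosine similarity.
        new_labels = np.argmax(x @ centroids.T, axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for k in range(n_clusters):
            members = x[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[k] = members.mean(axis=0)
        centroids = l2_normalize(centroids)  # project back onto the sphere
    return labels, centroids


# Synthetic stand-in for sentence embeddings (assumption: real vectors would
# come from the Sentence Transformers model instead).
rng = np.random.default_rng(42)
group_a = np.array([1.0, 0.0, 0.0]) + 0.05 * rng.normal(size=(20, 3))
group_b = np.array([0.0, 1.0, 0.0]) + 0.05 * rng.normal(size=(20, 3))
embeddings = np.vstack([group_a, group_b])

labels, centroids = spherical_kmeans(embeddings, n_clusters=2)
print(labels)
```

If you would rather stay with scikit-learn, a closely related approach is to L2-normalize the embeddings (for example with `sklearn.preprocessing.normalize`) and pass them to the regular `KMeans`. The assignments then follow cosine similarity, although the centroids themselves are not re-projected onto the unit sphere.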