I don't quite understand the difference between a label and a phrase in Carrot2, and the documentation at http://doc.carrot2.org/ doesn't seem to draw a clear distinction. I tried printing both out, but they appear to be the same (I am using k-means clustering). Can somebody clear this up for me?
I was also wondering about the score. After clustering, my clusters don't have any scores attached to them. Am I supposed to compute these myself?
Regarding the similarity, is it possible to use Carrot2 to determine how similar a query is to the clusters?
The exact meaning of label, phrase and score varies across the algorithms. In general, a label can consist of one or more phrases. Some algorithms always produce one-phrase labels, others may output labels consisting of multiple phrases. For the k-means clustering, you can set the number of words per label using the labelCount attribute.
The cluster score is also algorithm-specific: it is the clustering algorithm's own estimate of the quality of the cluster. The current implementation of k-means indeed does not produce any score. If you'd like to compute one of the common cluster quality metrics, the easiest way would probably be to extend the code of the algorithm directly, as that would give you access to the vector space model you'd need to calculate centroids and distances.
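To make the "centroids and distances" idea concrete, here is a minimal sketch of one common cohesion measure: the mean Euclidean distance from each document vector to its cluster centroid (lower means tighter). This is plain Python, not Carrot2 code; the function names and the sample term-weight vectors are made up for illustration, and in practice the vectors would come from the algorithm's vector space model.

```python
import math

def centroid(vectors):
    """Component-wise mean of equal-length term-weight vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_distance_to_centroid(vectors):
    """Average distance of cluster members to their centroid (lower = tighter)."""
    c = centroid(vectors)
    return sum(euclidean(v, c) for v in vectors) / len(vectors)

# Hypothetical document vectors for two clusters (illustrative values only).
tight = [[1.0, 0.0], [1.0, 0.2], [0.9, 0.1]]
loose = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

print("tight cluster:", mean_distance_to_centroid(tight))
print("loose cluster:", mean_distance_to_centroid(loose))
```

You could attach such a value to each cluster as its score, possibly inverted or normalized so that higher means better.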
When it comes to computing the similarity between a query and a cluster, there are again many possibilities. For k-means clusters you could, for instance, assume the vector space model and compute the distance between the vector corresponding to the query and the cluster's centroid.
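As a rough illustration of the query-to-centroid comparison, the sketch below builds simple term-frequency vectors and ranks clusters by cosine similarity to the query (a cosine distance would just be 1 minus that value). It is plain Python rather than Carrot2's API; the whitespace tokenizer, the raw term-frequency weighting, the function names, and the sample documents are all assumptions made for the example.

```python
import math
from collections import Counter

def term_vector(text):
    """Bag-of-words term frequencies (naive whitespace tokenizer)."""
    return Counter(text.lower().split())

def centroid(vectors):
    """Average the sparse term vectors of a cluster's documents."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {term: weight / len(vectors) for term, weight in total.items()}

def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_clusters(query, clusters):
    """Return (cluster name, similarity) pairs, most similar first."""
    q = term_vector(query)
    scores = [
        (name, cosine(q, centroid([term_vector(d) for d in docs])))
        for name, docs in clusters.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical clusters and their documents (illustrative only).
clusters = {
    "mining": ["data mining tools", "text mining software"],
    "cooking": ["pasta recipes", "easy pasta sauce"],
}

for name, score in rank_clusters("data mining", clusters):
    print(name, round(score, 3))
```

For real use you would want the same preprocessing and term weighting that was applied to the documents during clustering, otherwise the query vector and the centroids are not directly comparable.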