I've been searching for a way to encode country names (string values) as numbers. Is there an effective way to do this?
I tried one-hot encoding, but it adds one column per distinct country, which makes the dataset far too wide. I'm going to use cosine distance on this data, so I can't use label encoding: the integer codes would give the countries an arbitrary order and magnitude. Count and frequency encoding won't work either when several countries occur the same number of times in the dataset, because they end up with the same code. So, please let me know if you know of any way to encode high-cardinality categorical string data.
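
To make the problem concrete, here is a minimal sketch of the three encodings I tried. The country list and the helper names (`cosine_similarity`, `one_hot`) are made up for illustration and are not my real data or code.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

# Hypothetical sample; the real data has many more distinct countries.
countries = ["India", "Brazil", "Japan", "Brazil", "Kenya"]
categories = sorted(set(countries))

# One-hot: one column per distinct country, so the width grows with cardinality.
def one_hot(value):
    return [1.0 if value == c else 0.0 for c in categories]

one_hot_width = len(one_hot("India"))
sim_one_hot = cosine_similarity(one_hot("India"), one_hot("Japan"))

# Label encoding: arbitrary integer codes (starting at 1 to avoid a zero vector).
label_codes = {c: i + 1 for i, c in enumerate(categories)}
sim_label = cosine_similarity([label_codes["India"]], [label_codes["Japan"]])

# Frequency encoding: countries that occur equally often get the same value.
counts = Counter(countries)
freq = {c: counts[c] / len(countries) for c in categories}

print(one_hot_width)                  # number of one-hot columns
print(sim_one_hot, sim_label)         # different countries under each encoding
print(freq["India"] == freq["Japan"]) # frequency codes collide
```

With one-hot vectors, two different countries are orthogonal, which is what I want, but the width equals the number of distinct countries. A single label-encoded column makes any two countries look identical under cosine similarity, and frequency encoding cannot tell apart countries that appear equally often.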