Getting error "empty vocabulary; perhaps the documents only contain stop words" on implementation of CountVectorizer()


I am trying to one-hot encode categorical variables using CountVectorizer(). This works for every column except one, named "Drill Centre", which contains three unique values: 'A', 'B', and 'C'. When I apply CountVectorizer to this column, it raises the error "empty vocabulary; perhaps the documents only contain stop words".

from sklearn.feature_extraction.text import CountVectorizer

vec = CountVectorizer()
train_drill = vec.fit_transform(X_train['Drill Centre']).toarray()  # raises the ValueError
test_drill = vec.transform(X_test['Drill Centre']).toarray()
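
A likely cause, sketched here rather than confirmed against the original data: CountVectorizer's default token_pattern is r"(?u)\b\w\w+\b", which only keeps tokens of two or more characters, so single-letter values such as 'A', 'B', 'C' are all discarded and the vocabulary ends up empty. The lists below are hypothetical stand-ins for X_train['Drill Centre'] and X_test['Drill Centre'], and the relaxed token_pattern is one possible workaround.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stand-ins for X_train['Drill Centre'] and X_test['Drill Centre']
train_col = ["A", "B", "C", "A"]
test_col = ["C", "B"]

# With the default token_pattern, single-character values yield no tokens,
# so fitting raises ValueError (empty vocabulary).
try:
    CountVectorizer().fit_transform(train_col)
    default_failed = False
except ValueError:
    default_failed = True

# Assumed workaround: accept tokens of one or more word characters.
vec = CountVectorizer(token_pattern=r"(?u)\b\w+\b")
train_drill = vec.fit_transform(train_col).toarray()
test_drill = vec.transform(test_col).toarray()

# Tokens are lowercased by default, so the learned vocabulary is a, b, c.
vocab = sorted(vec.vocabulary_)
```

If the goal is plain one-hot encoding of a categorical column, sklearn.preprocessing.OneHotEncoder or pandas.get_dummies may be a more direct fit than a text vectorizer.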

