ValueError: np.nan is an invalid document, expected byte or unicode string in TfidfVectorizer


I was trying to run a fake news detection program. This is my code (only the part that raises the error):

#DataFlair - Initialize a TfidfVectorizer
tfidf_vectorizer=TfidfVectorizer(stop_words='english', max_df=0.7)
#DataFlair - Fit and transform train set, transform test set
tfidf_train=tfidf_vectorizer.fit_transform(x_train)
tfidf_test=tfidf_vectorizer.transform(x_test)

and I get this error:

ValueError                                Traceback (most recent call last)

<ipython-input-19-bd6e732b0b7b> in <module>()
      3 
      4 #DataFlair - Fit and transform train set, transform test set
----> 5 tfidf_train=tfidf_vectorizer.fit_transform(x_train)
      6 tfidf_test=tfidf_vectorizer.transform(x_test)

4 frames

/usr/local/lib/python3.7/dist-packages/sklearn/feature_extraction/text.py in decode(self, doc)
    225         if doc is np.nan:
    226             raise ValueError(
--> 227                 "np.nan is an invalid document, expected byte or unicode string."
    228             )
    229 


ValueError: np.nan is an invalid document, expected byte or unicode string.
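
The traceback shows the vectorizer rejecting a document that is np.nan, which suggests that at least one entry in x_train is a missing value rather than a string. A minimal sketch of one way to handle this, assuming x_train and x_test are iterables of article texts: replace the missing entries with empty strings before vectorizing. The helper clean_documents and the sample data are hypothetical and not part of the original code.

```python
import math


def clean_documents(docs, fill=""):
    """Return a list of strings, replacing None and float NaN entries with `fill`."""
    cleaned = []
    for doc in docs:
        # np.nan is a float, so math.isnan catches it; None is handled separately.
        if doc is None or (isinstance(doc, float) and math.isnan(doc)):
            cleaned.append(fill)
        else:
            cleaned.append(str(doc))
    return cleaned


# Hypothetical stand-in for x_train; the second entry mimics a missing article body.
x_train_example = [
    "Breaking news about the election",
    float("nan"),
    "Scientists publish new study",
]
x_train_clean = clean_documents(x_train_example)

# The cleaned lists could then be passed to the vectorizer from the question:
#   tfidf_train = tfidf_vectorizer.fit_transform(clean_documents(x_train))
#   tfidf_test = tfidf_vectorizer.transform(clean_documents(x_test))
```

If x_train is a pandas Series (an assumption, since the question does not show how it was built), x_train.fillna('') should have the same effect. Dropping the rows with missing text is another option, but the labels then have to be filtered in the same way so that texts and labels stay aligned.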