How to load a DataFrame or CSV file into a spaCy nlp pipeline?

I am trying to load text from a CSV file (read into a pandas DataFrame) into a spaCy pipeline, but I get an error about the type of the string argument. Here is my code:

from __future__ import unicode_literals
import pandas as pd
import spacy

nlp = spacy.load('en')

data = pd.read_csv("sometextdata.csv")
text = []
for line in data.Line:
    text.append(clean_text(line))  # clean_text: definition not shown

# The different ways I tried to run the pipeline:
text_spacy = nlp(data['Line'])
data['Line'].apply(nlp)
document = nlp(text)
TypeError: Argument 'string' has incorrect type (expected unicode, got str)

I tried loading the data in several different ways and got the same error each time.

Platform: macOS, Python 2.7

There is 1 best solution below

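You need to pass unicode to nlp. As the error shows, the value you are passing currently has type str. Note that nlp expects a single string, so if text is a list (as in the question), convert and process each element rather than the list itself. For example, for a single string text you can convert like this: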

document = nlp(unicode(text))

or, with an explicit encoding (UTF-8 is assumed here; use whatever encoding your CSV actually has):

document = nlp(text.decode('utf-8'))
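
To apply the same conversion to every row rather than to one string, something like the following sketch may help. It is not part of spaCy or pandas: to_unicode and parse_lines are hypothetical helper names, UTF-8 is an assumed encoding, and nlp is passed in as an argument so the helpers work with any callable that accepts a unicode string.

```python
def to_unicode(value, encoding="utf-8"):
    """Return value as text, decoding byte strings with the given encoding."""
    # On Python 2, `bytes` is an alias for `str`, so this branch catches
    # the byte strings that trigger the TypeError in the question.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def parse_lines(lines, nlp, encoding="utf-8"):
    """Call nlp once per line (never on the whole list) and collect the results."""
    return [nlp(to_unicode(line, encoding)) for line in lines]
```

With the objects from the question, this would be called as documents = parse_lines(data['Line'], nlp), which gives one parsed document per row. Missing values in the column are not handled here and would need to be dropped or filled first.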