When I try to use an imported spaCy Language model, I get the following exception:
  File "D:\anaconda3\lib\site-packages\spacy\language.py", line 1037, in __call__
    doc = self._ensure_doc(text)
  File "D:\anaconda3\lib\site-packages\spacy\language.py", line 1130, in _ensure_doc
    return Doc(self.vocab).from_bytes(doc_like)
  File "spacy\tokens\doc.pyx", line 1359, in spacy.tokens.doc.Doc.from_bytes
  File "D:\anaconda3\lib\site-packages\srsly\_msgpack_api.py", line 27, in msgpack_loads
    msg = msgpack.loads(data, raw=False, use_list=use_list)
  File "D:\anaconda3\lib\site-packages\srsly\msgpack\__init__.py", line 79, in unpackb
    return _unpackb(packed, **kwargs)
  File "srsly\msgpack\_unpacker.pyx", line 199, in srsly.msgpack._unpacker.unpackb
srsly.msgpack.exceptions.ExtraData: unpack(b) received extra data.
127.0.0.1 - - [20/Jan/2024 15:36:18] "POST / HTTP/1.1" 500 -
Some notes:
- If I load the model only "locally", it works perfectly: calling my_func(), which loads the model and uses it, runs fine. The problem appears only when I call my_func() via Flask. I have a local server running, and when I press a button, my_func() is called to process some text. That is where the error occurs.
- spaCy version is 3.7.2, Flask version is 2.2.2, srsly version is 2.4.8, msgpack version is 1.0.7.
- Here is the Flask code that calls the function (after a file is uploaded, it takes the file's text and processes it via my_func()):
@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        print(request.files)
        if 'file' not in request.files:
            flash('No file part')
            return redirect(request.url)
        file = request.files['file']
        if file.filename == '':
            flash('No selected file')
            return render_template('index.html')
        # access the file
        file_name = file.filename
        file_content = file.read()
        # do something with file's text
        result = my_func(file_content)
        flash(result)
    return render_template('index.html')
I tried several ways to load the spaCy model: from a binary file with pickle, from the model folder created with nlp_model.to_disk(), and by creating it directly every time with spacy.load("en_core_web_sm").
Again, if I run without Flask (just testing the function my_func()), it works perfectly, with no error.
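One thing that may be worth checking, offered as a sketch and not a confirmed diagnosis: the traceback ends in Doc.from_bytes, which _ensure_doc appears to call when the input is bytes, and file.read() in upload_file() returns bytes, not str. The snippet below uses only the standard library to show that difference and a possible conversion. The helper to_text, the UTF-8 encoding, and the errors="replace" policy are assumptions for illustration; io.BytesIO stands in for the uploaded file.

```python
import io


def to_text(data, encoding="utf-8"):
    """Return a str for str or bytes input.

    Hypothetical helper for illustration; not part of spaCy or Flask.
    """
    if isinstance(data, bytes):
        # Assumption: the upload is UTF-8 text; undecodable bytes are
        # replaced rather than raising UnicodeDecodeError.
        return data.decode(encoding, errors="replace")
    return data


# io.BytesIO simulates the uploaded file: its read() returns bytes,
# just like file.read() in upload_file().
fake_upload = io.BytesIO("Some uploaded text.".encode("utf-8"))
file_content = fake_upload.read()
text = to_text(file_content)

print(type(file_content).__name__, "->", type(text).__name__)  # bytes -> str
```

If this is the cause, calling my_func(to_text(file_content)) in the Flask handler would pass a str to the model, which would match the local test if that test also passes a str.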