I started processing multiple documents at once and found that I receive different results for individual documents depending on the batch I put them in.
Code that illustrates it (I can't use my real data):
in_docs = ...
doc_index = 10 # chosen randomly
result_from_batch = nlp(in_docs)[doc_index]
result_as_single_doc_request = nlp(in_docs[doc_index:doc_index+1])[0]
assert len(result_from_batch.sentences) == len(result_as_single_doc_request.sentences)
I want to emphasize that (as far as I have seen) a given batch of documents always produces the same deterministic result, but a single document may receive different results depending on the batch it is put in.
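To check this across a whole batch rather than at one index, a small helper along these lines could be used. This is only a sketch: `find_batch_dependent_docs`, `process`, and `key` are hypothetical names and not part of stanza, and it assumes `process` accepts a list of documents and returns one result per document, as `nlp` does in the snippet above.

```python
from typing import Any, Callable, List, Sequence


def find_batch_dependent_docs(
    process: Callable[[Sequence[Any]], Sequence[Any]],
    docs: Sequence[Any],
    key: Callable[[Any], Any],
) -> List[int]:
    """Return indices of docs whose result differs between batch and single-doc runs.

    `process` is any callable mapping a list of docs to a list of results
    (assumption: one result per input doc, in the same order).
    `key` extracts the value to compare, e.g. the number of sentences.
    """
    batch_results = process(docs)
    mismatches = []
    for i in range(len(docs)):
        # Re-run the same document alone, as a batch of size one.
        single_result = process(docs[i:i + 1])[0]
        if key(batch_results[i]) != key(single_result):
            mismatches.append(i)
    return mismatches
```

With the variables from the snippet above, the call would look like `find_batch_dependent_docs(nlp, in_docs, key=lambda d: len(d.sentences))`. An empty list would mean no document's sentence count depends on the batch.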
Is this behavior known and expected? Does stanza support a way to ensure deterministic results?