Can I monitor progress of spacy parsing?

29 Views Asked by Zyxl At 19 March 2024 at 18:22

I have a simple program that processes English text with spaCy and outputs some information about the tokens. For a large text, spaCy takes a long time to process it. Is there a way to see how far the processing has progressed, ideally as a percentage? I'm not using my own models, just the ones provided by spaCy.

import spacy

# load big text file into `text` variable

nlp = spacy.load("en_core_web_sm")
nlp.max_length = len(text) + 1  # raise the length limit so the whole text is accepted
doc = nlp(text)

# output info


There is 1 best solution below

ewz93 On 22 March 2024 at 12:10

In general, I would not advise parsing the entire text as one big blob. Instead, try to split it into smaller paragraphs first.

For example, you can split at every \n\n first.

Then you can hand multiple documents to spaCy at once using nlp.pipe(), and wrap the resulting iterator in a tqdm progress bar.
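A minimal sketch of that approach, under some assumptions: the helper names (split_paragraphs, parse_with_progress, main) and the `path` argument are made up for illustration, and the spaCy and tqdm imports are kept inside main() so the helpers themselves have no dependencies. Because nlp.pipe() returns a generator, tqdm needs the paragraph count passed as `total` to be able to show a percentage.

```python
def split_paragraphs(text):
    """Split on blank lines and drop empty pieces."""
    return [p for p in text.split("\n\n") if p.strip()]


def parse_with_progress(nlp, text, progress=None):
    """Parse `text` paragraph by paragraph.

    `progress` is any callable with the shape progress(iterable, total=n),
    for example tqdm. It wraps the stream of parsed docs.
    """
    paragraphs = split_paragraphs(text)
    docs = nlp.pipe(paragraphs)  # lazy: docs are produced as they are consumed
    if progress is not None:
        # total is needed because a generator has no length
        docs = progress(docs, total=len(paragraphs))
    return list(docs)


def main(path):
    # Hypothetical entry point; `path` is the big text file.
    import spacy
    from tqdm import tqdm

    with open(path, encoding="utf-8") as f:
        text = f.read()

    nlp = spacy.load("en_core_web_sm")
    docs = parse_with_progress(nlp, text, progress=tqdm)

    # Output info per token, one paragraph-level Doc at a time.
    for doc in docs:
        for token in doc:
            print(token.text, token.pos_, token.lemma_)
```

The bar counts paragraphs, not characters, so the percentage is only approximate when paragraph lengths vary a lot. Since no single piece is huge any more, raising nlp.max_length should usually not be necessary.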

Alternatively, you can create batches within the document and then concatenate the results.
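One way the batching variant could look, as a rough sketch: the helper names (make_batches, parse_in_batches) and the max_chars default are assumptions chosen for illustration. Paragraphs are grouped into chunks of bounded size, each chunk is parsed on its own while a character-based percentage is printed, and the pieces are merged with spaCy's Doc.from_docs().

```python
def make_batches(paragraphs, max_chars=100_000):
    """Group consecutive paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars becomes its own chunk.
    """
    batches, current, size = [], [], 0
    for para in paragraphs:
        extra = len(para) + (2 if current else 0)  # 2 for the "\n\n" joiner
        if current and size + extra > max_chars:
            batches.append("\n\n".join(current))
            current, size = [], 0
            extra = len(para)
        current.append(para)
        size += extra
    if current:
        batches.append("\n\n".join(current))
    return batches


def parse_in_batches(nlp, text, max_chars=100_000):
    """Parse chunk by chunk, print progress, and merge into one Doc."""
    from spacy.tokens import Doc  # imported here so make_batches has no dependencies

    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    batches = make_batches(paragraphs, max_chars)
    total_chars = sum(len(b) for b in batches)

    docs, done = [], 0
    for batch in batches:
        docs.append(nlp(batch))
        done += len(batch)
        print(f"{100 * done / total_chars:.1f}% parsed")

    # Concatenate the per-chunk results. With the default settings, from_docs
    # joins the pieces with a space, so the merged text will not contain the
    # original blank lines between chunks.
    return Doc.from_docs(docs)
```

If only the per-token output is needed, keeping the list of chunk-level Docs and iterating over it may be enough, which avoids building one very large Doc.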
