I am getting started with NLP using Hugging Face, Presidio, and spaCy. Following the Presidio tutorial, I tried to download the pre-trained spaCy transformer pipeline de_dep_news_trf like this:
import transformers
from huggingface_hub import snapshot_download
from transformers import AutoTokenizer, AutoModelForTokenClassification
transformers_model = 'spacy/de_dep_news_trf'
snapshot_download(repo_id=transformers_model)
# Instantiate to make sure it's downloaded during installation and not runtime
AutoTokenizer.from_pretrained(transformers_model)
AutoModelForTokenClassification.from_pretrained(transformers_model)
The call AutoTokenizer.from_pretrained(transformers_model) fails with:
OSError: spacy/de_dep_news_trf does not appear to have a file named config.json. Checkout 'https://huggingface.co/spacy/de_dep_news_trf/main' for available files.
The URL from the error message, https://huggingface.co/spacy/de_dep_news_trf/main, returns a 404. The actual file listing at https://huggingface.co/spacy/de_dep_news_trf/tree/main confirms that the repo has no config.json. It does contain a config.cfg, though. Can I use that somehow? If not, how can I use this transformer?
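The missing file can also be checked without the browser. The following is only a minimal sketch: it assumes that `list_repo_files` from `huggingface_hub` is available and that the machine has network access. The helper names `missing_transformers_files` and `repo_files` are hypothetical and exist only for this illustration.

```python
def missing_transformers_files(files, required=("config.json",)):
    """Return the required file names that are absent from a repo file listing.

    `required` is an assumption for illustration: config.json is the file
    named in the OSError above.
    """
    present = set(files)
    return [name for name in required if name not in present]


def repo_files(repo_id):
    """List all files in a Hugging Face Hub repo (needs network access)."""
    # Imported lazily so the pure helper above works without huggingface_hub.
    from huggingface_hub import list_repo_files

    return list_repo_files(repo_id)


# Usage (performs a network request, so it is left commented out):
# print(missing_transformers_files(repo_files("spacy/de_dep_news_trf")))
```

A non-empty result would mean the repo lacks the files that the `Auto*` classes look for, which matches the error shown above.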