How can I use/load the downloaded Hugging Face models from snapshot_download?

I have downloaded the model from Hugging Face using snapshot_download, e.g.,

from huggingface_hub import snapshot_download

snapshot_download(repo_id="facebook/nllb-200-distilled-600M", cache_dir="./")

And when I list the directory, I see:

ls ./models--facebook--nllb-200-distilled-600M/snapshots/bf317ec0a4a31fc9fa3da2ce08e86d3b6e4b18f1/

Output:

config.json@             README.md@                tokenizer_config.json@
generation_config.json@  sentencepiece.bpe.model@  tokenizer.json@
pytorch_model.bin@       special_tokens_map.json@

I can load the model locally, but I have to know the snapshot hash in advance and hard-code it, e.g.,

from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained(
    "./models--facebook--nllb-200-distilled-600M/snapshots/bf317ec0a4a31fc9fa3da2ce08e86d3b6e4b18f1/",
    local_files_only=True
)

That works, but how do I load the Hugging Face model without guessing the hash?

There is 1 best solution below

For cleaner directory management, download the snapshot into a dedicated cache directory rather than the current working directory, e.g.,

from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="facebook/nllb-200-distilled-600M", 
    cache_dir="./huggingface_mirror"
)
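
As a side note, snapshot_download returns the local path of the snapshot folder (the .../snapshots/<hash>/ directory), so the returned value can be passed straight to from_pretrained without reading the hash off the disk. A minimal sketch, assuming both huggingface_hub and transformers are installed; the function name load_from_snapshot is hypothetical and only here for illustration:

```python
def load_from_snapshot(repo_id: str, cache_dir: str):
    """Download (or reuse) a snapshot and load a seq2seq model from its path.

    Hypothetical helper for illustration; not part of either library.
    """
    # Imported inside the function so the definition itself does not
    # require the libraries until the function is actually called.
    from huggingface_hub import snapshot_download
    from transformers import AutoModelForSeq2SeqLM

    # snapshot_download returns the snapshot folder as a string, i.e. the
    # directory whose name is the commit hash.
    snapshot_path = snapshot_download(repo_id=repo_id, cache_dir=cache_dir)

    # from_pretrained accepts a local directory in place of a repo ID.
    model = AutoModelForSeq2SeqLM.from_pretrained(
        snapshot_path,
        local_files_only=True,
    )
    return snapshot_path, model
```

Calling load_from_snapshot("facebook/nllb-200-distilled-600M", "./huggingface_mirror") should reuse files that are already in the cache directory rather than downloading them again.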

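To load the model by its repo ID instead of a path, pass the same directory through the cache_dir keyword argument; the snapshot hash then never appears in your code: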

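from transformers import AutoModelForSeq2SeqLM

# Pass the repo ID, not a filesystem path; the snapshot is looked up
# inside cache_dir, so the hash is never spelled out.
model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/nllb-200-distilled-600M",
    cache_dir="./huggingface_mirror",  # same directory given to snapshot_download
    local_files_only=True,  # use only local files; do not try to download
)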
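
If you need the snapshot path itself (for example, to hand it to another tool), it can be derived from the cache layout rather than guessed: each repo folder contains a refs/ directory, and refs/main is a small text file holding the commit hash that names the snapshot folder. The sketch below uses only the standard library; resolve_snapshot_dir is a hypothetical helper name, and it assumes the snapshot was downloaded for a branch such as main (a download pinned to a commit hash may not have a refs file).

```python
from pathlib import Path


def resolve_snapshot_dir(cache_dir: str, repo_id: str, revision: str = "main") -> Path:
    """Return the snapshot folder for repo_id inside a Hugging Face cache dir.

    Hypothetical helper for illustration; not part of huggingface_hub.
    """
    # Model repos are cached under "models--<org>--<name>".
    repo_dir = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))

    # refs/<revision> stores the commit hash that the revision points to.
    commit_hash = (repo_dir / "refs" / revision).read_text().strip()

    snapshot_dir = repo_dir / "snapshots" / commit_hash
    if not snapshot_dir.is_dir():
        raise FileNotFoundError(f"No snapshot found at {snapshot_dir}")
    return snapshot_dir
```

The result of resolve_snapshot_dir("./huggingface_mirror", "facebook/nllb-200-distilled-600M"), converted with str(), can be passed to from_pretrained together with local_files_only=True.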