Fine-tuning with Llama 7b - changing Repo ID?


I am currently trying to get my fine-tuning with PEFT done (in Colab) and I am stuck when loading my model. I am getting the error below and can't figure out how to fix it. I have tried changing my repo ID several times, but it doesn't change anything.

Loading model works just fine:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_checkpoint = 'meta-llama/Llama-2-7b-chat-hf'

tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    model_checkpoint,
    torch_dtype=torch.float16,
    device_map="auto",
    load_in_4bit=True,
)

Then in the next step:

tokenizer = AutoTokenizer.from_pretrained(model)
if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})

Here I get the following error:

HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 4096)
    (layers): ModuleList(
      (0-31): 32 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (v_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (o_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear4bit(in_features=4096, out_features=11008, bias=False)
          (up_proj): Linear4bit(in_features=4096, out_features=11008, bias=False)
          (down_proj): Linear4bit(in_features=11008, out_features=4096, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): LlamaRMSNorm()
        (post_attention_layernorm): LlamaRMSNorm()
      )
    )
    (norm): LlamaRMSNorm()
  )
  (lm_head): Linear(in_features=4096, out_features=32000, bias=False)
)'.

There is 1 answer below.


AutoTokenizer.from_pretrained expects a repo id (a string) as its first argument, but you are passing the model variable, which is the model instance returned by AutoModelForCausalLM.from_pretrained. The error occurs because that model object gets converted to a string and treated as a repo id, which fails validation.

You can change your code to something like

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", token=access_token)

or

tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, token=access_token)

to make it work.
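
Putting it together, here is a minimal sketch of the corrected tokenizer step (assuming the same model_checkpoint and model variables as above, and an access_token for the gated Llama 2 repo):

from transformers import AutoTokenizer

# Load the tokenizer from the checkpoint name (a string), not from the model object
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, token=access_token)

# Add a padding token if the tokenizer does not define one
if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})
    # If a new token is added, the model's embedding matrix must grow to cover it
    model.resize_token_embeddings(len(tokenizer))

After this, the rest of the PEFT fine-tuning setup can proceed with the tokenizer and model as before.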