llama.cpp conversion of a finetuned HF (Hugging Face) LLaMA-2-7B model fails


I use the simple script https://github.com/huggingface/trl/blob/main/examples/scripts/sft.py with some custom data and llama-2-7b-hf as the base model. After training, it invokes trainer.save_model, and the output directory has the following contents:

```
-rw-rw-r-- 1 ubuntu ubuntu       5100 Jan 12 14:04 README.md
-rw-rw-r-- 1 ubuntu ubuntu  134235048 Jan 12 14:04 adapter_model.safetensors
-rw-rw-r-- 1 ubuntu ubuntu        576 Jan 12 14:04 adapter_config.json
-rw-rw-r-- 1 ubuntu ubuntu       1092 Jan 12 14:04 tokenizer_config.json
-rw-rw-r-- 1 ubuntu ubuntu        552 Jan 12 14:04 special_tokens_map.json
-rw-rw-r-- 1 ubuntu ubuntu    1842948 Jan 12 14:04 tokenizer.json
-rw-rw-r-- 1 ubuntu ubuntu       4219 Jan 12 14:04 training_args.bin
-rw-rw-r-- 1 ubuntu ubuntu 4827151012 Jan 12 14:04 adapter_model.bin
```

As you can see, there is no model.safetensors, which convert.py requires. I tried a bunch of other options to save the model (trainer.model.save_pretrained, for example), but the file was always adapter_model.safetensors.
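Should I be merging the LoRA adapter back into the base model first, so that save_pretrained writes out full model weights? A minimal sketch of what I have in mind, using peft's PeftModel.from_pretrained and merge_and_unload (the directory names are placeholders for my actual paths):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model the adapter was trained on.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# Apply the LoRA adapter from the trainer output directory shown above.
model = PeftModel.from_pretrained(base, "sft_output")  # placeholder path

# Fold the adapter weights into the base weights and save a full
# checkpoint (for a 7B model this may be sharded into several
# model-*.safetensors files plus an index).
merged = model.merge_and_unload()
merged.save_pretrained("merged_model", safe_serialization=True)

# Save the tokenizer alongside so a convert script finds everything.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.save_pretrained("merged_model")
```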

I tried convert-hf-to-gguf.py as well, and it too complains about the missing model.safetensors (and that only after suppressing the earlier error complaining that the causal LLaMA architecture is not supported).

Is there any other convert script that handles such adapter safetensors (I guess all models finetuned via PEFT will be saved as adapter_*)? When I went through the code, I also noticed that MODEL_ARCH only accommodates "LLAMA" and not "LLAMA2". Is that why it also fails to find the parameter names from adapter_model.safetensors in the MODEL_ARCH tensor-map (tmap) methods?
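For context, the tensor names stored in the adapter file can be listed with the safetensors API; a minimal sketch (the path is a placeholder for my output directory):

```python
from safetensors import safe_open

# List the tensor names in the adapter file. PEFT LoRA adapters use
# prefixed names such as
#   base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight
# which presumably do not match what the converter's tensor map expects.
with safe_open("sft_output/adapter_model.safetensors", framework="pt") as f:
    for name in f.keys():
        print(name)
```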
