I am following the fine-tuning guide on this website: https://www.labellerr.com/blog/hands-on-with-fine-tuning-llm/
I have successfully fine-tuned the Falcon-7b model on a dataset from Hugging Face. However, when I load the fine-tuned model in my Jupyter notebook, the kernel dies. Is there any way to solve this? As far as I understand, one option is to save the model in shards, but I am having difficulty saving the fine-tuned model that way. I would be really grateful for your help. Thanks! Happy coding!
I ran the following code after fine-tuning the model:
# Define the directory where you want to save the fine-tuned model
output_dir = "./fine_tuned_model"
# Save the fine-tuned model using the save_model method
trainer.save_model(output_dir)
# Optionally, you can also upload the model to the Hugging Face model hub
# if you want to share it with others
trainer.push_to_hub("omarfarooq908/falcon-7b-finetuned01")
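As far as I can tell, trainer.save_model only writes the small PEFT adapter here, so my idea was to merge the adapter back into the base model and then write the merged weights out in shards. This is roughly what I understand that to look like, using merge_and_unload() and the max_shard_size argument of save_pretrained (the directory names and shard size are placeholders, and I have not managed to get it working):
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM
# Reload the base model and attach the adapter that trainer.save_model wrote
base_model = AutoModelForCausalLM.from_pretrained(
    "ybelkada/falcon-7b-sharded-bf16",
    torch_dtype=torch.bfloat16,
)
merged_model = PeftModel.from_pretrained(base_model, output_dir).merge_and_unload()
# max_shard_size splits the checkpoint into several smaller files
merged_model.save_pretrained("./fine_tuned_model_sharded", max_shard_size="2GB")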
The Jupyter notebook kernel dies when I load the model:
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM
# Load the adapter config from the Hub
config = PeftConfig.from_pretrained("omarfarooq908/falcon-7b-finetuned01")
# Load the base model (no torch_dtype or device_map given, so it loads in full fp32 precision)
model = AutoModelForCausalLM.from_pretrained("ybelkada/falcon-7b-sharded-bf16")
# Attach the fine-tuned adapter to the base model
model = PeftModel.from_pretrained(model, "omarfarooq908/falcon-7b-finetuned01")
My GPU specs: NVIDIA Quadro P5000, 16 GB VRAM
I was expecting the fine-tuned model to load successfully, but instead the kernel dies.
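From what I have read, the base model may be getting loaded in full fp32 precision, which would not fit in 16 GB. The workaround I am considering is loading it in 4-bit with an automatic device map; this is just a rough, untested sketch of what I mean (it assumes bitsandbytes is installed):
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
# Quantize the frozen base weights to 4-bit so the 7B model fits in 16 GB of VRAM
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "ybelkada/falcon-7b-sharded-bf16",
    quantization_config=bnb_config,
    device_map="auto",  # let accelerate place layers on the GPU (and CPU if needed)
)
model = PeftModel.from_pretrained(base_model, "omarfarooq908/falcon-7b-finetuned01")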