I am fine-tuning a HuggingFace model on my own data using LoRA. However, I do not want to upload the result to the HuggingFace Hub; I want to store it on my local computer. This works for the tokenizer and the model, but the LoraConfig object cannot be stored. What am I doing wrong?
Here is some of my code:
import torch
import transformers

trainer = transformers.Trainer(
    model=model,
    train_dataset=data['train'],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        warmup_steps=100,
        max_steps=2,
        learning_rate=2e-4,
        fp16=False,
        logging_steps=1,
        output_dir='outputs',
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

model.config.use_cache = False  # silence the warnings; re-enable for inference!

with torch.autocast("cuda"):
    trainer.train()
tokenizer.save_pretrained("./")
model.save_pretrained("./")
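For reference, config is the LoraConfig that wraps the base model before training; the setup looks roughly like this ('xx' stands in for the real base model name):

from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)  # wrap the base model with LoRA adapters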
Printing config shows:

LoraConfig(peft_type=<PeftType.LORA: 'LORA'>, auto_mapping=None, base_model_name_or_path='xx', revision=None, task_type='CAUSAL_LM', inference_mode=False, r=16, target_modules=['q_proj', 'v_proj'], lora_alpha=32, lora_dropout=0.05, fan_in_fan_out=False, bias='none', modules_to_save=None, init_lora_weights=True, layers_to_transform=None, layers_pattern=None)
Trying to dump it as JSON fails:

import json

out_file = open("config.json", "w")
json.dump(config, out_file, indent=6)

This raises:

TypeError: Object of type LoraConfig is not JSON serializable
The intended usage afterwards would be:
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# peft_model_id = "ybelkada/opt-6.7b-lora"
config = PeftConfig.from_pretrained("./")  # read the saved adapter config from the local directory
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    load_in_8bit=True,
    device_map='auto',
)
tokenizer = AutoTokenizer.from_pretrained("./")

# Load the LoRA model
model = PeftModel.from_pretrained(model, "./")
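After loading, inference would be something along these lines (a sketch; the prompt and generation settings are arbitrary placeholders):

batch = tokenizer("Two things are infinite: ", return_tensors="pt").to("cuda")
with torch.autocast("cuda"):
    output_tokens = model.generate(**batch, max_new_tokens=50)
print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))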
What is the main error there? Am I saving the correct config file?