I am trying to follow this article on medium Article.
I had a few problems with it so the remain chang eI did was to the TrainingArguments object I added gradient_checkpointing_kwargs={'use_reentrant':False},.
So now I have the following objects:
peft_training_args = TrainingArguments(
output_dir = output_dir,
warmup_steps=1,
per_device_train_batch_size=1,
gradient_accumulation_steps=4,
max_steps=100, #1000
learning_rate=2e-4,
optim="paged_adamw_8bit",
logging_steps=25,
logging_dir="./logs",
save_strategy="steps",
save_steps=25,
evaluation_strategy="steps",
eval_steps=25,
do_eval=True,
gradient_checkpointing=True,
gradient_checkpointing_kwargs={'use_reentrant':False},
report_to="none",
overwrite_output_dir = 'True',
group_by_length=True,
)
peft_model.config.use_cache = False
peft_trainer = transformers.Trainer(
model=peft_model,
train_dataset=train_dataset,
eval_dataset=eval_dataset,
args=peft_training_args,
data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
And when I call peft_trainer.train() I get the following error:
AttributeError: 'torch.dtype' object has no attribute 'itemsize'
I'm using Databricks, and my pytorch version is 2.0.1+cu118
if you use transformers version latest(v4.39.1), plz try to downgrade to v4.38.2. I solved it that way.