RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation


I'm trying to run a transformer model that is listed on Hugging Face. My code is identical to the code at the link below: https://medium.com/@pazuzzu/in-depth-llm-fine-tuning-guide-efficiently-fine-tune-and-use-zephyr-7b-beta-assistant-using-lora-e23d8151e067

Whenever I run that code on my local GPU server, I get the following error:

Traceback (most recent call last):
  File "Main.py", line 163, in <module>
    trainer.train()
  File "/home/jhkcool97/.local/lib/python3.8/site-packages/transformers/trainer.py", line 1561, in train
    return inner_training_loop(
  File "/home/jhkcool97/.local/lib/python3.8/site-packages/transformers/trainer.py", line 1895, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/jhkcool97/.local/lib/python3.8/site-packages/transformers/trainer.py", line 2830, in training_step
    self.accelerator.backward(loss)
  File "/home/jhkcool97/.local/lib/python3.8/site-packages/accelerate/accelerator.py", line 1964, in backward
    self.scaler.scale(loss).backward(**kwargs)
  File "/home/jhkcool97/.local/lib/python3.8/site-packages/torch/_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "/home/jhkcool97/.local/lib/python3.8/site-packages/torch/autograd/__init__.py", line 266, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/jhkcool97/.local/lib/python3.8/site-packages/torch/autograd/function.py", line 289, in apply
    return user_fn(self, *args)
  File "/home/jhkcool97/.local/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 275, in backward
    tensors = ctx.saved_tensors
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [6, 1, 201, 201]] is at version 16; expected version 14 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
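
For context, here is my rough understanding of what the "is at version 16; expected version 14" part of the message refers to. As far as I can tell, autograd keeps a version counter on each tensor, increments it on every in-place write, and compares it against the value recorded when the tensor was saved for the backward pass. The sketch below is only a toy model of that check in plain Python. ToyTensor, SavedForBackward and check are hypothetical names made up for illustration, not PyTorch classes or functions, and the counts 14 and 2 are picked only to mirror the numbers in my traceback.

```python
class ToyTensor:
    """Hypothetical stand-in for a tensor: values plus a version counter."""

    def __init__(self, values):
        self.values = list(values)
        self.version = 0  # bumped by every in-place write

    def add_(self, amount):
        # In-place update: same object, new contents, version + 1.
        for i in range(len(self.values)):
            self.values[i] += amount
        self.version += 1
        return self


class SavedForBackward:
    """Hypothetical record of a tensor saved during the forward pass."""

    def __init__(self, tensor):
        self.tensor = tensor
        self.expected_version = tensor.version  # snapshot taken at save time

    def unpack(self):
        # Called in the backward pass; refuses data that changed since saving.
        if self.tensor.version != self.expected_version:
            raise RuntimeError(
                f"saved tensor is at version {self.tensor.version}; "
                f"expected version {self.expected_version} instead"
            )
        return self.tensor


def check(in_place_writes_after_save):
    """Save a tensor at version 14, write to it again, then try to unpack it."""
    t = ToyTensor([1.0, 2.0])
    for _ in range(14):
        t.add_(1.0)
    saved = SavedForBackward(t)
    for _ in range(in_place_writes_after_save):
        t.add_(1.0)
    try:
        saved.unpack()
        return "ok"
    except RuntimeError as err:
        return str(err)
```

In this toy model, check(0) succeeds, while check(2) fails with a version mismatch like the one in my error. If that reading is right, something wrote in place twice to the [6, 1, 201, 201] tensor after it was saved, but I can't tell from the traceback where that happens.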