Error "Layer is not connected, no input to return" when loading pre-trained model


I am following this tutorial in Colab to fine-tune GPT-2 using LoRA. In the tutorial, the pre-trained GPT-2 model is loaded as follows:

# Load the original model.
preprocessor = keras_nlp.models.GPT2CausalLMPreprocessor.from_preset(
    "gpt2_base_en",
    sequence_length=128,
)
lora_model = keras_nlp.models.GPT2CausalLM.from_preset(
    "gpt2_base_en",
    preprocessor=preprocessor,
)

Now I need to use a model from Hugging Face rather than one of the "preset" models, and here I'm stuck.

The best I've been able to write is the following code, which unfortunately gives the error "AttributeError: Layer tfgpt2lm_head_model is not connected, no input to return."

import keras_nlp
from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

model_name = 'Somewhere/Some_model'

# Load the Hugging Face tokenizer and TF model.
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = TFGPT2LMHeadModel.from_pretrained(model_name)

# Build the preprocessor around the Hugging Face tokenizer.
preprocessor = keras_nlp.models.GPT2CausalLMPreprocessor(
    tokenizer,
    sequence_length=128,
)

# Try to build the causal LM with the Hugging Face model as the backbone.
lora_model = keras_nlp.models.GPT2CausalLM(
    model,
    preprocessor=preprocessor,
)

Error:

All PyTorch model weights were used when initializing TFGPT2LMHeadModel.

All the weights of TFGPT2LMHeadModel were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-18-c24466b7f454> in <cell line: 15>()
     13 )
     14 
---> 15 lora_model = keras_nlp.models.GPT2CausalLM(
     16     backbone = backbone_model,
     17     preprocessor=preprocessor,

1 frames
/usr/local/lib/python3.10/dist-packages/keras_nlp/src/models/gpt2/gpt2_causal_lm.py in __init__(self, backbone, preprocessor, **kwargs)
    156         **kwargs,
    157     ):
--> 158         inputs = backbone.input
    159         hidden_states = backbone(inputs)
    160         outputs = backbone.token_embedding(hidden_states, reverse=True)

/usr/local/lib/python3.10/dist-packages/keras/src/engine/base_layer.py in input(self)
   2072         """
   2073         if not self._inbound_nodes:
-> 2074             raise AttributeError(
   2075                 "Layer " + self.name + " is not connected, no input to return."
   2076             )

AttributeError: Layer tfgpt2lm_head_model is not connected, no input to return.

My next idea was to get rid of lora_model altogether, but the code that comes next in the tutorial uses it to define the LoRA layers, and I haven't been able to adapt that part yet.
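
For reference, the tutorial step I'm trying to adapt looks roughly like this (paraphrased from the tutorial, not a verbatim copy; LoraLayer, RANK and ALPHA are defined earlier in the tutorial, and the private attribute names such as _self_attention_layer and _query_dense are how I remember them, so they may not be exact):

# Paraphrase of the tutorial's LoRA-injection loop: for every transformer
# block of the backbone, wrap the query and value projections in a LoRA layer.
# LoraLayer, RANK and ALPHA come from earlier in the tutorial.
for layer_idx in range(lora_model.backbone.num_layers):
    decoder_layer = lora_model.backbone.get_layer(f"transformer_layer_{layer_idx}")
    self_attention_layer = decoder_layer._self_attention_layer

    # Allow mutating the tracked sub-layers of the attention layer.
    self_attention_layer._tracker.locked = False

    # Replace the query projection with a LoRA-wrapped version.
    self_attention_layer._query_dense = LoraLayer(
        self_attention_layer._query_dense,
        rank=RANK,
        alpha=ALPHA,
        trainable=True,
    )

    # Replace the value projection with a LoRA-wrapped version.
    self_attention_layer._value_dense = LoraLayer(
        self_attention_layer._value_dense,
        rank=RANK,
        alpha=ALPHA,
        trainable=True,
    )

Everything in this loop hangs off lora_model, which is exactly the object I can't construct from the Hugging Face model.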

I'm a beginner, so this could easily be a trivial problem, but I have no idea how to solve it; any input would be really appreciated.
