First, I downloaded the CLIP checkpoint from https://huggingface.co/openai/clip-vit-large-patch14/tree/main and changed the path to /home/yuxiang/GLIGEN/pretrains/clip-vit-large-patch14. The code is as follows:

    model = CLIPModel.from_pretrained(version).cuda()
    processor = CLIPProcessor.from_pretrained(version)

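
A quick sanity check for this step, as a minimal sketch: before calling `from_pretrained` on a local directory, it can help to confirm that the download is complete. The helper name `missing_checkpoint_files` is hypothetical, and the file lists are an assumption based on the files published in the Hugging Face repository linked above; tokenizer files are deliberately left out of the check.

```python
from pathlib import Path

# Assumption: file names taken from the openai/clip-vit-large-patch14 repository listing.
REQUIRED_FILES = ("config.json", "preprocessor_config.json")
# Either weights file should be enough for the PyTorch model.
WEIGHT_FILES = ("model.safetensors", "pytorch_model.bin")


def missing_checkpoint_files(checkpoint_dir):
    """Return the names of expected files that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    if not any((root / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    return missing
```

Calling `missing_checkpoint_files("/home/yuxiang/GLIGEN/pretrains/clip-vit-large-patch14")` returns an empty list if nothing in this (assumed) list is missing.
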
Then, I downloaded the GLIGEN "Box+Text+Image" checkpoint and changed the path to `ckpt="/home/yuxiang/raida/gligen/checkpoint_generation_text_image.bin"`.

Finally, I ran the code:

    if meta['ckpt'] == "/home/yuxiang/raida/gligen/checkpoint_generation_text_image.bin":
        run(meta, args, starting_noise)

When execution reaches this loop:

    for phrase, image in zip(phrases,images):
        image_features.append( get_clip_feature(model, processor, image, is_image=True) )
        text_features.append( get_clip_feature(model, processor, phrase, is_image=False) )

it calls `get_clip_feature` (only the image branch is shown here):

    def get_clip_feature(model, processor, input, is_image=False):
        which_layer_text = 'before'
        which_layer_image = 'after_reproject'
        if is_image:
            if input == None:
                return None
            image = Image.open(input).convert("RGB")
            inputs = processor(images=[image], return_tensors="pt", padding=True)
            inputs['pixel_values'] = inputs['pixel_values'].cuda() # we use our own preprocessing without center_crop
            inputs['input_ids'] = torch.tensor([[0,1,2,3]]).cuda() # placeholder
            outputs = model(**inputs)
            feature = outputs.image_embeds
            if which_layer_image == 'after_reproject':
                feature = project( feature, torch.load('projection_matrix').cuda().T ).squeeze(0)
                feature = ( feature / feature.norm() ) * 28.7
                feature = feature.unsqueeze(0)

At the line `outputs = model(**inputs)`, execution blocks and no error is shown. I tried to debug it, but it only shows this: [screenshot]. I don't know what I did wrong.
How can I get this code to run?
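
One way to see where a call like this is stuck, as a minimal sketch: Python's standard `faulthandler` module can dump the stack of every thread if a call has not returned after a timeout. The helper name `run_with_watchdog` and the timeout value are hypothetical choices for illustration and are not part of GLIGEN or Transformers.

```python
import faulthandler
import sys


def run_with_watchdog(fn, timeout_s=60, file=None):
    """Call fn(); if it is still running after timeout_s seconds, dump all thread stacks.

    The dump goes to `file` (stderr by default). The call itself is not interrupted.
    """
    target = file if file is not None else sys.stderr
    # Arm a watchdog that writes the tracebacks once the timeout expires.
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=target)
    try:
        return fn()
    finally:
        # Disarm the watchdog whether fn() returned or raised.
        faulthandler.cancel_dump_traceback_later()


# Hypothetical usage inside get_clip_feature:
# outputs = run_with_watchdog(lambda: model(**inputs), timeout_s=60)
```

If the forward pass really hangs, the dump shows the Python frame it is blocked in, which is more informative than a silent stall.
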