How to use GPT2 as a Question-Answering System (What to put in context?)


I have tried to implement GPT2 as a question-answering system with PyTorch. I copied the example code for this from the Transformers documentation into a separate Python file and ran it. The code works; however, the answer I get to my question is simply the context I provided. The documentation of the pipeline() function I use here says that "context" is required for the call to run. So essentially I have to give the system the answer beforehand in order for it to give that answer back to me, which is rather useless. I had hoped there was a way to use what the model has learnt from the dataset(s) it was pre-trained on to generate an answer to my question, but I could not find any version of question answering with GPT2 that did not rely on an explicitly supplied context. Is there any way to generate an answer from the question alone, without giving the model the answer beforehand?
Or should I just use, e.g., the entirety of Wikipedia as context? (A rough sketch of what I mean by that is at the end of this post.)
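
For clarity, what I am after behaves more like plain text generation: the model answers from what it picked up during pre-training, with no context passage at all. Something like the following is what I imagine, using the text-generation pipeline instead (I am not sure this is a sensible way to use GPT2 for question answering):

from transformers import pipeline

# Plain text generation: GPT2 just continues a QA-style prompt from its
# pre-training knowledge alone; no context passage is supplied
generator = pipeline("text-generation", model="openai-community/gpt2-large")

prompt = "Question: Who was Jim Henson?\nAnswer:"
result = generator(prompt, max_new_tokens=20, do_sample=False)
print(result[0]["generated_text"])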

If it helps, the code I am currently using:

from transformers import AutoTokenizer, GPT2ForQuestionAnswering, pipeline
import torch

tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2-large")
model = GPT2ForQuestionAnswering.from_pretrained("openai-community/gpt2-large")

question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

inputs = tokenizer(question, text, return_tensors="pt")

# Extractive QA: predict start/end indices of the answer span
# within the tokenized (question, context) pair
with torch.no_grad():
    outputs = model(**inputs)

answer_start_index = outputs.start_logits.argmax()
answer_end_index = outputs.end_logits.argmax()
predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]

# Training loss against hand-picked target positions; the target span is "nice puppet"
target_start_index = torch.tensor([14])
target_end_index = torch.tensor([15])
outputs = model(**inputs, start_positions=target_start_index, end_positions=target_end_index)
loss = outputs.loss

# The same model wrapped in the high-level question-answering pipeline
question_answerer = pipeline("question-answering", model=model, tokenizer=tokenizer)
result = question_answerer(question=question, context=text)
print(result)

and this is the output I get:

Some weights of GPT2ForQuestionAnswering were not initialized from the model checkpoint at openai-community/gpt2-large and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference. 
{'score': 0.05718426778912544, 'start': 0, 'end': 28, 'answer': 'Jim Henson was a nice puppet'}
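
As for the Wikipedia idea mentioned at the top: the rough sketch I have in mind is something like this, reusing the question_answerer pipeline from my script above and pulling the context from the third-party wikipedia package (I have no idea whether this approach scales beyond toy questions):

import wikipedia  # third-party package: pip install wikipedia

# Fetch a (hopefully) relevant passage and use it as the context,
# instead of writing the answer into the context myself
wiki_context = wikipedia.summary("Jim Henson")

result = question_answerer(question="Who was Jim Henson?", context=wiki_context)
print(result)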