Generating a response from flan-t5 based on a prompt with large context


In the code below I am trying to use an LLM, the flan-t5 model from Hugging Face, to answer a final question based on additional context I've added to the prompt. I'm asking it to return the titles of the relevant emails sent to the job seeker, based on the summaries included in the prompt. However, what the LLM generates seems to be just a part of the prompt, or maybe a very basic summarization of it.

Is there anything wrong with the structure of the prompt or with my code below? Is it just that flan-t5 is trained for summarization, and maybe I need to give it some few-shot examples or fine-tune it? I'm surprised to be getting such a short, simple response from flan-t5.

code:

from datasets import load_dataset
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, GenerationConfig, TrainingArguments, Trainer
import torch
import time
import evaluate
import pandas as pd
import numpy as np
import chromadb
from chromadb.config import Settings

# need huggingface apikey
from config import api_key

apikey=api_key


# loading pretrained model 

# set quantization configuration to load large model with less GPU memory
# this requires the `bitsandbytes` library

from torch import cuda, bfloat16
import transformers

device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'


bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=bfloat16
)



# single model identifier, reused for the config, the model and the tokenizer
model_id = 'google/flan-t5-base'

hf_auth = apikey
model_config = transformers.AutoConfig.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)

original_model = AutoModelForSeq2SeqLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    config=model_config,
    # quantization_config=bnb_config,
    device_map='auto',
    use_auth_token=hf_auth,
    cache_dir='/home/scotsditch/stuff/scotsditch_storage/LLM/weights/huggingface/hub/',
    torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(model_id)


# LLM_prompt is built earlier in my script (not shown here)
print(LLM_prompt)

output:

These are summaries of emails that were sent to job seeker Bob Newhart: 

 An email with title: W2 Contract //Data Analyst // Remote (Only PST Candidate ) was sent to job seeker Bob Newhart on Tuesday, August 22, 2023 at 11:40 AM PDT.  It was for the position of Data Analyst.  It's location was Remote ( West Coast).  The employment type was Contract.  It had the required skills: SQL, Azure, Power BI, DataBricks, Elicit Requirements, Analytics, Reporting, healthcare, TSQL, Power BI, Data Visualization, Synapse, NLP, R, Python, AI.

 An email with title: Lead Data Scientist - O'Fallon, MO (Hybrid) was sent to job seeker Bob Newhart on Tuesday, August 22, 2023 at 07:16 AM PDT.  It was for the position of Lead Data Scientist.  It's location was O'Fallon, MO (Hybrid).  The employment type was contract.  It had the required skills: Masters or PhD in mathematics, statistics, computer science, or related fields, lead large data science projects, research, communication skills, predictive, batch, streaming, python, R, hadoop, spark, MySQL, anomaly detection, supervised learning, unsupervised learning, time-series, natural language processing, Numpy, SciPy, Pandas, Scikit-learn, Tensorflow, Keras, NLTK, Gensim, BERT, NetworkX, organized, self motivated, data visualization.

What are the titles of all emails for Data Scientist positions sent to job seeker Bob Newhart.
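For reference, the layout of the prompt above (a header, one paragraph per email summary, then the final question) can be sketched roughly as follows. This is only an illustration of the structure, not the code I actually use: `build_prompt` and its arguments are hypothetical names, and the two summaries are shortened placeholders.

```python
# Hypothetical helper, shown only to illustrate the prompt layout.
def build_prompt(seeker, summaries, question):
    """Join a header, one paragraph per email summary, and the final question."""
    header = f"These are summaries of emails that were sent to job seeker {seeker}:"
    parts = [header, *summaries, question]
    # Blank lines between the parts mirror the printed prompt.
    return "\n\n".join(parts)


# Placeholder summaries (assumed, much shorter than the real ones).
summaries = [
    "An email with title: Example Title A was sent to job seeker Bob Newhart.",
    "An email with title: Example Title B was sent to job seeker Bob Newhart.",
]
LLM_prompt = build_prompt(
    "Bob Newhart",
    summaries,
    "What are the titles of all emails for Data Scientist positions "
    "sent to job seeker Bob Newhart?",
)
print(LLM_prompt)
```

The question always comes last, after every summary, so the longer the list of summaries gets, the further the question sits from the start of the input.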

code:

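# generating final response from LLM

inputs = tokenizer(LLM_prompt, return_tensors='pt')
# move the input ids to the device the model was loaded on
# (works on CPU as well, unlike a hard-coded .cuda())
input_ids = inputs["input_ids"].to(original_model.device)
generated = original_model.generate(input_ids, max_new_tokens=50)
output = tokenizer.decode(generated[0], skip_special_tokens=True)

print(output)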

output:

Emails for Data Scientist positions sent to job seeker Bob Newhart:
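One thing that might be worth ruling out, given that the context is large, is whether the prompt is longer than the input length the model expects. I have not confirmed that this is the cause. Below is a minimal sketch of such a check: `token_report` is a hypothetical helper, the default `max_len=512` is an assumption (the value to trust is the tokenizer's `model_max_length`), and the whitespace split is only a stand-in so the sketch runs without downloading a model.

```python
def token_report(prompt, encode, max_len=512):
    """Return (n_tokens, n_over), where n_over counts tokens beyond max_len."""
    n_tokens = len(encode(prompt))
    return n_tokens, max(0, n_tokens - max_len)


# Stand-in encoder: whitespace splitting undercounts compared with
# a real subword tokenizer, so treat the numbers as illustrative only.
n_tokens, n_over = token_report("a b c d e", str.split, max_len=3)
print(n_tokens, n_over)
```

With the objects from my code, the real count would come from something like `token_report(LLM_prompt, lambda s: tokenizer(s)["input_ids"], tokenizer.model_max_length)`.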
