I can run this code in plain Python, but I get the following error when I run it in a Spark UDF:
PythonException: 'ImportError: cannot import name 'CommitOperationAdd' from 'huggingface_hub' (/databricks/python/lib/python3.8/site-packages/huggingface_hub/__init__.py)'.
Can the tuner007/pegasus_qa model be used in a Spark UDF?
This is the code:
import pandas as pd
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

model_name = 'tuner007/pegasus_qa'
# Use the GPU when one is available, otherwise fall back to the CPU.
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)

def get_answer(df):
    # Each call receives the rows of one 'id' group as a pandas DataFrame.
    question = df['question'].iloc[0]
    # The sample data has a 'context' column, so read the passage from there.
    context = df['context'].iloc[0]
    model_inputs = tokenizer(question, context, truncation=True, padding='longest', return_tensors="pt").to(torch_device)
    translated = model.generate(**model_inputs, max_new_tokens=100)
    tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
    df['answer'] = tgt_text[0]
    df['error'] = ''
    return df

data = {
    'id': [1, 2, 3],
    'context': [
        'DeepSet DeBERTa is a powerful transformer-based model.',
        'It is trained on the SQuAD 2.0 dataset.',
        'Apple is good for health.'
    ],
    'question': [
        'What is DeepSet DeBERTa?',
        'What dataset is DeBERTa trained on?',
        'What is fruit name?'
    ]
}
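pandas_df = pd.DataFrame(data)
df1 = spark.createDataFrame(pandas_df)

# Output schema for applyInPandas. This definition was not part of the posted
# code; it is assumed here to be the input columns plus 'answer' and 'error'.
schema = 'id long, context string, question string, answer string, error string'

peagasus_model_output_df = (
    df1
    .groupby(['id'])
    .applyInPandas(get_answer, schema=schema)
)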

transformers version: 4.30.2
huggingface-hub version: 0.15.1
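
To check whether the Python process that raises the error sees the same versions as the ones listed above, a small helper like the following may be useful. It is only a sketch: `installed_versions` is a hypothetical name, and it uses nothing beyond the standard library's `importlib.metadata` (available from Python 3.8).

```python
from importlib import metadata


def installed_versions(packages=("transformers", "huggingface-hub")):
    """Return {package: version or None} for the current Python process."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


# Versions visible to the process that runs this line (e.g. the notebook driver).
print(installed_versions())
```

Calling the same helper from a separate function passed to applyInPandas, one that does not reference the model or tokenizer, and returning `str(installed_versions())` in a string column should show what the worker processes see. If those values differ from the versions above, that would suggest the workers are importing an older huggingface_hub than the driver.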