Hugging Face: unable to run certain code in PySpark


I can run this code in plain Python, but I get the following error when I run it in a Spark UDF:

PythonException: 'ImportError: cannot import name 'CommitOperationAdd' from 'huggingface_hub' (/databricks/python/lib/python3.8/site-packages/huggingface_hub/__init__.py)'.

Can the tuner007/pegasus_qa model be used in a Spark UDF?

This is the code:

import pandas as pd
import torch
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

model_name = 'tuner007/pegasus_qa'
torch_device = 'cuda' if torch.cuda.is_available() else 'cpu'

tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name).to(torch_device)

def get_answer(df):
    # Each call receives the rows of one 'id' group as a pandas DataFrame.
    question = df['question'].iloc[0]
    # Read the passage from the 'context' column of the sample data.
    context = df['context'].iloc[0]
    model_inputs = tokenizer(question, context, truncation=True, padding='longest', return_tensors="pt").to(torch_device)
    translated = model.generate(**model_inputs, max_new_tokens=100)
    tgt_text = tokenizer.batch_decode(translated, skip_special_tokens=True)
    df['answer'] = tgt_text[0]
    df['error'] = ''
    return df

data = {
    'id': [1, 2, 3],
    'context': [
        'DeepSet DeBERTa is a powerful transformer-based model.',
        'It is trained on the SQuAD 2.0 dataset.',
        'Apple is good for health.'
    ],
    'question': [
        'What is DeepSet DeBERTa?',
        'What dataset is DeBERTa trained on?',
        'What is fruit name?'
    ]
}

pandas_df = pd.DataFrame(data)
df1 = spark.createDataFrame(pandas_df)

# Output schema: the input columns plus 'answer' and 'error' (assumed; adjust to your data).
schema = 'id long, context string, question string, answer string, error string'

peagasus_model_output_df = (
    df1
    .groupby(['id'])
    .applyInPandas(get_answer, schema=schema)
)

transformers version: 4.30.2

huggingface-hub version: 0.15.1
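
Since the versions above were read in the notebook, it may be worth confirming that the Python worker running the UDF sees the same packages. The following is a minimal diagnostic sketch, not a confirmed fix: `describe_package` is a hypothetical helper name, and it assumes only the standard library (`importlib.metadata`, available from Python 3.8). It reports which interpreter is running, which version of a distribution is installed there, where the module is imported from, and whether a given name such as `CommitOperationAdd` is exposed.

```python
import importlib
import importlib.metadata as md
import sys


def describe_package(dist_name, module_name=None, attr=None):
    """Report version and import location of a package in this interpreter.

    dist_name   -- distribution name as pip knows it, e.g. "huggingface-hub"
    module_name -- importable module name, e.g. "huggingface_hub"
    attr        -- optional attribute to look for, e.g. "CommitOperationAdd"
    (describe_package is a hypothetical helper, not part of any library.)
    """
    info = {
        "python": sys.executable,  # interpreter that executed this call
        "dist": dist_name,
        "version": None,
        "file": None,
        "has_attr": None,
    }
    try:
        info["version"] = md.version(dist_name)
    except md.PackageNotFoundError:
        return info  # distribution is not installed for this interpreter

    try:
        module = importlib.import_module(module_name or dist_name.replace("-", "_"))
    except ImportError:
        return info  # metadata exists but the module cannot be imported

    info["file"] = getattr(module, "__file__", None)
    if attr is not None:
        try:
            getattr(module, attr)
            info["has_attr"] = True
        except Exception:
            # Covers a plain missing attribute as well as a failing lazy import.
            info["has_attr"] = False
    return info


if __name__ == "__main__":
    print(describe_package("huggingface-hub", "huggingface_hub", "CommitOperationAdd"))
```

Calling the helper once in the notebook and once inside the function passed to `applyInPandas` (for example, writing `str(describe_package(...))` into the `error` column) would show whether the two environments agree. If the worker reports an older `huggingface-hub` than 0.15.1, or a different file path, a version mismatch between the notebook and the cluster-level install would be one plausible explanation for the `ImportError`.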

