I am trying to aggregate the values in column C into the previous row whenever columns A and B are null. The code below works fine for a few tables, but raises an error for one of them. Any suggestions?
Code:
from tabula import read_pdf
from tabulate import tabulate
import pandas as pd
import numpy as np
# guess=False, pages=page,stream=True
Page_No = 45
tables = read_pdf('/content/210812154731_DECK CRANE - MACGREGOR HAGGLUND - TG 5795-36.5 130-2 - N00024-93-C-2220 - PARTS MANUAL.pdf', pages=Page_No)
data_df = pd.DataFrame(tables[0])
# data_df['Dump'] = data_df.iloc[:,0:1]
# Forward-fill missing values, then join the Description fragments within each group
out = data_df.ffill().groupby(['Item', 'SMR', 'CAGE', 'Part Number', 'Qty'], as_index=False)['Description'].agg(' '.join)
DATA:
A      B       C
12525  1FWE23  1H654D
               14654
24798  14654   S56E82
65116  63546   38945
46456  46485   R68R45
               AD545
A5D66  45346   QA6683
EXPECTED:
A      B       C
12525  1FWE23  1H654D 14654
24798  14654   S56E82
65116  63546   38945
46456  46485   R68R45 AD545
A5D66  45346   QA6683
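
For reference, here is a minimal sketch of the transformation I am after, using the sample data above instead of the PDF. The frame is typed in by hand, and the names df, group_id and out are only for illustration. It assumes that a row continues the previous record when both A and B are null. Values in C are cast to str before joining, since ' '.join raises a TypeError on non-string values; I do not know whether that is what fails on the problem table.

```python
import numpy as np
import pandas as pd

# Sample data from the question, typed in by hand (not read from the PDF)
df = pd.DataFrame({
    "A": ["12525", np.nan, "24798", "65116", "46456", np.nan, "A5D66"],
    "B": ["1FWE23", np.nan, "14654", "63546", "46485", np.nan, "45346"],
    "C": ["1H654D", "14654", "S56E82", "38945", "R68R45", "AD545", "QA6683"],
})

# A row starts a new record when A or B is present;
# rows where both are null inherit the id of the record above them
group_id = (df["A"].notna() | df["B"].notna()).cumsum()

out = (
    df.groupby(group_id, sort=False)
      .agg({
          "A": "first",  # first non-null value in the group
          "B": "first",
          # drop missing fragments and cast to str so the join cannot fail
          "C": lambda s: " ".join(s.dropna().astype(str)),
      })
      .reset_index(drop=True)
)

print(out)
```

Building the group id from the null pattern, rather than forward-filling and grouping on the key columns, keeps two records with identical keys from being merged by accident.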