Pandas read excel and flag rows with striketrough

393 Views Asked by At

I have to work on an Excel sheet with python so I can later put it in a database.

My problem is, that I have quite a lot of rows/cells with striketrough. I want to read the excel normally and flag every row with one or more striketrough cells. I just found a solution for skipping these rows but I want to flag it.

I did find a way to print the strikethrough rows using openpyxl but I wanted to use pandas because I have to add and cut a lot of columns later on and pandas is best for that.

1

There are 1 best solutions below

6
Timeless On BEST ANSWER

It's hard to imagine how looks like your spreadsheet but here is an example to give you the general logic. Basically, we use Font.strikethrough from to check if a the font of a cell is striked or not.

from openpyxl import load_workbook

ws = load_workbook("file.xlsx")["Sheet1"]

data = [[cell for cell in row] for row in ws.iter_rows(values_only=True)]

df = pd.DataFrame(data[1:], columns=data[0])

df["Flag"] = [
    any(cell.font.strikethrough for cell in row) for row in ws.iter_rows(min_row=2)
]

Output :

print(df)

  col1  col2   col3   Flag
0  foo     1  test1   True
1  bar     2  test2  False
2  qux     3  test3   True
3  baz     4  test4  False

Input used :

enter image description here