save dataframe to an existing excel sheet without overwriting other information on that sheet

168 Views Asked by At

How can I save many dataframes to different existing excel sheets without overwriting other information on those sheets and also change the date format

I tried to save first dataframe

       with pd.ExcelWriter('nameofmyexistingexcel.xlsx', mode='a', engine='openpyxl', if_sheet_exists='overlay') as writer:
       mydataframe.to_excel(writer, sheet_name="nameofexistingsheet")

(I tried other various options too from pandas doc and from this forum)

It does some interaction with excel, but just breaks excel file , so it can't be open anymore.

When I choose simple option of saving to a new excel, it works fine.

2

There are 2 best solutions below

7
moken On BEST ANSWER

sheet["A1"] would define a cell object for cell 'A1' (first cell in excel).
To use it you need to define 'sheet' first as a Openpyxl worksheet object like below. Then you can set its attributes like 'number_format'.

import pandas as pd


mydataframe = pd.DataFrame({'Headers': ['Date1', 'Date2'],
                            'DatesA': ['20.10.2023', '21/10/2023'],
                            'DatesB': ['20.10.2023', '21/10/2023']
                            })

mydataframe["DatesA"] = mydataframe["DatesA"].apply(pd.to_datetime, dayfirst=True)
print(mydataframe.dtypes)

with pd.ExcelWriter('nameofmyexistingexcel.xlsx', mode='a', engine='openpyxl', if_sheet_exists='overlay') as writer:
    mydataframe.to_excel(writer, sheet_name="nameofexistingsheet")

    sheet = writer.sheets['nameofexistingsheet'] # Openpyxl worksheet
    # sheet["A1"].number_format = 'dd.mm.yyyy'
    for cell in sheet['B']: # Loop column B 
        if cell.row == 1:  # Skip row 1 
            continue
        ### Apply number format to all cells in Column B to max used row
        cell.number_format = 'dd.mm.yyyy' 

UPDATE
I've added the 'mydataframe' dataframe creation to my code example and updated to include two date columns. Column 'DatesA' is datetime whereas the same dates in Column "DatesB' are default (strings)

       DatesA      DatesB
0  20.10.2023  20.10.2023
1  21/10/2023  21/10/2023

The data type for the columns is a follows

DatesA    datetime64[ns]
DatesB            object

Although I didn't include it in the example code all the Dates in columns B and C are formatted as cell.number_format = 'dd.mm.yyyy' using the column loop.

When the dataframe is written to Excel, the 'DatesA' Column is interpreted as dates by Excel so they take on the format of the date 'dd.mm.yyyy' whether they were that originally or using forward slash as the separator.
The 'DatesB' column is string so its recognised by Excel as a string (you can see the orientation is to the left of the cell) and the number formatting does not change the date '21/10/2023' even though its applied.

enter image description here

2
Lala On

Problem was in existing excel file - it contained merged cells in the area where dataframe had to be saved.

Currently trying to make it save dates to excel in that format: dd.mm.yyyy without seconds.