XLSX file extraction in Palantir Foundry

116 Views Asked by Murali krishna Ps At 18 August 2025 at 05:07

Below is the code I'm using to try to extract an XLSX file. Please let me know if there are any other methods for extracting XLSX files in Palantir Foundry Code Repositories.

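import openpyxl


def compute(source_df, output_df, ctx):
    # Expect exactly one .xlsx file in the input dataset.
    filestatus = list(source_df.filesystem().ls(glob='**/*.xlsx'))
    assert len(filestatus) == 1
    latest_file = filestatus[0]
    print(latest_file)
    with source_df.filesystem().open(latest_file.path, 'rb') as f:
        wb = openpyxl.load_workbook(f, read_only=True)
        ws = wb['Sheet1']
        # openpyxl worksheets have no row_values() and ws.rows cannot be sliced;
        # iter_rows(values_only=True) yields one tuple of cell values per row.
        all_rows = ws.iter_rows(values_only=True)
        headers = next(all_rows)
        # Build one dict per data row (the original appended only once, after the loop).
        rows = [dict(zip(headers, row)) for row in all_rows]
    # `schema` is not defined in this snippet; it must be defined elsewhere.
    df = ctx.spark_session.createDataFrame(rows, schema)
    output_df.write_dataframe(df)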

I'm getting the error below. How can I resolve it?

[module version: 1.913.0]

zipfile.BadZipFile: File is not a zip file


There is 1 best solution below

NicPWNs On 06 December 2023 at 17:55

First, I'd recommend taking a look at the following YouTube video published by Palantir. This may be a better method using xlrd:

Code Repositories | How to Parse Excel Files into a Usable Dataset in Palantir Foundry

Regarding the error from openpyxl in your existing code: an XLSX file is a ZIP archive, so zipfile.BadZipFile means the bytes openpyxl received are not a valid ZIP. That points to something wrong with the XLSX file itself. It could be empty, password-protected, or corrupted in some way. Open it in Microsoft Excel to check that the spreadsheet looks normal.
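If opening the file in Excel isn't convenient, a rough check on the raw bytes can narrow down which of these cases applies. The sketch below uses only the Python standard library. The function name diagnose_xlsx_bytes and its return labels are made up for illustration, and the "ole2" label only means the file starts with the OLE2 compound-file signature, which is what legacy .xls files and encrypted workbooks typically look like.

```python
import io
import zipfile

# Signature of an OLE2 compound file (legacy .xls, or an encrypted workbook).
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def diagnose_xlsx_bytes(data: bytes) -> str:
    """Roughly classify raw file bytes (hypothetical helper, not a Foundry API)."""
    if not data:
        return "empty"
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    # A real .xlsx is a ZIP archive, so this should succeed for a healthy file.
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "zip"
    return "unknown"
```

In the transform from the question, this could be called on f.read() inside the existing with block, printing the result before handing the file to openpyxl. Anything other than "zip" would explain the BadZipFile error.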
