Binary format that can store multiple pandas DataFrames with different columns, widths, and row counts

88 Views Asked by Abdulrahman Sheikho At 17 August 2025 at 14:25

I have about 200 pandas DataFrames, and every DataFrame has some unique columns, or maybe completely different columns. Example:

import pandas as pd

df1 = pd.DataFrame({
    'Product': ['Apple', 'Banana', 'Orange', 'Mango'],
    'Quantity': [10, 15, 12, 8],
    'Price': [2.5, 1.5, 2, 3],
    'Category': ['Fruit', 'Fruit', 'Fruit', 'Fruit']
})
df2 = pd.DataFrame({
    'Student Name': ['John', 'Emma', 'Lisa', 'Tom'],
    'Age': [18, 17, 19, 18],
    'Grade': ['A', 'B', 'A', 'B'],
    'City': ['New York', 'London', 'Paris', 'Sydney']
})
df3 = pd.DataFrame({
    'Date': ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04'],
    'Company': ['AAPL', 'GOOG', 'AMZN', 'MSFT'],
    'Price': [132.69, 1760.33, 3187.50, 215.41]
})
# and many more

I thought I could simply switch to Parquet and put everything in one folder, but it turns out that this doesn't work if the Parquet files have different schemas (I haven't actually tried it, so I may be wrong).

I have already read this post: Storing multiple dataframes of different widths with Parquet?

So what are some formats that allow storing multiple DataFrames in one file, other than Excel?

Note: I'm looking into to_orc() and the ORC format, but I don't know whether I can merge different schemas and drop the NA values.

Note 2: this may not be an answerable question, but sharing related topics and links would help too.


There is 1 best solution below

Corralien On 04 November 2023 at 07:14

So what are some formats that allow storing multiple DataFrames in one file, other than Excel?

You can use HDF5. Install PyTables first with pip install tables:

with pd.HDFStore('dataframes.hdf') as store:
    df1.to_hdf(store, key='df1')
    df2.to_hdf(store, key='df2')
    df3.to_hdf(store, key='df3')

Check:

>>> store = pd.HDFStore('dataframes.hdf')
>>> store.keys()
['/df1', '/df2', '/df3']

>>> print(store.info())
<class 'pandas.io.pytables.HDFStore'>
File path: dataframes.hdf
/df1            frame        (shape->[4,4])
/df2            frame        (shape->[4,4])
/df3            frame        (shape->[1,3])
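
To get the frames back out, something along these lines should work. This is only a minimal sketch: the two small frames, the key names `products` and `students`, and the temporary file path are made up for illustration, and it assumes PyTables is installed as described above. Each DataFrame is written under its own key, so the schemas never have to match, and everything is then read back into a dict keyed by name.

```python
import os
import tempfile

import pandas as pd

# Two small frames with unrelated schemas (hypothetical stand-ins for df1, df2, ...).
frames = {
    "products": pd.DataFrame({"Product": ["Apple", "Banana"], "Price": [2.5, 1.5]}),
    "students": pd.DataFrame(
        {"Student Name": ["John", "Emma"], "Age": [18, 17], "Grade": ["A", "B"]}
    ),
}

# Illustrative location only; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "dataframes.h5")

# Write every frame under its own key; mode="w" starts from an empty file.
with pd.HDFStore(path, mode="w") as store:
    for name, frame in frames.items():
        store.put(name, frame)

# Read everything back; keys are returned with a leading "/", which is stripped here.
with pd.HDFStore(path, mode="r") as store:
    loaded = {key.lstrip("/"): store[key] for key in store.keys()}

print(sorted(loaded))
```

Using the store as a context manager closes the file in both steps, which avoids leaving a handle open as in the interactive check above. pandas may emit a PerformanceWarning for string columns in the default fixed format; that does not prevent the round trip.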
