Pandas json_normalize nullable parent keys

71 Views Asked by Tyn At 27 May 2023 at 22:03

I am ingesting a JSON API. A simplified version as follows:

json_data = [
    {"id": 1, "owner": None}, 
    {"id": 2, "owner": {"email": "[email protected]"}}
]

df = pd.json_normalize(json_data, sep="_")
print(df)

   id  owner  owner_email
0   1    NaN          NaN
1   2    NaN  [email protected]

In API response json_data, the field "owner" can be an object or null. When I create a pandas dataframe with json_normalize, it creates 2 columns for owner where it should have been just one owner_email column.

The simpliest solutions I have is:

To replace null owner records with dummy records before I create dataframe with json_normalize.

"owner": null --> "owner": { "email": "" }
Simply drop the useless "owner" column from the dataframe afterwards.

I was wondering if there is a more elegant solution, e.g. telling Pandas in json_normalize parameters that it shouldn't create another column in this kind of situation. As far as I read the Pandas docs, I couldn't find such option.

Any suggesions would be much appreciated.

Environment

Debian 11
Python 3.9.2
Pandas 1.5.3

Original Q&A

Pandas json_normalize nullable parent keys

Environment

There are 0 best solutions below

Related Questions in PYTHON

Related Questions in JSON

Related Questions in PANDAS

Related Questions in JSON-NORMALIZE

Trending Questions

Popular # Hahtags

Popular Questions