Convert Pandas DataFrame to nested JSON, keeping 2 columns

273 Views Asked by K.Liang At 09 August 2022 at 10:06

I have a DF with the following columns and data:


I would like to convert it to two columns, studentid and info, where info holds a nested JSON string keyed by course, in the following format.


The dataset is:

studentid   course  teacher grade   rank
1   math    A   91  1
1   history B   79  2
2   math    A   88  2
2   history B   83  1
3   math    A   85  3
3   history B   76  3

and the desired output is:

studentid   info
1   "{""math"":[{""teacher"":""A"",""grade"":91,""rank"":1}],
""history"":[{""teacher"":""B"",""grade"":79,""rank"":2}]}"
2   "{""math"":[{""teacher"":""A"",""grade"":88,""rank"":2}],
""history"":[{""teacher"":""B"",""grade"":83,""rank"":1}]}"
3   "{""math"":[{""teacher"":""A"",""grade"":85,""rank"":3}],
""history"":[{""teacher"":""B"",""grade"":76,""rank"":3}]}"


There are 2 best solutions below

Celius Stingher On 09 August 2022 at 10:13

You don't really need groupby(), and the individual sub-dictionaries don't need to be wrapped in a list; they can be stored directly as the values of the nested dict. After setting the columns you want as the index, df.to_dict() gives you a dictionary keyed by (studentid, course) tuples, which is close to the desired output:

df = df.set_index(['studentid','course'])

df.to_dict(orient='index')

Outputs:

{(1, 'math'): {'teacher': 'A', 'grade': 91, 'rank': 1},
 (1, 'history'): {'teacher': 'B', 'grade': 79, 'rank': 2},
 (2, 'math'): {'teacher': 'A', 'grade': 88, 'rank': 2},
 (2, 'history'): {'teacher': 'B', 'grade': 83, 'rank': 1},
 (3, 'math'): {'teacher': 'A', 'grade': 85, 'rank': 3},
 (3, 'history'): {'teacher': 'B', 'grade': 76, 'rank': 3}}
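
The dictionary above is keyed by (studentid, course) tuples, so one more step is needed to nest the courses under each student. A minimal sketch of that step follows; it is not part of the original answer, the names flat and nested are illustrative, and the sample frame is rebuilt from the question's data. It assumes every (studentid, course) pair is unique, since to_dict(orient='index') raises a ValueError on a non-unique index.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "studentid": [1, 1, 2, 2, 3, 3],
    "course": ["math", "history", "math", "history", "math", "history"],
    "teacher": ["A", "B", "A", "B", "A", "B"],
    "grade": [91, 79, 88, 83, 85, 76],
    "rank": [1, 2, 2, 1, 3, 3],
})

# Flat dict keyed by (studentid, course) tuples, as in the answer above.
flat = df.set_index(["studentid", "course"]).to_dict(orient="index")

# Regroup: one inner dict per student, keyed by course.
nested = {}
for (studentid, course), fields in flat.items():
    nested.setdefault(int(studentid), {})[course] = fields

print(nested[1])
```

Each value here is a plain dict rather than a one-element list, which matches the answer's suggestion but differs slightly from the list-wrapped format in the question.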

Gonçalo Peres On 25 November 2022 at 13:10

Considering that the initial dataframe is df, there are various options, depending on the exact desired output.

If one wants the info column to be a dictionary of lists, the following will do the job:

df_new = df.groupby('studentid').apply(lambda x: x.drop('studentid', axis=1).to_dict(orient='list')).reset_index(name='info')

[Out]:

   studentid                                               info
0          1  {'course': ['math', 'history'], 'teacher': ['A...
1          2  {'course': ['math', 'history'], 'teacher': ['A...
2          3  {'course': ['math', 'history'], 'teacher': ['A...

If one wants a list of dictionaries instead, do the following:

df_new = df.groupby('studentid').apply(lambda x: x.drop('studentid', axis=1).to_dict(orient='records')).reset_index(name='info')

[Out]:

   studentid                                               info
0          1  [{'course': 'math', 'teacher': 'A', 'grade': 9...
1          2  [{'course': 'math', 'teacher': 'A', 'grade': 8...
2          3  [{'course': 'math', 'teacher': 'A', 'grade': 8...
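
Neither variant keys the result by course, which is what the question's desired output shows. A minimal sketch of one way to get that exact shape follows, assuming the column names from the question; build_info is a hypothetical helper introduced here for illustration, and a plain loop over the groups is used instead of groupby().apply().

```python
import json
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "studentid": [1, 1, 2, 2, 3, 3],
    "course": ["math", "history", "math", "history", "math", "history"],
    "teacher": ["A", "B", "A", "B", "A", "B"],
    "grade": [91, 79, 88, 83, 85, 76],
    "rank": [1, 2, 2, 1, 3, 3],
})

def build_info(group):
    """Return a JSON string mapping each course to a list of records."""
    info = {}
    for row in group.itertuples(index=False):
        # int() converts NumPy integers so json.dumps can serialize them.
        info.setdefault(row.course, []).append(
            {"teacher": row.teacher, "grade": int(row.grade), "rank": int(row.rank)}
        )
    return json.dumps(info, separators=(",", ":"))

# One output row per student: the id plus the JSON string for that student.
rows = [
    {"studentid": int(sid), "info": build_info(group)}
    for sid, group in df.groupby("studentid")
]
df_new = pd.DataFrame(rows)

print(df_new.loc[0, "info"])
# {"math":[{"teacher":"A","grade":91,"rank":1}],"history":[{"teacher":"B","grade":79,"rank":2}]}
```

Writing df_new out with df_new.to_csv() would double the inner quotes, which is how the desired output in the question appears to have been rendered.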
