How do I convert the json into a dataframe and ignore the top info?

Question

How do I convert the json into a dataframe and ignore the top info?

973 Views Asked by Sinead Hayes At 16 June 2025 at 11:56

I'm trying to convert some Strava activity data into a df to do some analysis.

The typical top of json looks like this in all files:

{
    "frame_type": "data_message",
    "name": "record",
    "header": {
        "local_mesg_num": 3,
        "time_offset": null,
        "is_developer_data": false
    },
    "fields": [
        {
            "name": "xx",
            "value": "xx",
            "units": "xx",
            "def_num": xx,
            "raw_value": xx
        }
    ]
}

But there is a load of information above it which I am not sure how to filter out.

I'm trying to do df = pandas.json_normalize(data) but it returns this:

Essentially I want the table output to be:

timestamp	position_lat	position_long	distance	time_from_course	etc
first	row	data	more	columns too

etc.

I am relatively new to this all so apologies for the noob question...

Original Q&A

There are 1 best solutions below

**errorLogger** · Answer 1

I don't think you can accomplish what you need by directly calling a pandas function. You may have to reformat your data beforehand. The JSON module can help you manipulate your original data, treated as native Python lists and dicts, to define and populate a second "cleaned up" dictionary that you can then pass in pandas.DataFrame.

Starting from the string representation of the JSON file s, you can convert it into a list of dictionaries.

contents = json.loads(s)

Your data3 variable seems to be in this format already. It is a list of dictionaries.

If I understand correctly, you want your dataframe columns to correspond to the 'name' fields, and the rows to be populated with the 'value' fields. data3 contains many dictionaries, so each will correspond to a row. To develop working code for you, I created a silly example that I hope emulates what you have well.

s = """
[
    {
    "frame_type": "data_message",
    "name": "record",
    "header": {
        "local_mesg_num": 3,
        "time_offset": null,
        "is_developer_data": false
    },
    "fields": [
        {
            "name": "x1",
            "value": "xx",
            "units": "xx",
            "def_num": 1,
            "raw_value": 1
        }, 
        {
            "name": "x2",
            "value": "xx",
            "units": "xx",
            "def_num": 2,
            "raw_value": 2
        },
        {
            "name": "x3",
            "value": "xx",
            "units": "xx",
            "def_num": 3,
            "raw_value": 3
        }
    ]
},
   {
    "frame_type": "data_message",
    "name": "record",
    "header": {
        "local_mesg_num": 3,
        "time_offset": null,
        "is_developer_data": false
    },
    "fields": [
        {
            "name": "x1",
            "value": "xx",
            "units": "xx",
            "def_num": 1,
            "raw_value": 1
        }, 
        {
            "name": "x2",
            "value": "xx",
            "units": "xx",
            "def_num": 2,
            "raw_value": 2
        },
        {
            "name": "x3",
            "value": "xx",
            "units": "xx",
            "def_num": 3,
            "raw_value": 3
        }
    ]
}
,   {
    "frame_type": "data_message",
    "name": "record",
    "header": {
        "local_mesg_num": 3,
        "time_offset": null,
        "is_developer_data": false
    },
    "fields": [
        {
            "name": "x1",
            "value": "xx",
            "units": "xx",
            "def_num": 1,
            "raw_value": 1
        }, 
        {
            "name": "x2",
            "value": "xx",
            "units": "xx",
            "def_num": 2,
            "raw_value": 2
        },
        {
            "name": "x3",
            "value": "xx",
            "units": "xx",
            "def_num": 3,
            "raw_value": 3
        }
    ]
}]
 """

data3 = json.loads(s)  # emulating your data3 variable

Instantiate an empty dict and start filling in the things you want. This dict will be passed to pd.DataFrame().

reformatted_data = {}  # instantiate

First, use the 1st dictionary to define the columns of the dataframe.

for thing in data3[0]:
    for field in thing["fields"]:
        reformatted_data[field["name"]] = []  # instantiate columns

Then populate the rows

for thing in data3:
    for field in thing["fields"]:
        reformatted_data[field["name"]].append(field["value"])

Finally, obtain the dataframe

df = pd.DataFrame(reformatted_data)

    x1  x2  x3
0   xx  xx  xx
1   xx  xx  xx
2   xx  xx  xx

How do I convert the json into a dataframe and ignore the top info?

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in JSON

Related Questions in DATAFRAME

Related Questions in STRAVA

Trending Questions

Popular # Hahtags

Popular Questions