I have a file with sample JSON output looks as below: jsonoutput.txt file:
[{"fruit": "orange", "id":1, "countries": ["Portugal"], "color": "Orange"}
{"fruit": "apple", "id":2, "countries": ["Portugal"], "color": "red"}]
I need the output csv as (excel file):
fruit id countries color
orange 1 Portugal Orange
apple 2 Spain red
Now, I am getting as fruit id countries color orange 1 [u'Portugal'] Orange apple 2 [u'Spain'] red
How can I remove [] , u and '' from the column countries?
print (json.dumps(fruits))
--gives me in json output
This is what I have tried to convert json to xlsx:
data= tablib.Dataset(headers=('Fruit','id','Countries','Color'))
importfile = 'jsonoutput.txt'
data.json = open(importfile. 'r').read()
data_export = data.export('xlsx')
with open('output.xlsx','wb') as f:
f.write(data_export)
f.close()
you could use
pandas.io.json.json_normalize
to save it as
.xlsx
file, use:EDIT:
json_normalize
is a function that normalizes semi-structured JSON data into a flat table.I actually realised now that my code could be simplified to:
To remove
[]
fromcountries
column, I usedpandas.Series.str.join
, which ispandas
' equivalent to Python'sstr.join
.It's needed since initially
countries
column is a list containing elementscountries
column is no longer a list once you join items: