Converting data from .csv file to .json - Python


I need to convert data from my CSV file to the format I am going to use, which is .json.

Lp.;Name;Surname;Desc;Unit;Comment
1;Jan;Makowski;Inf;km;
2;Anna;Nowak;Pts;km;Brak reakcji

As you can see, the 'Comment' column does not always have a value, and I need to keep it that way. I also need to set the number of tabs between the data items.


I have a file I am working on right now, but it shows me the data in rows like this:

[{"Lp.;Name;Surname;Desc;Unit;Comment": "1;Jan;Makowski;Inf;km;"}, {"Lp.;Name;Surname;Desc;Unit;Comment": "2;Anna;Nowak;Pts;km;Brak reakcji"...]

I am new to Python and I have no idea how to specify what I need to get.

Edit: I managed to do this:

import json
import csv

# Data declaration
fieldnames = ("Lp.", "Name", "Surname", "Desc", "Unit", "Comment")

# Open the files
with open('file.csv', 'r', encoding = "utf8") as csvfile:
    reader = csv.DictReader(csvfile) # ,fieldnames)
    rows = list(reader)

# Close the file
csvfile.close()

# Create the output file from the data
with open('file.json', 'w', encoding = "utf8") as jsonfile:
    json.dump(rows,jsonfile)
    # jsonfile.write(s.replace(';', '/t'))
# Close the file
csvfile.close()

There are 5 best solutions below

8
On BEST ANSWER

I think this answers your question. It may not be the best way to do it, but it gives you the result.

import csv
import json
with open('file.csv', 'r') as f:
    reader = csv.reader(f, delimiter=';')
    data_list = list()
    for row in reader:
        data_list.append(row)
data = [dict(zip(data_list[0],row)) for row in data_list]
data.pop(0)
s = json.dumps(data)
print (s)

output:

[{"Comment": "", "Surname": "Makowski", "Name": "Jan", "Lp.": "1", "Unit": "km", "Desc": "Inf"}, {"Comment": "Brak reakcji", "Surname": "Nowak", "Name": "Anna", "Lp.": "2", "Unit": "km", "Desc": "Pts"}]
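The question also asks about keeping the empty 'Comment' value and controlling the indentation. A minimal sketch of one way to do that, assuming the sample data from the question: `csv_text_to_json` and `SAMPLE` are hypothetical names introduced here, and `csv.DictReader` is used in place of the manual `zip` above.

```python
import csv
import io
import json

# Sample data copied from the question; in practice it would be read from file.csv.
SAMPLE = (
    "Lp.;Name;Surname;Desc;Unit;Comment\n"
    "1;Jan;Makowski;Inf;km;\n"
    "2;Anna;Nowak;Pts;km;Brak reakcji\n"
)

def csv_text_to_json(text, indent=4):
    """Hypothetical helper: parse semicolon-separated text into a JSON string."""
    reader = csv.DictReader(io.StringIO(text), delimiter=';')
    rows = [dict(row) for row in reader]  # one plain dict per data row
    # An empty trailing field stays in the row as an empty string.
    # ensure_ascii=False keeps non-ASCII characters readable in the output.
    return json.dumps(rows, indent=indent, ensure_ascii=False)

print(csv_text_to_json(SAMPLE))
```

In Python 3 the `indent` argument also accepts a string, so `indent="\t"` should indent each level with a tab instead of spaces.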
1
On

Pandas has both .read_csv() and .to_json() built in. In between, you get a DataFrame that you can use to manipulate the data, including defining an index or data model.

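import pandas as pd
# The file from the question is semicolon-separated, so set sep explicitly
df = pd.read_csv('file.csv', sep=';')
# any operations on dataframe df
# orient='records' writes a list with one object per row
df.to_json('file.json', orient='records')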
1
On

A very quick example of a solution:

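import csv
import json

# Read every row into a list of dicts (text mode, semicolon-separated)
with open('your_file.csv', 'r', newline='') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=';')
    output = list(reader)

# json.dump takes the object first, then a file opened for writing
with open('your_file.json', 'w') as jsonfile:
    json.dump(output, jsonfile)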
5
On

You can shorten your script quite a lot by relying on a few details of Python:

import csv
import json

with open('file.csv', 'r', encoding='utf8') as csvfile:
    with open('file.json', 'w', encoding='utf8') as jsonfile:
        reader = csv.DictReader(csvfile, delimiter=';')
        json.dump(list(reader), jsonfile)

The details about why this works:

  1. Files opened as part of with-statements are closed automatically when the block is "done". (The corresponding PEP: https://www.python.org/dev/peps/pep-0343/)
  2. The list constructor can take Iterators (such as a csv.DictReader) as argument. (DictReader documentation: https://docs.python.org/3.7/library/csv.html#csv.DictReader)

This will create a list in memory with a dict for each row in the CSV file, which might be worth keeping in mind if you are handling very large files. Sadly, as far as I know, it isn't possible to send the iterator directly to the JSON serializer without modifying the iterator or the serializer.
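One possible workaround for large files, which is not part of the original answer, is to serialize each row on its own and write the array brackets by hand. The following is only a sketch under that assumption; `stream_csv_to_json` is a hypothetical helper name, and in-memory streams stand in for the real files.

```python
import csv
import io
import json

def stream_csv_to_json(csvfile, jsonfile, delimiter=';'):
    """Hypothetical helper: write rows as a JSON array without building a list."""
    reader = csv.DictReader(csvfile, delimiter=delimiter)
    jsonfile.write('[')
    for index, row in enumerate(reader):
        if index:
            # Separate every row after the first from its predecessor
            jsonfile.write(', ')
        json.dump(row, jsonfile)  # only one row is held in memory at a time
    jsonfile.write(']')

# In-memory streams stand in for the opened CSV and JSON files.
source = io.StringIO("Lp.;Name\n1;Jan\n2;Anna\n")
target = io.StringIO()
stream_csv_to_json(source, target)
print(target.getvalue())
```

The trade-off is that the JSON array syntax is assembled manually, so options such as indentation would have to be handled by hand as well.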

I'll leave the argument about which solution is better from a maintenance and readability perspective as an exercise for the reader.


If you're not reading from a local file on disk, but from another program via stdin, you don't need to open the input file:

import csv
import json
import sys

with open('file.json', 'w', encoding='utf8') as jsonfile:
    reader = csv.DictReader(sys.stdin, delimiter=';')
    json.dump(list(reader), jsonfile)

The same goes for outputting to stdout via print() or json.dump(..., sys.stdout).
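Combining both ideas, a filter that reads from one stream and writes to another might look like the sketch below. The `convert` function is a hypothetical name; an in-memory stream stands in for stdin so that the example runs without piped input.

```python
import csv
import io
import json
import sys

def convert(instream, outstream, delimiter=';'):
    """Hypothetical helper: read CSV rows from one stream, write JSON to another."""
    reader = csv.DictReader(instream, delimiter=delimiter)
    json.dump(list(reader), outstream)

# In a real pipeline the call would be convert(sys.stdin, sys.stdout).
# Here a StringIO object plays the role of stdin.
convert(io.StringIO("Lp.;Name\n1;Jan\n"), sys.stdout)
```

Because the function only needs file-like objects, the same code should work with opened files, stdin/stdout, or in-memory buffers.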

0
On

Rather than writing a script, you could try the csvjson command line tool (written in Python):

http://csvkit.readthedocs.io/en/1.0.2/scripts/csvjson.html?highlight=csvjson