Problems joining two CSV tables with ID and exporting as nested JSON

51 Views Asked by At

Problems joining two CSV Tables over ID and exporting as a nested JSON

Hey absolute Python beginner here. I have two tables; one containing Information over a findplace of archeological finds with a ID; then i have a second table with many coins (80.000) linked with the ID to the findplaces. I wanted to create a nested structure, where every coind of a Findplace is linked inside a nested part of my findplace (Fundorte).

I want to create a JSON, so that i can implement geodata in the next step and create a nested GeoJason from this dataset. Afterwards i wanted to use the Geodata in Leaflet. The code used below:

Fundorte = open(r"C:\Users\LordK\Desktop\Daten Krmnicek\Versuch JSON STruktur\FundorteID.csv", 'r')
jsonfile = open(r"C:\Users\LordK\Desktop\Daten Krmnicek\Versuch JSON STruktur\Skript\out.json", 'w')
Münzen = open(r"C:\Users\LordK\Desktop\Daten Krmnicek\Versuch JSON STruktur\MünzenID.csv", 'r')


FuFelder = ["ID","FOverwendbar","x_KBS_25832","y_KBS_25832","PLZ"]
MüFelder = ["ID_Münze","Nr","Nominal","Prägeherr","RIC (vollständige Ansprache) (aktuell)", "Rs-Typ", "Quelle", "Zitat", "Detail 1", "Detail 2", "SF_EF_Börse", "FOverwendbar", "PLZ", "Kontext", "ID"]

FundorteReader = csv.DictReader( Fundorte, fieldnames=FuFelder)
MünzenReader = csv.DictReader( Münzen, fieldnames=MüFelder)

output=[]
for Fundrow in FundorteReader:
    Fundrow['Muenzen'] = []
    output.append(Fundrow)
    Münzen.seek(0)
    for itemrow in MünzenReader:
        if(itemrow["ID"]==Fundrow["ID"]):
            Fundrow['Muenzen'].append(itemrow)

json.dump(output, jsonfile,sort_keys=True)
jsonfile.close()
print("done")

Right now i get the findplaces FuFelder displayed, but the nested list Fundrow["Münzen"] ist still empty. What am i doing wrong?

1

There are 1 best solutions below

1
On

You are updating output before you find necessary itemrow from MünzenReader. Move

output.append(Fundrow)

after your cycle:

output=[]
for Fundrow in FundorteReader:
    Fundrow['Muenzen'] = []
    Münzen.seek(0)
    for itemrow in MünzenReader:
        if(itemrow["ID"]==Fundrow["ID"]):
            Fundrow['Muenzen'].append(itemrow)
    output.append(Fundrow)

For the future working with csv files I recommend using pandas library. Your task could be accomplished in pandas much faster as it has built-in join function:

import pandas as pd

path_1 = r"C:\Users\LordK\Desktop\Daten Krmnicek\Versuch JSON STruktur\FundorteID.csv"
path_json = r"C:\Users\LordK\Desktop\Daten Krmnicek\Versuch JSON STruktur\Skript\out.json"
path_2 = r"C:\Users\LordK\Desktop\Daten Krmnicek\Versuch JSON STruktur\MünzenID.csv"

FuFelder = ["ID","FOverwendbar","x_KBS_25832","y_KBS_25832","PLZ"]
MüFelder = ["ID_Münze","Nr","Nominal","Prägeherr","RIC (vollständige Ansprache) (aktuell)", "Rs-Typ", "Quelle", "Zitat", "Detail 1", "Detail 2", "SF_EF_Börse", "FOverwendbar", "PLZ", "Kontext", "ID"]

df1 = pd.read_csv(path_1, names=FuFelder)
df2 = pd.read_csv(path_2, names=MüFelder)

joined_df = df1.join(df2, on="ID")
joined_df.to_json(path_json)

I haven't tested this code as I don't have your csv-files but it gives a direction to dig.