How to Combine Two Referential Lists With Python


I have stupid data coming out of a system, and it needs to be flattened.

The main CSV has these columns: hostname, program_name, version_name

However, there is only one row per host, so the last two fields each hold several newline-separated values, like this:

program_name contents:

Word
Excel
Cognos
Mozilla

version contents (not real, just for illustrative purposes):

2.3.2
121.3.0
build 22

What's the best way to make sure the names and versions match up, and to do this more concisely and Pythonically?

Here is what the real code looks like, the above is mainly for demo purposes:

for row in tan_output.programs:
    names = row["Name"].splitlines()
    versions = row["Version"].splitlines()
    if(len(names) != len(versions)):
        print("NAME and VERSION from tan_programs are not equal... Exiting")
        exit()
    else:
        for name in names:
            #tan_programs.append({"Count": row["Count"], "Hostname": row["Hostname"], "Name": row["Name"], "Version": row["Version"]})

I am stuck on the inner for loop at the bottom. I feel like I should be looping through both lists simultaneously, instead of looping through one and using a counter to index into the second to build the flattened data, which is what I was going to do.

PS: the file is 7 GB, so the more efficient the better. For example, if I have to use the counter, I know from experience i += 1 is 100 times more efficient than i = i + 1.


There is 1 solution below


Just use a counter, unless someone has a better idea:

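tan_programs = []
for row in tan_output.programs:
    # Each field holds one value per line, so split them into parallel lists.
    names = row["Name"].splitlines()
    versions = row["Version"].splitlines()
    if len(names) != len(versions):
        print("NAME and VERSION from tan_programs are not equal... Exiting")
        raise SystemExit(1)  # works without the interactive-only exit() helper
    else:
        i = 0  # index into versions, kept in step with names
        for name in names:
            tan_programs.append({"Hostname": row["Hostname"], "Name": name, "Version": versions[i]})
            i += 1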
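
If you would rather not manage the index by hand, the built-in zip() pairs the two lists up for you. This is only a minimal sketch, assuming each row behaves like a dict with "Hostname", "Name" and "Version" keys as in the code above; flatten_row and the sample row are made-up names for illustration, not part of the original system.

```python
def flatten_row(row):
    """Yield one dict per (name, version) pair from a multi-line row."""
    names = row["Name"].splitlines()
    versions = row["Version"].splitlines()
    # zip() silently stops at the shorter list, so check the lengths first.
    if len(names) != len(versions):
        raise ValueError(
            f"{row['Hostname']}: {len(names)} names but {len(versions)} versions"
        )
    for name, version in zip(names, versions):
        yield {"Hostname": row["Hostname"], "Name": name, "Version": version}


# Hypothetical sample row, shaped like the rows described in the question.
sample = {"Hostname": "host1", "Name": "Word\nExcel", "Version": "2.3.2\n121.3.0"}
flat = list(flatten_row(sample))
```

Because flatten_row is a generator, the flattened records never all have to sit in memory at once, which may matter with a file this size. On Python 3.10 or newer, zip(names, versions, strict=True) raises ValueError on a length mismatch by itself, so the explicit check could be dropped there.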

The counter loop is actually very fast; the slow part is inserting 8 million records into a DB on another server over the network.
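
Since the insert is the bottleneck, sending the records in batches rather than one at a time is one thing that might help. The following is a rough sketch only: it uses an in-memory SQLite database as a stand-in for the real remote server, and the programs table, the batched and insert_in_batches helpers, and the batch size are all assumptions made for illustration. The real driver's placeholder style and bulk-load options may differ.

```python
import sqlite3
from itertools import islice


def batched(records, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_in_batches(conn, records, size=10_000):
    """Insert (hostname, name, version) tuples, one executemany call per batch."""
    total = 0
    for chunk in batched(records, size):
        conn.executemany(
            "INSERT INTO programs (hostname, name, version) VALUES (?, ?, ?)",
            chunk,
        )
        conn.commit()  # one commit per batch instead of one per record
        total += len(chunk)
    return total


# In-memory SQLite stands in for the remote database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE programs (hostname TEXT, name TEXT, version TEXT)")
rows = (("host1", f"prog{i}", "1.0") for i in range(25))
inserted = insert_in_batches(conn, rows, size=10)
```

The records are consumed lazily from a generator, so only one batch is held in memory at a time. Whether batching actually speeds things up depends on the database and driver in use, so it is worth measuring against the real server.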