Add list as keys for another list and convert to dictionary


I have a list of columns and a list of records that I get from the RDS Data API by calling execute_statement through boto3. Because of the way the API responds, it is difficult to get the data out in a psycopg2 RealDictCursor-style format so that I can insert it into another database. What I want to do is illustrated below.

columns = ["a","b","c"]
records = [[1,2,3],[4,5,6]] 

I want to convert this into a list of dictionaries, one per record, like this:

[{"a":1,"b":2,"c":3},{"a":4,"b":5,"c":6}]

There are 2 best solutions below

4
On BEST ANSWER

Do it like this:

Python 3.8.9 (default, Aug  3 2021, 19:21:54) 
[Clang 13.0.0 (clang-1300.0.29.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> columns = ["a","b","c"]
>>> records = [[1,2,3],[4,5,6]] 
>>> [dict(zip(columns,rec)) for rec in records]
[{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}]
>>> 

How it works

Work from the outside in. [...] means we'll produce a list. x for rec in records means we evaluate the expression x once for every element of records, with rec bound to that element. Here x is dict(zip(columns, rec)). zip(columns, rec) pairs each column name with the value at the same position in rec. So zip(['a','b','c'], [1,2,3]) yields the tuples ('a',1), ('b',2), ('c',3) (in Python 3 as a lazy iterator rather than a list), which are exactly the key/value pairs we want to build the dict from. If you pass an iterable of 2-tuples to the dict constructor, it creates a dictionary from them: ('a',1) becomes the entry 'a': 1, and so on.
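To make the inner step visible, here is a small sketch that handles a single record and materializes the zip result before building the dict. The names pairs and row are only for illustration.

```python
columns = ["a", "b", "c"]
rec = [1, 2, 3]

# zip() returns a lazy iterator in Python 3; list() forces it so we can look at it
pairs = list(zip(columns, rec))

# dict() accepts any iterable of (key, value) pairs
row = dict(pairs)

print(pairs)
print(row)
```

The list comprehension in the answer simply repeats this for every rec in records, without the intermediate list.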

2
On

You can do it concisely with a list comprehension:

result = [dict(zip(columns, record)) for record in records]

What happens here is that we loop over each of the inner lists in records, each of which has 3 items, the same as columns. For each one we call zip(), which pairs up the elements of the two lists position by position. It returns a lazy iterator of ordinary tuples, not a namedtuple. For the first record zipped with columns, that gives ('a', 1), ('b', 2), ('c', 3). These pairs are then converted into a dictionary and stored in the result list. This is repeated for each item in records. Hope this makes things clear to you.
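One thing to watch for is that the lists really do have equal lengths: zip() stops at the shorter input, so a short record silently loses a column instead of raising an error. The sketch below shows that behaviour and one possible guard. rows_to_dicts is a hypothetical helper name, not something from boto3 or psycopg2. (On Python 3.10 and later, zip(columns, record, strict=True) raises ValueError on a mismatch as well.)

```python
columns = ["a", "b", "c"]
records = [[1, 2, 3], [4, 5]]  # second record is deliberately one value short

# Plain zip() truncates: the second dict simply has no "c" key
silent = [dict(zip(columns, record)) for record in records]
print(silent)


def rows_to_dicts(columns, records):
    """Hypothetical helper: like the comprehension, but rejects ragged records."""
    result = []
    for i, record in enumerate(records):
        if len(record) != len(columns):
            raise ValueError(
                f"record {i} has {len(record)} values, expected {len(columns)}"
            )
        result.append(dict(zip(columns, record)))
    return result


try:
    rows_to_dicts(columns, records)
except ValueError as exc:
    print("rejected:", exc)
```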

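Furthermore, this should be at least as efficient as writing two nested for loops by hand, because zip() and dict() do the pairing in built-in code, although for a handful of rows the difference will not be noticeable. For very large result sets, the operation might be optimized further with more specialized Python modules.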
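For comparison, this is roughly what the "two for loops" version would look like, next to the comprehension. It is only a sketch to show that both produce the same result; the variable names nested and row are illustrative.

```python
columns = ["a", "b", "c"]
records = [[1, 2, 3], [4, 5, 6]]

# Hand-written version: outer loop over records, inner loop over columns
nested = []
for record in records:
    row = {}
    for i, name in enumerate(columns):
        row[name] = record[i]
    nested.append(row)

# Comprehension from the answer above
result = [dict(zip(columns, record)) for record in records]

print(nested == result)
```

If speed matters for your data sizes, timing both variants with the timeit module on real input is more reliable than guessing.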