Is there a way to get all combinations of overlap between values in dictionary, or from within a list

514 Views Asked by At

I have a few lists

A =['D','KIT','SAP'],
B= ['F','G','LUFT','SAP'],
C= ['I','SAP'],
D= ['SAP','LUF','KIT'],
F= ['SAP','LUF','KIT','HASS']

I passed them in combinations to a dictionary. For example, I joined lists A and B as a key 'AB' in my new dictionary and appended the values from both lists like the list of values to that key 'AB'.

my_dict={"AB":[['D','KIT','SAP'],['F','G','LUFT','SAP']]}

My aim to get the overlap between the lists in multiple combination. For that, I could get with the following one liner.

   for k,v in my_dict.items():
        print(k,list(set.intersection(*[set(x) for x in v])))

AB ['SAP']

However, the problem here is that when I have more lists and more combinations it is tedious to define manually a dictionary with all possible combinations.

The aim is to achieve all possible combinations of lists. From the above-mentioned example, I have the following combination of lists. In this case that would be 5*5 +1 = 26 combinations. And in the end, I would like to get the overlap for each of the following combinations.

   AB, AC, AD, AF
    ABC, ABD, ABF
    ABCD, ABCDF
    ABCDF
BC, BD, BF
DF, DC
FC

The question is how could I achieve that in a pythonic way. Any suggestions are much appreciated.

As a side note, I do not want a dictionary in between if this combination of overlap is possible from the lists itself.

Thanks

5

There are 5 best solutions below

0
On BEST ANSWER

Thanks for all solutions. Each solution was helpful. However, none of them was the exact answer I was looking for. Nonetheless, all of them partly helped me to come up with the right solution. Posting for everyone might be helpful.

I first needed to create 26 combinations for 5 lists. The initial lists

A =['D','KIT','SAP']
B= ['F','G','LUFT','SAP']
C= ['I','SAP']
D= ['SAP','LUF','KIT']
E= ['SAP','LUF','KIT','HASS']

Made a list of lists

lists = [A, B, C, D, E]

Defined empty list called combination as @marco's solution. And added all possible combinations based on the length of lists, plus one. The combinations are the values from our final dictionary.

 combinations =[]
    for l in range(2, len(lists) + 1):
        combinations += itertools.combinations(lists, l)

After that define a dictionary with keys and values from the list of the list. From here I adopted partly @butterflyknife's solution.

master_dict = dict(A=A, B=B, C=C, D=D, F=F)

For the keys of our final dictionary, we defined an empty list first. Then populated the combinations by using join.

key_list    = []
for i in range(2,len(master_dict) +1):
    for combo in it.combinations(master_dict.keys(), i):
        new_key = "".join(combo)
        if (len(combo)>1):
            key_list.append(new_key)

Now, with both keys and values. We populate our final dictionary.

final_dict = dict()
for key, elem in zip(key_list, combinations):
    final_dict[key] = list(elem)
   

This part is to find the overlaps and convert them to a data frame.

overlap_list  = []
dfs = []
for k,v in final_dict.items():
    overlap = list(set.intersection(*[set(x) for x in v]))
    overlap_list.append(overlap)
     #print(overlap, sep=", ")
    dct  ={ 'overlap':overlap,'number of overlapps':len(overlap),
              'overlap models':k}
    dfs.append(dct)
df          = pd.DataFrame.from_dict(dfs)
0
On

Have a look at itertools.combinations. It returns all possible combinations of a given length for a given iterable. You'll have to loop over all possible lengths.

import itertools

lists = [A, B, C, D, E, F]

combinations = []

for l in range(2, len(lists) + 1):
    combinations += itertools.combinations(lists, l)

combinations will be a list of tuples with all possible combinations of at least two lists.

4
On

[Edited solution after exchange in comments]

import itertools as it

A =['D','KIT','SAP'],
B= ['F','G','LUFT','SAP'],
C= ['I','SAP'],
D= ['SAP','LUF','KIT'],
F= ['SAP','LUF','KIT','HASS']

master_dict = dict(A=A, B=B, C=C, D=D, F=F)
for i in range(len(master_dict)):
    for combo in it.combinations(master_dict.keys(), i):
        new_key = "".join(combo)
        print(combo)
        if (len(combo)>1):
            mixed_dict[new_key] = set.intersection(*[set(master_dict[c][0]) for c in combo])
            print("---->" + str(mixed_dict[new_key]))

... with a couple of commented out print statements to show you what's going on. At the end of the script, you have a new_dict which looks like this:

enter image description here

I didn't add the combinations of length 1 on purpose, but you can remove the if() condition above to get them back.

0
On

Here is the dictionary creation with the answer of @Marco Breemhaar implemented, and I used his solution to create a dictionary as you wanted. Here is the code:

import itertools

import string
alphabet_string = string.ascii_uppercase
alphabet_list = list(alphabet_string)


A =['D','KIT','SAP']
B= ['F','G','LUFT','SAP']
C= ['I','SAP']
D= ['SAP','LUF','KIT']
E= ['SAP','LUF','KIT','HASS']

lists = [A, B, C, D, E]

combinations = []

for l in range(2, len(lists) + 1):
    combinations += itertools.combinations(lists, l)

alp_comb = []
for l in range(2, len(alphabet_list) + 1):
    alp_comb += itertools.combinations(alphabet_list[:len(lists)], l)

key_list = []
for alp_elem in alp_comb:
    key_name = ""
    for elem in alp_elem:
        key_name += elem
    key_list.append(key_name)
    key_name = None


res_dict = dict()
for key, elem in zip(key_list, combinations):
    res_dict[key] = list(elem)
print(res_dict)
1
On

To start solving the problem, I would turn the lists given to you into a dictionary. This is to easily reference the name of the list and the value with it.

Like what sagi commented, the combinations you listed does not give all combinations. So, I used the package itertools and the code from, https://stackoverflow.com/a/5898031/5131550, to get all possible combinations and slightly modified it to only give the results that are greater than one list long.

Then we need to add these to my_dict. I used a code snippet from geeksforgeeks, https://www.geeksforgeeks.org/python-program-to-convert-a-tuple-to-a-string/, to turn the tuple into a string that we use as a key for my_dict. At the same time we are building a list to add to those afformentioned keys.

Then we just use your function to find the interlapping elements. We should expect there are 2^(len(listsdict)) - len(listsdict) - 1, because we exclude no elements and only one element. Hope this solves it.

from itertools import combinations

#turn list into a dictionary of lists
listsdict = {
    'A' : ['D','KIT','SAP'],
    'B' : ['F','G','LUFT','SAP'],
    'C' : ['I','SAP'],
    'D' : ['SAP','LUF','KIT'],
    'F' : ['SAP','LUF','KIT','HASS']
}

my_dict = {}

#from https://www.geeksforgeeks.org/python-program-to-convert-a-tuple-to-a-string/
def convertTuple(tup):
    str = ''
    list = []
    for item in tup:
        str = str + item
        list += [listsdict[item]]
    my_dict[str] = list
    return str

#from https://stackoverflow.com/a/5898031/5131550
for L in range(0, len(listsdict)+1):
    for subset in combinations(listsdict, L):
        if len(subset) > 1:
            convertTuple(subset)

#OP's functions
for k,v in my_dict.items():
    print(list(set.intersection(*[set(x) for x in v])))