set comprehension vs. nested loop


I have a list of strings, each having one or more words. I need to make a list of unique words out of this list. I can do it easily with two nested loops but I don't understand why I don't get the same result using a set comprehension.

Nested loop:

import re

items = ['17th C White', 'Accra White', 'Acid White']

word_list = set()
for item in items:
    for word in re.split(r"\s|[-'/]", item):
        word_list.add(word)
print(word_list)

Result from the nested loop (correct):

{'White', 'Acid', 'Accra', '17th', 'C'}

Set comprehension:

import re

items = ['17th C White', 'Accra White', 'Acid White']

word_list = {word for word in re.split(r"\s|[-'/]", item) for item in items}
print(word_list)

Result from the set comprehension (incorrect):

{'White', 'Acid'}

Why don't I get the same result from the set comprehension?

khelwood On

The for clauses in your set comprehension are in the wrong order.

You have:

{word for word in re.split(r"\s|[-'/]", item) for item in items}

You mean:

{word for item in items for word in re.split(r"\s|[-'/]", item)}

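In the first version, the iterable of the first for clause, re.split(r"\s|[-'/]", item), is evaluated before the second clause ever binds item. It therefore uses whatever value item already has in the enclosing scope, which must be "Acid White", left over from your earlier loop. Without that leftover variable you would get a NameError instead.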
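As a rough illustration of that scoping behaviour, here is a minimal sketch. The function names and the pattern variable are made up for this example, and the assignment item = 'Acid White' only stands in for the value an earlier loop would have left behind.

```python
import re

items = ['17th C White', 'Accra White', 'Acid White']
pattern = r"\s|[-'/]"


def wrong_order_without_leftover():
    # No variable named `item` exists here, so the first `for` clause
    # cannot evaluate its iterable and the comprehension fails.
    try:
        return {word for word in re.split(pattern, item) for item in items}
    except NameError:
        return None


def wrong_order_with_leftover():
    # Hypothetical stand-in for the value left behind by an earlier loop.
    item = 'Acid White'
    # The first clause splits only this leftover string; the second clause
    # then just repeats those same words once per entry in `items`.
    return {word for word in re.split(pattern, item) for item in items}


print(wrong_order_without_leftover())
print(wrong_order_with_leftover())
```

The second function should reproduce the two-word set from the question, since only the leftover string is ever split.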

When a comprehension has multiple for clauses, they are read left to right, in the same order as the equivalent nested loops. The clause that defines a variable (for item in items) must come before the clause that uses it (for word in re.split(r"\s|[-'/]", item)).
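To make the ordering rule concrete, the sketch below puts the nested loops and the corrected comprehension side by side. The names pattern, loop_words, and comp_words are introduced only for this example.

```python
import re

items = ['17th C White', 'Accra White', 'Acid White']
pattern = r"\s|[-'/]"

# Nested loops: outer loop over the items, inner loop over each item's words.
loop_words = set()
for item in items:
    for word in re.split(pattern, item):
        loop_words.add(word)

# Comprehension: the `for` clauses keep the same outer-to-inner order,
# written left to right; only the `word` expression moves to the front.
comp_words = {word for item in items for word in re.split(pattern, item)}

print(loop_words == comp_words)
```

A handy way to check the order is to read the comprehension's for clauses as if each one were indented under the one before it.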