I have a list of strings, each having one or more words. I need to make a list of unique words out of this list. I can do it easily with two nested loops but I don't understand why I don't get the same result using a set comprehension.
Nested loop:
import re
items = ['17th C White', 'Accra White', 'Acid White']
word_list = set()
for item in items:
    for word in re.split("\s|[-'/]", item):
        word_list.add(word)
print(word_list)
Result from the nested loop (correct):
{'White', 'Acid', 'Accra', '17th', 'C'}
Set comprehension:
import re
items = ['17th C White', 'Accra White', 'Acid White']
word_list = {word for word in re.split("\s|[-'/]", item) for item in items}
print(word_list)
Result from the set comprehension (incorrect):
{'White', 'Acid'}
Why don't I get the same result from the set comprehension?
Your set comprehension is not quite right.
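You have:
    word_list = {word for word in re.split("\s|[-'/]", item) for item in items}
You mean:
    word_list = {word for item in items for word in re.split("\s|[-'/]", item)}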
The first version is using the previous value of `item`, which must be `"Acid White"`, in the expression `for word in re.split("\s|[-'/]", item)`.
Where you have multiple `for` parts in one comprehension, you should place the one defining a variable (`item`) before the one using the variable (`re.split("\s|[-'/]", item)`).
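To make the ordering rule concrete, here is a minimal sketch. The names `pattern`, `words`, `reversed_order`, `rows` and `row` are made up for illustration and are not from the question, and the regex is written as a raw string, which is a small change from the original. The `for` clauses in a comprehension read in the same order as the equivalent nested loops, outermost first. With the clauses reversed inside a function where no earlier `row` exists, there is no stale value to fall back on, so Python raises `NameError` instead of silently returning a partial result.

```python
import re

items = ['17th C White', 'Accra White', 'Acid White']
pattern = r"\s|[-'/]"  # raw-string form of the question's pattern (illustrative name)

# Clause order mirrors the nested loops: the outer `for` comes first,
# so `item` is defined before re.split() uses it.
words = {word for item in items for word in re.split(pattern, item)}
print(sorted(words))  # ['17th', 'Accra', 'Acid', 'C', 'White']


def reversed_order(rows):
    # Hypothetical helper: the clauses are in the wrong order. The first
    # iterable is evaluated before `row` is bound, and there is no
    # leftover `row` in an enclosing scope, so this raises NameError.
    return {word for word in re.split(pattern, row) for row in rows}


try:
    reversed_order(items)
except NameError as exc:
    print("NameError:", exc)
```

In the question's script the reversed version did not fail only because the earlier nested loop had already left `item` set to its last value at module level.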