In Python, I have a list of tuples (lot) with patient data, as shown below:
lot = [('490001', 'A-ARM1', '1', '2', "a", "b"),
('490001', 'A-ARM2', '3', '4', "c", "d"),
('490002', 'B-ARM3', '5', '6', "e", "f")]
In my real dataset, lot consists of 50-150 tuples (depending on the patient). I loop over the second element of every tuple and want to replace every 'A-' and 'B-' with the corresponding dictionary value, so that the output becomes:
[('490001', 'ZZARM1', '1', '2', 'a', 'b'), ('490001', 'ZZARM2', '3', '4', 'c', 'd'), ('490002', 'XXARM3', '5', '6', 'e', 'f')]
To achieve this, I've written the code below. I was wondering whether there is a cleaner (shorter) way of writing it, for example the construction of 'lot2'. The code should also perform well for a large list of tuples, as described above. I'm eager to learn from you!
from more_itertools import grouper
dict = {'A-': 'ZZ', 'B-': 'XX'}
for el1, el2, *rest in lot:
    for i, j in grouper(el2, 2):
        if i + j in dict:
            lot2 = [ ( tpl[0], (tpl[1].replace(tpl[1][:2], dict[tpl[1][:2]])), tpl[2], tpl[3], tpl[4], tpl[5] ) for tpl in lot]
print(lot2)
If you're looking for shorter code, here's a version that doesn't use more_itertools.grouper. Basically, iterate over lot and modify the second elements as you go (if they need to be changed). Note that I renamed dict to dct here; dict is the builtin dict constructor, and naming your variables the same as Python builtins creates problems if you happen to want to use the dict constructor later on.
The same loop can be written even more concisely.
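A minimal sketch of the approach described above, not necessarily the exact code the answer had in mind. It assumes the part to replace is always the first two characters of the second element. The names lot, dct and lot2 come from the text; lot3 is a hypothetical name for the result of the more concise variant.

```python
lot = [('490001', 'A-ARM1', '1', '2', "a", "b"),
       ('490001', 'A-ARM2', '3', '4', "c", "d"),
       ('490002', 'B-ARM3', '5', '6', "e", "f")]

# Named dct rather than dict so the builtin is not shadowed.
dct = {'A-': 'ZZ', 'B-': 'XX'}

# Explicit loop: rebuild each tuple, swapping the first two characters
# of the second element when they are a key in dct.
lot2 = []
for first, second, *rest in lot:
    prefix = second[:2]
    if prefix in dct:
        second = dct[prefix] + second[2:]
    lot2.append((first, second, *rest))

# Same logic as a list comprehension (lot3 is a hypothetical name);
# dct.get falls back to the original two characters when there is no match.
lot3 = [(first, dct.get(second[:2], second[:2]) + second[2:], *rest)
        for first, second, *rest in lot]

print(lot2)
```

Both variants make a single pass over lot and do one dictionary lookup per tuple, so a list of 50-150 tuples should pose no problem.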
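Output:
[('490001', 'ZZARM1', '1', '2', 'a', 'b'), ('490001', 'ZZARM2', '3', '4', 'c', 'd'), ('490002', 'XXARM3', '5', '6', 'e', 'f')]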