I have two lists, each containing sublists in the form [chromosome, start_position, end_position]:
expos_list = [['1', '10', '30'], ['1', '50', '80'], ['1', '100', '200']]
pos_list = [['1', '12', '25'], ['1', '90', '98'], ['1', '130', '180'], ['2', '25', '50']]
I want to compare each sublist in 'pos_list' against the sublists in 'expos_list', and add a 'pos_list' element to 'expos_list' only if its range is not already contained within an existing 'expos_list' element on the same chromosome. So I would like my final output to be:
expos_list = [['1', '10', '30'], ['1', '50', '80'], ['1', '90', '98'], ['1', '100', '200'], ['2', '25', '50']]
...as this has only unique position ranges for each particular chromosome (chromosome = sublist[0] in both cases).
I have tried:
for expos_element in expos_list:
    for pos_element in pos_list:
        if pos_element[0] == expos_element[0]:
            if pos_element[1] < expos_element[1]:
                if pos_element[2] < expos_element[1]:
                    print("New")
                elif pos_element[2] < expos_element[2]:
                    print("Overlapping at 3'")
                else:
                    print("Discard")
            elif expos_element[1] <= pos_element[1] < expos_element[2]:
                if pos_element[2] <= expos_element[2]:
                    print("Discard")
                else:
                    print("Overlapping at 5'")
            else:
                print("Hit is 3' of current existing element. Move on")
        else:
            print("Different chromosome")
This obviously doesn't append anything to the list yet; it only reports whether the elements overlap. It does that, but it compares every 'pos_list' element against every 'expos_list' element, giving this output:
Discard
Hit is 3' of current existing element. Move on
Discard
Different chromosome
New
Hit is 3' of current existing element. Move on
New
Different chromosome
Overlapping at 5'
Hit is 3' of current existing element. Move on
Discard
Different chromosome
This gives 12 lines of output (rather than the desired one line per sublist in pos_list). I'm really struggling to get this working. I guess my ideal output for the above run would be:
Discard
New
Discard
Different chromosome
Any help would be greatly appreciated. Thanks!
If you're not that interested in how each item overlaps, simplify the code to your three cases (discard, new, different):
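A minimal sketch of one way that simplification could look, using the lists from the question. The names `classify`, `merged` and `labels` are made up for illustration. It makes a few assumptions: any overlap with an existing range on the same chromosome counts as "Discard", both "New" and "Different chromosome" entries get appended, and each entry is checked against the list as built so far. The positions are stored as strings, so the sketch converts them with int() before comparing; as strings, '90' < '100' is False, which is likely why some of the labels in the question look odd.

```python
expos_list = [['1', '10', '30'], ['1', '50', '80'], ['1', '100', '200']]
pos_list = [['1', '12', '25'], ['1', '90', '98'], ['1', '130', '180'], ['2', '25', '50']]


def classify(pos, existing):
    """Label one [chromosome, start, end] entry against the existing ranges."""
    chrom, start, end = pos[0], int(pos[1]), int(pos[2])
    same_chrom = [e for e in existing if e[0] == chrom]
    if not same_chrom:
        return "Different chromosome"
    for _, e_start, e_end in same_chrom:
        # Assumption: any overlap at all (contained or partial) means discard.
        if start <= int(e_end) and end >= int(e_start):
            return "Discard"
    return "New"


merged = [list(e) for e in expos_list]  # work on a copy, leave expos_list alone
labels = []
for pos in pos_list:
    label = classify(pos, merged)  # one label per pos_list entry
    labels.append(label)
    print(label)
    if label != "Discard":
        merged.append(pos)

# Sort by chromosome (as a string) and then by numeric start position.
# Assumption: string ordering of chromosome names is good enough here;
# names like '10' or 'X' would need a custom key.
merged.sort(key=lambda e: (e[0], int(e[1])))
print(merged)
```

The key difference from the nested loops in the question is that the loop over the existing ranges lives inside `classify`, which returns as soon as it has an answer, so you get exactly one label per `pos_list` entry.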
When I run it, I see: