If you have a string of DNA, how can you create a list of all possible consecutive triplets? For instance, given the following string:
ACCTAA
I need to create a list of all possible consecutive triplets, such that:
ACC, CCT, CTA, TAA
How could I accomplish that?
So far, I have only figured out how to create a list of triplets by dividing the string at equal intervals:
list_of_triplet = [dna[i:i+3] for i in range(0, len(dna), 3)]
Where dna is the input string.
Thank you for any suggestions!
You're almost there. Let's remove the third parameter in the range function (you don't really want to split the string into groups of three). Also, we want to stop when there are only 3 characters left, so the second parameter should be len(dna) - 2. With all this, you have a list comprehension that iterates over range(len(dna) - 2) and takes the slice dna[i:i+3] at each position. If you don't want the triplets to be repeated, you can instead use a set comprehension.
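Here is a minimal sketch of both variants described above. The function names consecutive_triplets and unique_triplets are only illustrative and are not part of the original question; the slicing and the range bound follow the explanation directly.

```python
def consecutive_triplets(dna):
    # Slide a window of width 3 forward one character at a time.
    # The last valid start index is len(dna) - 3, so range stops at len(dna) - 2.
    return [dna[i:i+3] for i in range(len(dna) - 2)]


def unique_triplets(dna):
    # Same window, but a set comprehension drops repeated triplets.
    # Note that a set does not preserve the order of appearance.
    return {dna[i:i+3] for i in range(len(dna) - 2)}


dna = "ACCTAA"
print(consecutive_triplets(dna))  # ['ACC', 'CCT', 'CTA', 'TAA']
print(unique_triplets("AAAA") == {"AAA"})  # True
```

For strings shorter than three characters, range(len(dna) - 2) is empty, so both functions return an empty result rather than raising an error.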