How can I write code to find the most frequent 2-mer of "GATCCAGATCCCCATAC"? I have written the code below, but it seems to be wrong. Please help me correct it.
def PatternCount(Pattern, Text):
    count = 0
    for i in range(len(Text)-len(Pattern)+1):
        if Text[i:i+len(Pattern)] == Pattern:
            count = count+1
    return count
This code is supposed to print the most frequent k-mer in a string, but it doesn't give me the most frequent 2-mer for the given string.
If you want a simple approach, consider a sliding window technique. An implementation is available in more_itertools, so you don't have to write one yourself. This is easy to use if you pip install more_itertools.

Simple Example
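A minimal sketch of the idea, assuming more_itertools is installed. The fallback `windowed` below is only a simplified stand-in (not the library's code) so that the snippet still runs without the package:

```python
from collections import Counter

try:
    from more_itertools import windowed
except ImportError:
    # Assumption: simplified stand-in for more_itertools.windowed,
    # used only when the package is not installed.
    def windowed(seq, n):
        return (tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))

s = "GATCCAGATCCCCATAC"

# Each window is a tuple of k=2 adjacent characters, e.g. ('G', 'A').
counts = Counter(windowed(s, 2))

print(counts.most_common(1))  # [(('C', 'C'), 4)]
```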
The above example demonstrates what little is required to get most of the information you want using windowed and Counter.

Description
A "window" or container of length k=2 is sliding across the sequence one stride at a time (e.g. step=1). Each new group is added as a key to the Counter dictionary. For each occurrence, the tally is incremented. The final Counter object primarily reports all tallies and includes other helpful features.

Final Solution
If actual string pairs are important, that is simple too. We will make a general function that groups the strings and works for any k-mers:
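One possible sketch of such a function follows. The name `count_kmers` is hypothetical, and it swaps in plain string slicing for `windowed` so that it runs with the standard library alone. Slicing also yields string keys such as "CC" directly, rather than tuples of characters:

```python
from collections import Counter


def count_kmers(seq, k=2):
    """Return a Counter of every overlapping k-mer in seq, keyed by string."""
    # Slide a window of width k across seq one position at a time.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


kmer_counts = count_kmers("GATCCAGATCCCCATAC", k=2)
print(kmer_counts.most_common(2))  # [('CC', 4), ('AT', 3)]
```

Calling `most_common(1)` on the result gives only the single most frequent k-mer, and changing `k` gives the tallies for 3-mers, 4-mers, and so on.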