I have a bit of trouble implementing the BMH algorithm in C++.
Here's the code:
#define Nm 2000005
int D[256];
char To[Nm],P[Nm],*T;
int Tl,Pl;
int cont;
void initialize_Lenght()
{
Tl=strlen(To);
Pl=strlen(P);
T=To;
}
void compute_D()
{
for(int i=0;i<256;i++)
D[i]=Pl;
for(int i=0;i<Pl;i++)
D[P[i]]=Pl-i;
}
void Boyer_Moore()
{
int i;
while ( Tl>=Pl )
{
for(i=Pl-1;T[i]==P[i]&&i>=0;i--)
if(i==0)
{
if(cont<1000)
v[cont]=(T-To); // I also have to store the first 1000 values
cont++;
}
Tl -= D[T[i+1]];
T += D[T[i+1]];
}
}
It works for most examples but there are some examples that do not work (only ones which I found thus far are huge tests downloaded from different sources).
I would want to know where / what I am doing wrong (I do not really want code).
Edit: Due to comments
Do you have any idea how I could make this algorithm run faster without implementing a full Boyer-Moore version of it?
The order of tests in
is wrong. After a complete match, you compare
T[-1]
toP[-1]
before checking whether the index is admissible.If a mismatch occurs at the last pattern character,
skips according to a character that need not exist (if the pattern end is aligned with the text end).