I need to perform a well-known Burrows-Wheeler Transform in linear time. I found a solution with suffix sorting and EOF character, but appending EOF changes the transformation. For example: consider the string bcababa
and two rotations
- s1 =
abababc
- s2 =
ababcab
it's clear that s1 < s2. Now with an EOF character:
- s1 = ababa#bc
- s2 = aba#bcab
and now s2 < s1. And the resulting transformation will be different. How can I perform BWT without EOF?
You need to have EOF character in the string for BWT to work, because otherwise you can't perform the inverse transform to get the original string back. Without EOF, both strings "ba" and "ab" have the same transformed version ("ba"). With EOF, the transforms are different
i.e. ab transforms to "|ab" and ba to "b|a".
EOF is needed for BWT because it marks the point where the character cycle starts.
Re: doing it without the EOF character, according to Wikipedia,