There are two ways of implementing CRC generation with linear feedback shift registers (LFSRs), as shown in this figure. The coefficients of the generator polynomial in this picture are 100111, and the red "+" circles are exclusive-or operators. The initial register values are 00000 for both.
For example, if the input data bit stream is 10010011, both A and B give a CRC checksum of 1010 (01010 in the 5-bit register). The difference is that A finishes in 8 shifts, while B takes 8+5=13 shifts because of the 5 zeros appended to the input data. I can understand B very easily, since it closely mimics modulo-2 division. However, I cannot understand mathematically how A gives the same result with 5 fewer shifts. I have heard people say that A takes advantage of pre-appending the zeros, but I didn't get it. Can anyone explain it to me? Thanks!
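To make the two setups concrete, here is a minimal Python sketch of both registers. It assumes A is the structure where the data bit is XORed with the register MSB to form the feedback, and B is the one where the data bit is shifted into the LSB end followed by 5 zeros; that reading is inferred from the shift counts above. The function names are made up for illustration.

```python
POLY = 0b100111            # generator from the question: x^5 + x^2 + x + 1
WIDTH = 5                  # register length = degree of the generator
MASK = (1 << WIDTH) - 1    # keeps the register at 5 bits
TAPS = POLY & MASK         # low 5 coefficients, i.e. x^5 mod G

def crc_input_at_msb(bits):
    """Assumed structure A: data bit XORed with the register MSB as feedback."""
    reg, shifts = 0, 0
    for b in bits:
        feedback = ((reg >> (WIDTH - 1)) & 1) ^ b
        reg = (reg << 1) & MASK
        if feedback:
            reg ^= TAPS
        shifts += 1
    return reg, shifts

def crc_input_at_lsb(bits):
    """Assumed structure B: data shifted into the LSB, WIDTH zeros appended."""
    reg, shifts = 0, 0
    for b in list(bits) + [0] * WIDTH:
        msb = (reg >> (WIDTH - 1)) & 1
        reg = ((reg << 1) & MASK) | b
        if msb:
            reg ^= TAPS
        shifts += 1
    return reg, shifts

data = [1, 0, 0, 1, 0, 0, 1, 1]
print(crc_input_at_msb(data))   # (10, 8)  -> 0b01010 after 8 shifts
print(crc_input_at_lsb(data))   # (10, 13) -> 0b01010 after 13 shifts
```

Both calls end with the same register contents; only the number of shifts differs.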
You can say that architecture (A) implements the modulo-2 division by aligning the MSB of the polynomial with the MSB of the message, so it implements something like the following (my example actually uses a different CRC polynomial):
But in architecture (B), you can say that we try to predict the MSB of the message, so we align the MSB of the CRC polynomial with MSB-1 of the message, something like the following:
For more details about this operation, I can recommend this tutorial.
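One further way to see why no appended zeros are needed can be checked numerically. This is a sketch using the question's polynomial 100111, not the polynomial from the example above, and the helper names are hypothetical. The idea is that in the structure that finishes in 8 shifts, each data bit enters already multiplied by x^5, so after k bits the register holds (first k bits of the message times x^5) mod G. In recurrence form that is R_next = (x * R + m * x^5) mod G, where m is the incoming bit.

```python
POLY = 0b100111            # question's generator: x^5 + x^2 + x + 1
WIDTH = 5
MASK = (1 << WIDTH) - 1

def poly_mod(value, poly=POLY):
    """Remainder of value / poly over GF(2), by plain modulo-2 long division."""
    while value.bit_length() >= poly.bit_length():
        value ^= poly << (value.bit_length() - poly.bit_length())
    return value

def step_input_at_msb(reg, bit):
    """One shift of the structure that XORs the data bit with the register MSB."""
    feedback = ((reg >> (WIDTH - 1)) & 1) ^ bit
    reg = (reg << 1) & MASK
    return reg ^ (POLY & MASK) if feedback else reg

data = [1, 0, 0, 1, 0, 0, 1, 1]
reg, prefix = 0, 0
for bit in data:
    reg = step_input_at_msb(reg, bit)
    prefix = (prefix << 1) | bit
    # Invariant: the register already equals (prefix * x^5) mod G,
    # i.e. the zeros are "pre-appended" at every step.
    assert reg == poly_mod(prefix << WIDTH)

print(format(reg, "05b"))   # 01010
```

Because the invariant holds after every shift, the register is already the final remainder once the last data bit has gone in, which is where the 5 saved shifts come from.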