I have been developing an antivirus in VB.NET. The virus scanner works fine, but I am looking for ways to speed up scanning, because large files take forever.
I detect viruses by matching binary signatures (converted to hex). I suspect I don't have to search the whole file to decide whether it is a virus; presumably there is a specific location and a specific number of bytes that I should scan instead. If anyone can help with this, please do.
Thanks in advance.
BTW, the virus signatures come from the hex signature collection of the ClamAV antivirus.
Perhaps your pattern scan is inefficient. I can scan for a pattern in a 7 MB file in about 1/20th of a second using code like this.
Note that if you really want to use code like this, you have to make a correction: you can't always set MatchedLength back to 0 when you realize that you aren't looking at a match, although that does work for this particular pattern. In general, the tail of what you have already matched may itself be the start of another occurrence, so you have to pre-process the pattern to know what to reset to on a mismatch. That pre-processing will not add significant time to the algorithm. I could make the effort to complete the algorithm correctly, but I won't do that now if your question is just about performance. I'm just demonstrating that it is possible to scan large files quickly if you do it correctly.
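One standard way to do the pre-processing mentioned above is the Knuth-Morris-Pratt failure table. The following is only a minimal sketch, written in Python rather than VB.NET to keep it short; `build_failure_table` and `find_pattern` are illustrative names, not the original code, and `matched_length` plays the role of MatchedLength.

```python
from typing import List


def build_failure_table(pattern: bytes) -> List[int]:
    """table[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it. This is the value to reset to on a mismatch."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def find_pattern(data: bytes, pattern: bytes) -> int:
    """Return the offset of the first occurrence of pattern in data, or -1."""
    if not pattern:
        return 0
    table = build_failure_table(pattern)
    matched_length = 0
    for position, byte in enumerate(data):
        # On a mismatch, fall back to the longest prefix that still fits
        # instead of resetting straight to 0.
        while matched_length > 0 and byte != pattern[matched_length]:
            matched_length = table[matched_length - 1]
        if byte == pattern[matched_length]:
            matched_length += 1
        if matched_length == len(pattern):
            return position - len(pattern) + 1
    return -1
```

Searching for `aab` in `aaab` shows why the fallback matters: a scanner that resets to 0 after the third byte misses the match at offset 1, while the table-driven version keeps one `a` matched and finds it.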
EDIT
Here are some thoughts about how you could match multiple patterns at once. This is just off the top of my head and I have not tried to compile the code:
Create a class to contain information about the status of a pattern:
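A rough idea of what such a class might hold, sketched in Python rather than VB.NET. The class name `PatternStatus` and its fields are assumptions for illustration; one instance is meant to represent one candidate match that started at a particular position.

```python
from dataclasses import dataclass


@dataclass
class PatternStatus:
    name: str                # hypothetical label for the signature
    pattern: bytes           # the raw signature bytes
    matched_length: int = 0  # how many bytes have matched so far

    def advance(self, byte: int) -> bool:
        """Consume one byte. Return False if this candidate no longer matches.
        Must not be called once is_complete is True."""
        if self.pattern[self.matched_length] != byte:
            return False
        self.matched_length += 1
        return True

    @property
    def is_complete(self) -> bool:
        """True once every byte of the pattern has been matched."""
        return self.matched_length == len(self.pattern)
```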
Declare a variable to track all the patterns that you need to check and index them by the first byte of the pattern for quick lookup:
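In Python terms this lookup could be a dictionary keyed by the first byte. The signature names and byte values below are made up for illustration, and the sketch assumes plain hex signatures with no wildcards.

```python
from collections import defaultdict
from typing import Dict, List


def index_by_first_byte(signatures: Dict[str, bytes]) -> Dict[int, List[str]]:
    """Group signature names by the first byte of their pattern."""
    index = defaultdict(list)
    for name, pattern in signatures.items():
        if pattern:  # skip empty patterns, which have no first byte
            index[pattern[0]].append(name)
    return dict(index)


# Hypothetical signatures, converted from hex strings to raw bytes.
signatures = {
    "sig_a": bytes.fromhex("4d5a9000"),
    "sig_b": bytes.fromhex("4d5a5045"),
    "sig_c": bytes.fromhex("cafebabe"),
}
patterns_by_first_byte = index_by_first_byte(signatures)
```

With this index, deciding whether a byte could start any signature is a single dictionary lookup instead of a pass over every signature.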
Check all the patterns that are currently a potential match to see if the next byte also matches on these patterns; if not, stop looking at that pattern at that position:
See if the current byte looks like the beginning of a new pattern that you would be searching for; if so, add it to the list of active patterns:
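Putting the last two steps together, a single pass over the file might look like the sketch below. It is written in Python rather than VB.NET, tracks each candidate as a plain `(name, matched_length)` pair to stay self-contained, and assumes plain hex signatures without wildcards; `scan` and the signature names are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def scan(data: bytes, signatures: Dict[str, bytes]) -> List[Tuple[str, int]]:
    """Return (signature name, start offset) for every match, in one pass."""
    by_first_byte = defaultdict(list)
    for name, pattern in signatures.items():
        if pattern:
            by_first_byte[pattern[0]].append(name)

    matches = []
    active = []  # candidates in progress: (name, matched_length)
    for position, byte in enumerate(data):
        still_active = []
        # Step 1: advance every candidate; drop those that stop matching.
        for name, matched_length in active:
            pattern = signatures[name]
            if pattern[matched_length] != byte:
                continue
            matched_length += 1
            if matched_length == len(pattern):
                matches.append((name, position - len(pattern) + 1))
            else:
                still_active.append((name, matched_length))
        # Step 2: start a new candidate for each pattern beginning with this byte.
        for name in by_first_byte.get(byte, ()):
            if len(signatures[name]) == 1:
                matches.append((name, position))
            else:
                still_active.append((name, 1))
        active = still_active
    return matches
```

Because every candidate remembers its own progress, overlapping starts are handled without a reset table, at the cost of keeping several candidates alive at once when many signatures share a prefix.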