I am trying to compute the MD5 hash of a file with hashlib.md5() from the hashlib module.
So I wrote this piece of code:
import hashlib

Buffer = 128
f = open("c:\\file.tct", "rb")
m = hashlib.md5()
while True:
    p = f.read(Buffer)
    if len(p) != 0:
        m.update(p)
    else:
        break
print m.hexdigest()
f.close()
I noticed that update() gets faster as I increase the Buffer value to 64, 128, 256 and so on. Is there an upper limit I cannot exceed? I suppose it might only be a matter of available RAM, but I don't know.
Big (≈ 2**40) chunk sizes lead to MemoryError, i.e., there is no limit other than available RAM. On the other hand, bufsize is limited by 2**31-1 on my machine.
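A minimal sketch of such a chunked md5 reader, assuming a 2**15 default chunk size (an illustrative choice, matching the figure mentioned below) and passing bufsize straight through to open():

import hashlib
from functools import partial

def md5sum(filename, chunksize=2**15, bufsize=-1):
    # chunksize bounds memory use; bufsize is handed to open()
    # and, per the note above, is capped at 2**31-1.
    m = hashlib.md5()
    with open(filename, "rb", bufsize) as f:
        # iter() with a sentinel keeps calling f.read(chunksize)
        # until it returns the empty bytes object b"" at EOF.
        for chunk in iter(partial(f.read, chunksize), b""):
            m.update(chunk)
    return m.hexdigest()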
A big chunksize can be as slow as a very small one. Measure it. I find that for ≈ 10 MB files a chunksize of 2**15 is the fastest for the files I've tested.
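If you want to measure it on your own files, a rough timing loop along these lines compares a few chunk sizes (it reuses the md5sum() sketch above; the path is a placeholder to replace with a real test file):

import timeit

FILENAME = "c:\\file.tct"  # placeholder: point this at a ~10 MB test file

for exp in (10, 12, 15, 18, 20):
    chunksize = 2 ** exp
    # Time 10 full hashes of the file at this chunk size.
    elapsed = timeit.timeit(lambda: md5sum(FILENAME, chunksize), number=10)
    print("chunksize 2**%2d: %.3f s" % (exp, elapsed))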