I am hashing a file together with its name (the final path component) using the code shown below¹. I am getting these strange results (hash values simplified for readability):

| Machine | Hash |
|---|---|
| Windows 10 - local 1 | abc |
| Windows 10 - local 2 | abc |
| Windows 10 - CI - Machine 1 | abc |
| Windows 10 - CI - Machine 2 | abc |
| Linux RHEL 7 - local | abc |
| Linux RHEL 7 - CI | abc |
| Linux CentOS 7.9 - local | abc |
| Linux CentOS 7.9 - CI | abc_linux_other |

The interesting part is:
- On Windows it always works, regardless of the machine, the user, the Windows patch level, etc.
- On RHEL 7 it always works: on my local machine, on the CI machine, and when I SSH into the CI machine and run it manually (as a different user).
- On CentOS 7.9 it works locally, but in CI I get a different hash.

On all machines the same Python version is used, even the same Python installation!
I have absolutely no clue where this behavior comes from or where to look.

¹ Code:

    import hashlib
    from pathlib import Path

    MAX_READ = 4096
    some_file = Path("abc/abc.txt")

    cs = hashlib.sha256()
    buffer = some_file.name.encode("utf-8")
    cs.update(buffer)
    with open(some_file, "rb") as f:
        while buffer:
            buffer = f.read(MAX_READ)
            cs.update(buffer)
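
To narrow this down, a minimal diagnostic sketch could hash the file name and the file content separately and report a few byte statistics. This is not part of the code above: the helper name `describe` is hypothetical, and the line-ending counts rest on the unconfirmed assumption that the bytes on disk might differ on the CentOS CI checkout (for example through line-ending conversion) rather than Python behaving differently.

```python
import hashlib
from pathlib import Path


def describe(path):
    """Return separate digests and basic byte statistics for one file.

    Hypothetical helper for debugging only: it splits the combined hash
    into its two inputs so the differing part can be identified.
    """
    p = Path(path)
    data = p.read_bytes()                # raw bytes, no newline translation
    name_bytes = p.name.encode("utf-8")  # same name encoding as the original code
    return {
        "name_repr": repr(p.name),
        "name_sha256": hashlib.sha256(name_bytes).hexdigest(),
        "content_sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "crlf_count": data.count(b"\r\n"),  # Windows-style line endings
        "lf_count": data.count(b"\n"),      # all LF bytes, including those in CRLF
    }
```

Printing `describe("abc/abc.txt")` on each machine and comparing the output should show whether the name, the content, or both differ on the CentOS CI run.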