I have to save the combination of last name, first name and birth date of a person as a hash. This hash is later used to search for the same person with exactly the same properties. My question is whether SHA-1 is a suitable algorithm for this.
As far as I understand SHA-1, there is virtually no possibility that two different persons (with different attributes) will ever get the same hash-value. Is this right?
If you want to search for a person knowing only those three attributes, you could store the SHA-1 in the database (or MD5 for speed, unless you have something like a quadrillion people to sample).
The hash is useless on its own, since the person's details cannot be read back out of it, but it works for searching a database. You just want to make sure that the three pieces of information match, so it is enough to concatenate them and hash the result.
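A minimal sketch of that concatenation in Python, assuming the three fields arrive as strings. The function name `person_key` and the `YYYY-MM-DD` date format are illustrative choices, not something taken from the question.

```python
import hashlib


def person_key(lastname: str, firstname: str, dob: str) -> str:
    """Return the SHA-1 hex digest of firstname + dob + lastname.

    person_key and the 'YYYY-MM-DD' date format are assumptions
    made for this sketch.
    """
    combined = firstname + dob + lastname
    return hashlib.sha1(combined.encode("utf-8")).hexdigest()


# Made-up example person.
key = person_key("Doe", "John", "1980-01-31")
print(key)
```

The hash only matches when the inputs are byte-for-byte identical, so "john" and "John" produce different keys.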
When you query, hash the query's fields the same way and check whether the two hashes match.
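A rough sketch of that lookup, with an in-memory set standing in for the database column of stored hashes. The `Query` class, `hash_query`, `person_exists` and every field name other than `DOB` are hypothetical names chosen for illustration.

```python
import hashlib


class Query:
    """Hypothetical query object; only the DOB attribute name comes from the answer."""

    def __init__(self, firstname: str, DOB: str, lastname: str):
        self.firstname = firstname
        self.DOB = DOB
        self.lastname = lastname


def hash_query(query: Query) -> str:
    # Same field order as when the record was stored: DOB in the middle.
    combined = query.firstname + query.DOB + query.lastname
    return hashlib.sha1(combined.encode("utf-8")).hexdigest()


# Stand-in for the database: a set of previously stored hashes.
stored_hashes = {hash_query(Query("John", "1980-01-31", "Doe"))}


def person_exists(query: Query) -> bool:
    return hash_query(query) in stored_hashes


print(person_exists(Query("John", "1980-01-31", "Doe")))  # True
print(person_exists(Query("Jane", "1980-01-31", "Doe")))  # False
```

In a real database the membership test would be an indexed equality lookup on the hash column.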
I put query.DOB in the middle because the first and last names might otherwise collide, for example if "JohnDoe Bob" was born on the same day as "John DoeBob". I'm not aware of names containing digits, so I think this will stop collisions like those ;)
But if this is a big database, I'd try MD5. It's faster, and although a collision is possible in principle, the chance is so small that for a data set like yours you will not realistically see one.
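The field-ordering argument can be checked directly. The sketch below hashes the two example people with the names adjacent and then with the birth date in the middle. The date value and the helper name `sha1_hex` are made up for the demonstration.

```python
import hashlib


def sha1_hex(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


dob = "1980-01-31"  # same (made-up) birth date for both people

# Person A: first name "JohnDoe", last name "Bob"
# Person B: first name "John",    last name "DoeBob"

# Names adjacent: both inputs are "JohnDoeBob1980-01-31".
names_adjacent_a = sha1_hex("JohnDoe" + "Bob" + dob)
names_adjacent_b = sha1_hex("John" + "DoeBob" + dob)
print(names_adjacent_a == names_adjacent_b)  # True

# Birth date in the middle: the inputs now differ.
dob_middle_a = sha1_hex("JohnDoe" + dob + "Bob")
dob_middle_b = sha1_hex("John" + dob + "DoeBob")
print(dob_middle_a == dob_middle_b)  # False
```

This relies on the date having a fixed format and on names containing no digits, which is the assumption stated above.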
To put that into perspective, the chance that two given inputs collide under MD5 is 1 / 2^128, which is about 2.9 × 10^-39.
I'm pretty sure you're not going to get a collision ;)
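As a rough sanity check of those odds, the sketch below computes 1 / 2^128 and a birthday-style estimate for a whole table of hashes. It treats MD5 as a uniformly random 128-bit function, which is an idealisation, and the population figure is only a hypothetical upper end.

```python
from fractions import Fraction

MD5_BITS = 128

# Chance that two specific, different inputs share a digest,
# under the idealised assumption of uniformly random 128-bit output.
pair_probability = Fraction(1, 2 ** MD5_BITS)
print(float(pair_probability))  # about 2.9e-39


def any_collision_bound(n_people: int, bits: int = MD5_BITS) -> float:
    """Approximate chance of at least one collision among n_people hashes.

    Uses the birthday-bound estimate: number of pairs divided by 2^bits.
    """
    pairs = n_people * (n_people - 1) // 2
    return float(Fraction(pairs, 2 ** bits))


# Hypothetical worst case: one record for every person on Earth.
print(any_collision_bound(8_000_000_000))  # on the order of 1e-19
```

Even with billions of records the estimate stays far below anything observable, which is why accidental collisions are not a practical concern here.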