What's wrong with CRC32
Message
From
09/01/2014 16:39:46
To
09/01/2014 10:51:02
General information
Forum: ASP.NET
Category: Other
Environment versions
Environment: VB 9.0
OS: Windows 7
Network: Windows 2003 Server
Database: MS SQL Server
Application: Web
Miscellaneous
Thread ID: 01591437
Message ID: 01591593
This message has been marked as helpful in answering the initial question of the thread.
>>Since you are dealing with significant numbers (50 million), be sure to think about hash collisions: ALL hash functions, CRC included, lose information, which can result in false positives. Very early in computing (disk space being VERY costly then) I built a structure that identified duplicates via 3 different hashes taken together to form the key, and even then checked for exact duplication and incremented a trailing integer in case of collisions...
>
>Could you explain more about "information loss" and "false positives"?
>
>It seemed that hashing was a sufficient way to obtain an ID for the file. Are you saying that hashing the same file twice might sometimes give a different value?

All you need to do is process the contents of your files once, with a suitable cryptographic hash function, and store the resulting digests: http://en.wikipedia.org/wiki/Cryptographic_hash_function
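
To make that concrete, here is a minimal sketch in Python (not the thread's VB environment) of hashing each file once and keeping the digests. The names `file_digest` and `group_by_digest` are made up for illustration, and the choice of SHA-1 and a 1 MB chunk size are assumptions, not requirements.

```python
import hashlib
from collections import defaultdict


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it once in fixed-size chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_by_digest(paths, algorithm="sha1"):
    """Map each digest to the list of paths whose contents produced it."""
    groups = defaultdict(list)
    for p in paths:
        groups[file_digest(p, algorithm)].append(p)
    return dict(groups)
```

Hashing the same bytes twice always gives the same digest, so the stored value works as a stable ID. Any digest that maps to more than one path in `group_by_digest` marks files that are almost certainly identical; a byte-for-byte comparison of just those files can confirm it if you want to be thorough.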

If you don't need high security, MD5 is fine; SHA is better. Even SHA-1, with its 160-bit digest, has 2^160 possible digest values, which is over 10^48. The chance of an accidental collision among 50 million (5 x 10^7) files is vanishingly small; if you actually got one you should notify the crypto community (not kidding).
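
For a rough feel for the numbers, the usual birthday approximation says that n random digests of b bits give about n(n-1)/2 / 2^b colliding pairs. The sketch below is Python again; `expected_colliding_pairs` is a hypothetical helper, and it assumes digests behave like uniformly random values, which is only an idealization.

```python
def expected_colliding_pairs(n_items, digest_bits):
    """Birthday approximation: expected number of colliding pairs among
    n_items digests, assuming each digest is uniformly random."""
    pairs = n_items * (n_items - 1) // 2  # number of distinct pairs of items
    return pairs / 2 ** digest_bits


if __name__ == "__main__":
    n = 50_000_000  # the 50 million files discussed in the thread
    for name, bits in [("CRC32", 32), ("MD5", 128), ("SHA-1", 160)]:
        print(f"{name:6s} {bits:3d} bits: {expected_colliding_pairs(n, bits):.3g}")
```

Under that assumption a 32-bit checksum such as CRC32 gives on the order of 10^5 colliding pairs among 50 million files, while a 160-bit digest gives a figure around 10^-33, which is why the false positives mentioned in the quoted message are a real concern for CRC32 and not for SHA-1.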
Regards. Al

"Violence is the last refuge of the incompetent." -- Isaac Asimov
"Never let your sense of morals prevent you from doing what is right." -- Isaac Asimov

Neither a despot, nor a doormat, be

Every app wants to be a database app when it grows up