>>>As you do run significant numbers (50 million), be sure to think about hash collisions - ALL hash functions, CRC included, have information loss, which might result in false positives. Very early in computing (disc space being VERY costly then) I had built a structure identifying duplicates via 3 different hashes taken together to form the key, and even then checked for exact duplication and incremented a trailing integer in case of collisions...
>>
>>Could you explain more about "information loss" and "false positives"?
>>
>>It seemed that the hash method was sufficient to obtain an ID for the file. Are you saying that doing it twice might result in a different value sometimes?
>
>All you need to do is process the contents of your files once, with a suitable cryptographic hash function, and store the resulting digests:
>http://en.wikipedia.org/wiki/Cryptographic_hash_function
>If you don't need high security MD5 is fine. SHA is better. Even SHA-1 with a 160 bit digest has 2^160 possible digest values, which is over 10^48. The chances of collisions in 50 million (5x 10^7) files is vanishingly small; if you actually got one you should notify the crypto community (not kidding).
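The scheme described above - hash each file's contents once and store the digest - can be sketched in a few lines of Python. The chunked reading, the `find_duplicates` helper, and the choice of SHA-1 are my own illustration, not anything from the original posts:

```python
import hashlib

def file_digest(path, algo="sha1", chunk_size=65536):
    """Hash a file's contents in chunks so large files never sit fully in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(paths):
    """Group paths by digest; any group of 2+ is (almost surely) duplicate content."""
    seen = {}
    for p in paths:
        seen.setdefault(file_digest(p), []).append(p)
    return [group for group in seen.values() if len(group) > 1]
```

Swapping `"sha1"` for `"sha256"` costs little and sidesteps SHA-1's (irrelevant here, but real) cryptographic weaknesses.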
Agreed that 2**160 offers much better odds against encountering collisions than 2**32 ;-) In my case the hash functions used at the time were bound by the CPU power of a 286 and a 6800, IIRC - the need for crypto had not yet surfaced to the general public. And the collision odds for 5*10**7 events in a 2**32 space are not that astronomical.
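A quick back-of-envelope check (my own, using the standard birthday-problem approximation P ≈ 1 - exp(-n(n-1)/2m)) makes both points concrete:

```python
import math

def collision_probability(n, bits):
    """Birthday approximation: P(collision) ~= 1 - exp(-n*(n-1) / (2 * 2**bits))."""
    m = 2 ** bits
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-n * (n - 1) / (2 * m))

n = 50_000_000  # 5 * 10**7 files
print(f"32-bit hash (e.g. CRC32): {collision_probability(n, 32):.6f}")
print(f"160-bit hash (SHA-1):     {collision_probability(n, 160):.3e}")
```

With 50 million files, a 32-bit hash collides with probability essentially 1 (the expected exponent is in the hundreds of thousands), while for a 160-bit digest the probability is far below 10**-30 - "vanishingly small" indeed.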