Plateforme Level Extreme
Abonnement
Profil corporatif
Produits & Services
Support
Légal
English
Dotnetpro database performance contest
Message
De
14/05/2007 14:05:26
 
Information générale
Forum:
Visual FoxPro
Catégorie:
VFP Compiler for .NET
Divers
Thread ID:
01224231
Message ID:
01225314
Vues:
18
Hi Samuel,

>Even several bytes (16 bytes) Hash Functions does not warrant collision free results, but they indeed make the collision detection time lower.

I'ld say that the size of the return value should mainly be based on the no. of records needed to compare. If the quality of the hash fucntion is high it will distribute all records evenly across the number of bins characterized by the possible key values.

>The right approach should be one based in the record length, so the number of bytes of the hash function is dependant of the record length in order to keep the expected falses positives values small and at the same time the hash funciton not to be so time consuming. Someone could come with a length based formula that gives the most optimum hash length based in the time used to compute them and the savings in comparisons of false positives.

I think the hash functions for key distribution across bins are slightly differetn from the hash functions used in cryptography, but have not followed the issue.

>
>By the way if the hash function also is proportional to the byte values, i.e. if the hash value is lower for values as "abc" and greater for values "mbc", then it can be used to speed up the sorting and then just sort the few collisions. It is faster to sort just hashcodes and then perform the lenghty byte comparisons.

I think this would automatically mean "bad hash function" as the distribution across the bins would be defined by the first bytes. The idea itself is valid, but should be described as a order-keeping compression, which is different from the "fingerprint"-hash idea.

>So this is another application for sorting, and I can see others as grouping.
Yupp, this was one strategy mainly planned for select distinct and grouping large fields in the SQL interpreter I wrote back in the early 80ies, as I did not know the fox back then.

regards

thomas
Précédent
Répondre
Fil
Voir

Click here to load this message in the networking platform