>Thanks buddy
>
>In fact, I thought of this approach earlier, but can we be "100%" sure that checksum value returned by the function is always relied on and will always be a one to one mapping with its larger counterpart (text).
>
>Can you explain the checksum story in VFP - or can you send me a link which explains the checksum in detail?
>
>Thanks a lot.
>Vikas
Vikas,
A checksum is a simple kind of hashing algorithm. It is simple enough that many other algorithms, such as the MD family (up to MD5), SHA and HMAC, were developed for high-security services. The CryptoAPI section on MSDN has descriptions, sample code etc.; look for the title "The Cryptography API, or How to Keep a Secret". Searching there for 'hash' brings up a very long list of results, including some C sample code.
This is from
http://www.math.utah.edu/~beebe/software/filehdr/node15.html :
"Consequently, a lot of research has been done on algorithms for finding checksums, and some have even achieved international standardization. One of these standard algorithms is known as a CRC-16 checksum. CRC stands for cyclic redundancy checksum, and the redundancy of following it with the word checksum is accepted practice. The CRC-16 checksum is capable of detecting error bursts up to 16 bits, and 99 percent of bursts greater than 16 bits in length. The checksum number is represented as a 16-bit unsigned number, encompassing the range 0 ... 65535. Thus, there is roughly one chance in 65535 of an error not being detected, that is, of two different files having the same checksum."
and this is from
http://mathworld.wolfram.com/CyclicRedundancyCheck.html :
"The method is not infallible since for an N-bit checksum, 1/2^N of random blocks will have the same checksum for inequivalent data blocks. However, if N is large, the probability that two inequivalent blocks have the same CRC can be made very small."
There are also other fast algorithms, like Adler-32, that produce 32-bit checksums.
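To make those numbers concrete, here is a small sketch in Python rather than VFP (only because it is easy to run anywhere). It uses `binascii.crc_hqx`, a 16-bit CRC from the Python standard library, which is not necessarily the same CRC variant that SYS(2007) computes. The helper names `crc16` and `first_collision` and the generated keys are made up for the illustration. With only 65536 possible values, any set of more than 65536 distinct strings must contain two with the same checksum, so the answer to "100% sure?" is no.

```python
import binascii

def crc16(text):
    """16-bit CRC (binascii.crc_hqx); the result is in the range 0..65535."""
    return binascii.crc_hqx(text.encode("utf-8"), 0)

def first_collision(texts):
    """Return (a, b) for the first two different texts sharing a checksum, or None."""
    seen = {}  # checksum -> first text that produced it
    for t in texts:
        c = crc16(t)
        if c in seen and seen[c] != t:
            return seen[c], t
        seen[c] = t
    return None

# Hypothetical keys, purely for illustration: 70000 distinct strings,
# which is more than the 65536 values a 16-bit checksum can take.
keys = ["product-%d" % i for i in range(70000)]
pair = first_collision(keys)
print(pair, crc16(pair[0]) == crc16(pair[1]))
```

A wider checksum makes a collision much less likely, but it never makes the mapping one to one.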
You might also want to check these :
www.4d.com/ACIDOC/CMU/CMU79786.HTM
www.topshareware.com/details.asp?556
www.dataman.com/files/S4/checksum.htm
www.zvon.org/tmRFC/RFC2960/Output/chapter19.html
www.relisoft.com/Science/CrcNaive.html
pascal.sources.ru/crc/pascrc32.htm
Even allowing that two different character strings might produce the same checksum, you can still optimize your search by putting the indexed checksum comparison first and the exact comparison after it, i.e.:
... where sys(2007,field) == sys(2007,lcSearch) and field == lcSearch ...
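The same "checksum first, exact match second" idea can be sketched outside VFP. The Python below is only an illustration under assumptions: `zlib.crc32` stands in for SYS(2007), a dictionary stands in for the index on the checksum expression, and the names `checksum`, `build_index` and `find` as well as the sample rows are hypothetical.

```python
import zlib
from collections import defaultdict

def checksum(text):
    """32-bit CRC of the text (zlib.crc32), standing in for SYS(2007, ...)."""
    return zlib.crc32(text.encode("utf-8"))

def build_index(rows):
    """Map checksum -> list of row numbers, like an index on the checksum."""
    index = defaultdict(list)
    for i, text in enumerate(rows):
        index[checksum(text)].append(i)
    return index

def find(rows, index, search):
    """Narrow by checksum first, then confirm with a full comparison."""
    candidates = index.get(checksum(search), [])
    return [i for i in candidates if rows[i] == search]

# Made-up sample data.
rows = ["long product text A", "long product text B", "long product text A"]
index = build_index(rows)
print(find(rows, index, "long product text A"))  # [0, 2]
print(find(rows, index, "not in the table"))     # []
```

The full comparison only runs on the few rows that share the checksum, so a collision costs one extra string comparison and never returns a wrong row.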
Reading about your purpose in the other branch, I don't think you need to index such a long text. Assign generated integer keys as the real productID and treat the users' productIDs as a kind of description. There wouldn't be that many products on any system, and even without an index that table could be searched fast. Combined with the above idea, you can have an indirect index for it (collisions, if they ever occur, would still pull out only a few records to check further).
Cetin