The exact phrase needs to be found
Message
From
18/12/2010 16:54:47
Dragan Nedeljkovich
Now officially retired
Zrenjanin, Serbia
 
 
To
18/12/2010 15:23:12
General information
Forum:
Level Extreme
Category:
Other
Miscellaneous
Thread ID:
01385793
Message ID:
01493250
Views:
57
>>I once did that on DBFs, indexing most of the words (omitting a few: articles, short lowercase words, etc.), on an FPT of about 190 MB. The index table came to 59 MB of DBF plus 64 MB of CDX, and it would give the result set blazingly fast. Searching for the exact phrase within that result set, using a simple ATC() on the phrase, was then easy and just as fast.
>
>There must be a tremendous percentage of words that occur many times over. A reduction from 190 to 59 is less than I'd expect. How come?

Not your ordinary language. It was all in legalese. And, btw, that was the word-to-text links table; the words table itself was about 6 MB. The links table also had some additional fields, like the position of the word's first appearance in the text and maybe one more, used later to sort results by relevance.
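
For anyone who wants to see the shape of it, here is a rough sketch of the two-step approach described in this thread: a word-to-text links table narrows the candidates, then a case-insensitive substring check (what ATC() does in VFP) confirms the exact phrase. It is written in Python rather than VFP, and it is not the original code. The stop-word list, the function names, and the relevance rule (earliest first appearance wins) are all assumptions made for illustration.

```python
import re

# Assumed stop-word list; the original only says articles and short
# lowercase words were left out of the index.
STOP_WORDS = {"a", "an", "the", "of", "in", "to", "and", "or"}


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build the links table: word -> {doc_id: position of first appearance}.

    docs is a dict mapping a document id to its text.
    """
    index = {}
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            if word in STOP_WORDS:
                continue
            # setdefault keeps only the first position seen for this word/doc
            index.setdefault(word, {}).setdefault(doc_id, pos)
    return index


def search_phrase(docs, index, phrase):
    """Return ids of documents containing the exact phrase, best match first."""
    words = [w for w in tokenize(phrase) if w not in STOP_WORDS]
    if words:
        # Step 1: candidates are documents that contain every indexed word.
        candidates = set.intersection(*(set(index.get(w, {})) for w in words))
    else:
        # Nothing indexable in the phrase, so every document is a candidate.
        candidates = set(docs)
    # Step 2: case-insensitive substring test, the ATC() equivalent.
    needle = phrase.lower()
    hits = [d for d in candidates if needle in docs[d].lower()]
    # Assumed relevance rule: earliest first appearance of any phrase word.
    hits.sort(key=lambda d: (min((index[w][d] for w in words), default=0), d))
    return hits


docs = {
    1: "The party of the first part shall indemnify the party of the second part.",
    2: "The second party shall notify the first part of the claim.",
    3: "Nothing herein shall limit liability.",
}
index = build_index(docs)
print(search_phrase(docs, index, "party of the first part"))
```

The expensive substring scan only runs over the handful of documents that survive step 1, which is why the exact-phrase check stays fast even on a large memo file.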

back to same old

the first online autobiography, unfinished by design
What, me reckless? I'm full of recks!
Balkans, eh? Count them.