The exact phrase needs to be found
Message
From
18/12/2010 16:54:47
Dragan Nedeljkovich (Online)
Now officially retired
Zrenjanin, Serbia
 
 
General information
Forum:
Level Extreme
Category:
Other
Miscellaneous
Thread ID:
01385793
Message ID:
01493250
>>I once did that on dbfs, indexing for most of the words (omitted a few - articles, short lowercase words etc), on a fpt of about 190 megs, and the size of the index table was 59M dbf + 64M cdx, and it would give the result set blazingly fast. Then searching for exact phrase in the result set, using simple atc() on the phrase was easy and just as fast.
>
>There must be a tremendous percentage of words that occurs multi fold. A reduction from 190 to 59 is less than I'd expect. How come?

Not your ordinary language. It was all in legalese. And, btw, that was the word-to-text links table; the words table itself was about 6M. There were some additional fields in the links table, like the position of the first appearance of the word in text and maybe one more, used later in sorting results by relevance.
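
For anyone who wants to try the idea, here is a minimal sketch of the two-step search described above. It is written in Python rather than VFP, with in-memory dicts standing in for the dbf tables. The stop-word list, the tokenizing rule and all names (`STOP_WORDS`, `build_index`, `find_phrase`) are assumptions made for illustration, not the original code. A lowercase substring test stands in for atc(). The first-appearance position is stored in the links structure as in the post, but the relevance sort that used it is not shown.

```python
import re
from collections import defaultdict

# Hypothetical stop list: the post only says articles and short
# lowercase words were omitted, not which ones.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "on", "by", "or", "and"}


def tokenize(text):
    """Yield (word, position) for every indexable word in text."""
    for m in re.finditer(r"[A-Za-z0-9']+", text):
        word = m.group().lower()
        if word not in STOP_WORDS:
            yield word, m.start()


def build_index(texts):
    """Build the word-to-text links.

    texts: dict mapping text id -> memo text.
    Returns a dict: word -> {text id: position of the word's first
    appearance in that text}.
    """
    links = defaultdict(dict)
    for text_id, text in texts.items():
        for word, pos in tokenize(text):
            # setdefault keeps only the first appearance
            links[word].setdefault(text_id, pos)
    return links


def find_phrase(phrase, texts, links):
    """Return the sorted ids of texts that contain the exact phrase."""
    words = [w for w, _ in tokenize(phrase)]
    if not words:
        return []  # nothing indexable in the phrase
    # Step 1: use the index to keep only texts that contain every
    # indexed word of the phrase.
    candidates = None
    for w in words:
        ids = set(links.get(w, {}))
        candidates = ids if candidates is None else candidates & ids
    # Step 2: case-insensitive substring check on the small result set
    # (this is the part atc() did in the original).
    needle = phrase.lower()
    return sorted(t for t in candidates if needle in texts[t].lower())
```

The expensive part (scanning memo text) only runs on the texts that survived step 1, which is why the phrase check stays cheap even on a large fpt.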

back to same old

the first online autobiography, unfinished by design
What, me reckless? I'm full of recks!
Balkans, eh? Count them.