Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
Speeding up like queries
Message
From
26/11/2006 15:37:13
Dragan Nedeljkovich (Online)
Now officially retired
Zrenjanin, Serbia
 
 
To
26/11/2006 09:45:37
Yh Yau
Ingenuity Microsystems Sdn Bhd
Kuala Lumpur, Malaysia
General information
Forum:
Visual FoxPro
Category:
Coding, syntax & commands
Environment versions
Visual FoxPro:
VFP 6 SP5
OS:
Windows XP SP2
Network:
Windows 2003 Server
Database:
Visual FoxPro
Miscellaneous
Thread ID:
01172471
Message ID:
01172509
Views:
13
>Hi all,
>I'm using a like "%variable_name%" condition in my SQL query for a large table (>1 million records) and it takes a long time for retrieval.
>The field is a character field and is already indexed.
>Any strategies that can be used to speed up this query?

There is a huge difference between a like "xxxx%" and a like "%xxxx%" query. The former works just like the regular = query, i.e. it returns only key values beginning with xxxx, which are already grouped and probably in a single b-tree node. The latter, however, can expect to find xxxx anywhere, so VFP has to traverse the whole tag to check for any occurrence of xxxx in any key value anywhere. It's just not Rushmore optimizable then.

My solution to this (and for queries into long texts in memo fields) was to have two additional tables - one with words, and a link table with WordPK, SourceRecordPk, Relevance. The relevance field is calculated from the length of the word, its position in the text, length of the text - longer words score more, closer to start scores more than closer to end, longer texts score more than shorter. I'd run some code to fill these tables each time a record in the source table changes.

Then upon search (also on search on multiple words), I'd pick up the WordPkeys of the words I was looking for, and do a join on source table via the link table. I got a speed gain of about 20 times on the average, compared to search on a$b and b like "%a%". I'm using it on whole words only, but I figure it'd be pretty much at the same speed if I used the like "xxx%" query. The relevance field comes into play only at sorting the results - the records with the highest total relevance (again, more for the first word, less for second etc) come on top.

back to same old

the first online autobiography, unfinished by design
What, me reckless? I'm full of recks!
Balkans, eh? Count them.
Previous
Next
Reply
Map
View

Click here to load this message in the networking platform