Speeding up like queries

Level Extreme platform

Subscription

Corporate profile

Products & Services

Support

Legal

Français

Speeding up like queries

Message

From

26/11/2006 15:37:13

Dragan Nedeljkovich (Online)
Now officially retired
Zrenjanin, Serbia

26/11/2006 09:45:37

Yh Yau
Ingenuity Microsystems Sdn Bhd
Kuala Lumpur, Malaysia

General information

Forum:

Visual FoxPro

Category:

Coding, syntax & commands

Title:

Re: Speeding up like queries

Environment versions

Visual FoxPro:

VFP 6 SP5

OS:

Windows XP SP2

Network:

Windows 2003 Server

Database:

Visual FoxPro

Miscellaneous

Thread ID:

01172471

Message ID:

01172509

Views:

>Hi all,
>I'm using a like "%variable_name%" condition in my SQL query for a large table (>1 million records) and it takes a long time for retrieval.
>The field is a character field and is already indexed.
>Any strategies that can be used to speed up this query?

There is a huge difference between a like "xxxx%" and a like "%xxxx%" query. The former works just like the regular = query, i.e. it returns only key values beginning with xxxx, which are already grouped and probably in a single b-tree node. The latter, however, can expect to find xxxx anywhere, so VFP has to traverse the whole tag to check for any occurrence of xxxx in any key value anywhere. It's just not Rushmore optimizable then.

My solution to this (and for queries into long texts in memo fields) was to have two additional tables - one with words, and a link table with WordPK, SourceRecordPk, Relevance. The relevance field is calculated from the length of the word, its position in the text, length of the text - longer words score more, closer to start scores more than closer to end, longer texts score more than shorter. I'd run some code to fill these tables each time a record in the source table changes.

Then upon search (also on search on multiple words), I'd pick up the WordPkeys of the words I was looking for, and do a join on source table via the link table. I got a speed gain of about 20 times on the average, compared to search on a$b and b like "%a%". I'm using it on whole words only, but I figure it'd be pretty much at the same speed if I used the like "xxx%" query. The relevance field comes into play only at sorting the results - the records with the highest total relevance (again, more for the first word, less for second etc) come on top.

back to same old
the first online autobiography, unfinished by design
What, me reckless? I'm full of recks!
Balkans, eh? Count them.

Map

View

Click here to load this message in the networking platform