I am nowhere near a perfect solution yet, but based on what little I have learned and observed from MS Indexing Service, PHDBase, and MSSQL2K, here is my approach for VFP.
I have a "main" table with a unique row ID; this holds my actual data. I then create a "keyword" table whose iRowID points back to the main table, plus a table of "noise words".
I scan the main table, reading through the column with the actual content, and use the noise words table to strip out words that would throw off the search - "a", "the", "that", and so on. I also convert the entire string to uppercase. I insert the leftover string into the keyword table, then strip off the first word, append it to the end of the string, and insert that rotation as well, repeating until every word has led the string once. So there will be n entries in the keyword table for every single entry in the main table, and the tables would look something like this:
main.dbf
iRowID   cText
1        Universal Thread Messages
2        Steve Craft

keywords.dbf
iRowID   cText
1        Universal Thread Messages
1        Thread Messages Universal
1        Messages Universal Thread
2        Steve Craft
2        Craft Steve
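The scan-and-rotate step above could be sketched like this (Python standing in for the VFP scan loop, since I can't paste the real code here; NOISE_WORDS and keyword_rows are names I made up for illustration, and I uppercase the stored string up front rather than in the index expression):

```python
NOISE_WORDS = {"A", "AN", "THE", "THAT"}  # illustrative noise-word list

def keyword_rows(row_id, text):
    """Strip noise words, uppercase, then emit one keyword row per
    left rotation of the remaining words."""
    words = [w for w in text.upper().split() if w not in NOISE_WORDS]
    rows = []
    for i in range(len(words)):
        rotated = words[i:] + words[:i]   # move the first i words to the end
        rows.append((row_id, " ".join(rotated)))
    return rows

# Row 1 of main.dbf yields three keyword rows, one per rotation:
# (1, 'UNIVERSAL THREAD MESSAGES'), (1, 'THREAD MESSAGES UNIVERSAL'),
# (1, 'MESSAGES UNIVERSAL THREAD')
for rid, kw in keyword_rows(1, "Universal Thread Messages"):
    print(rid, kw)
```

The point of the rotations is that any word of the original string can now be found with a left-anchored (and therefore index-friendly) match.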
I have an index on both columns in both tables.
If the search phrase is "universal messages", I take the first word, convert it to uppercase (to match the index expression in keywords.dbf), wash it through the noise words table, and SQL-Select the top 500 matching rows into an array. I also build a "like" string for my final SQL statement (more on that in a minute). I then do the same for every other word, so in the end I will have n arrays (I have a limit of 10 keywords in this KWIC-style search) of no more than 500 elements each.
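The per-word harvesting might look like this in Python (an illustrative stand-in for the VFP Top 500 queries; KEYWORDS, TOP_N, and candidate_ids are hypothetical names, and a list comprehension stands in for the indexed SELECT):

```python
TOP_N = 500  # mirrors the Top 500 clause: cap every candidate set

# hypothetical in-memory stand-in for keywords.dbf (iRowID, cText)
KEYWORDS = [
    (1, "UNIVERSAL THREAD MESSAGES"),
    (1, "THREAD MESSAGES UNIVERSAL"),
    (1, "MESSAGES UNIVERSAL THREAD"),
    (2, "STEVE CRAFT"),
    (2, "CRAFT STEVE"),
]

def candidate_ids(word):
    """Left-anchored match on the rotated strings - the index-friendly
    part of the search - capped at TOP_N rows."""
    w = word.upper()
    return [rid for rid, kw in KEYWORDS if kw.startswith(w)][:TOP_N]

print(candidate_ids("universal"))  # row 1, via the un-rotated entry
print(candidate_ids("messages"))   # row 1, via a rotated entry
```

Because of the rotations, "messages" hits row 1 even though it is not the first word of the original text.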
I then create a simple one-column cursor of iRowID and append from each array into the cursor. Finally I create a dynamic SQL string using the "like" string I built earlier (e.g. cWhereLike = cWhereLike + 'cText Like "%' + cCurrWord + '%" and '), again including a Top 500 clause and a Group By clause on iRowID. In the end, I end up with a very small set of iRowID values that I can point back at the original main.dbf.
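The final combine step can be sketched the same way (again Python as an illustrative stand-in; final_match and MAIN are invented names, and the all(...) test plays the role of the chained cText Like "%...%" clauses):

```python
def final_match(search_words, candidate_sets, main_rows):
    """Pool the candidate row IDs (the cursor + Group By), then keep only
    rows whose original text contains every search word - the unoptimized
    LIKE pass, cheap because the pool is already small."""
    pool = set().union(*candidate_sets)      # append from each array, de-duped
    words = [w.upper() for w in search_words]
    return sorted(rid for rid in pool
                  if all(w in main_rows[rid].upper() for w in words))

MAIN = {1: "Universal Thread Messages", 2: "Steve Craft"}  # main.dbf stand-in

# one candidate set per search word, as built in the previous step
print(final_match(["universal", "messages"], [{1}, {1, 2}], MAIN))  # -> [1]
```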
My thought is that Rushmore optimizes from left to right only, and does best at pulling small result sets from large ones. So in the first SQL queries, where I am building the arrays, I am matching from the left and keeping at most 500 rows per word - never more than 5,000 total, and in my tests always far less. Rushmore won't optimize the cWhereLike clause in my final query, but because I have a very small number of rows by that point, it hasn't mattered so far.
My testing so far has used words in a char column, not a memo column. The only real difference between the two would be the on-disk size of the keyword table, since the results of each query are limited as soon as possible. Wrapping the above methodology in DATETIME() calls shows that everything is occurring within the same second, which is fast enough for me. :)
There's some more code in between what I wrote to make it a little smoother, but anyone who has $.02, please jump in.
>Hello world,
>
>How do you implement a Google-like search utility in a FoxPro application? I assume someone has figured this out already.