Normalization, non-stuctural CDX and data buffering

Level Extreme platform

Subscription

Corporate profile

Products & Services

Support

Legal

Français

Normalization, non-stuctural CDX and data buffering

Message

From

12/01/2002 20:30:18

Rick Graves
Advanced Approach Ltd.
Hong Kong, Hong Kong

All

General information

Forum:

Visual FoxPro

Category:

Databases,Tables, Views, Indexing and SQL syntax

Title:

Normalization, non-stuctural CDX and data buffering

Miscellaneous

Thread ID:

00604007

Message ID:

00604007

Views:

Hello world,

I have run into a problem. To solve the problem, my next step would be to add a "duplicate field", thus violating one of the normalization rules. Before I do this, I would appreciate a 2nd (3rd, 4th, etc) opinion.

I am developing an application that tracks the sale of collectables on ebay. (Later, I may expand the program to include auctions on other sites.) So I have a table of items, which correspond to "items" (ebay term) put up for auction.

Obviously, I am not trying to track everything. Rather, I have a "list" (speaking generally) of things in which I am interested. I have a table of categories, a table of brands/manufacturers, and a table of model numbers/names. The model number/names table has fields for a category key and a brand/manufacturer key. Each model record must have a value in the category key and may or may not have a value in the brand/manufacturer key. (Most model numbers were supplied by by a single brand or manufacturer, but some were supplied by many brands/manufacturers.)

The term "category" above does NOT correspond to the "categories" in ebay. So in my application, I use a different word, "type". My type table has about 100 records, while all the stuff I am tracking can be found in less than 10 ebay caterories.

So much for symantics.

In the ebay categories which I am persuing, most items are not of interest to me, and those records have been recycled. When my program finds a match between the ebay item title and something on my "list" of things that I am tracking, I call that a "hit". I have been collecting data for over one year, so I have well over 100,000 records in the items database. These are the items which I have chosen to keep.

One "item" for sale on ebay may contain one and only one model number, or there could be several. So there is a separate table of "item hits" which has fields for the item key, type key, brand/manufacturer key, and model key. Obviously, there is a one-to-many relationship between items and item hits (although many items will have only one associated item hit record). So there are more records in the item hits table than in the items table.

To access the data, one would want to choose a type, brand/manufacturer and/or model number, and then browse the items that have been auctioned meeting the search criteria. From the user's standpoint, instant results are good. Waiting a resonable time (a moment) is OK.

With this large database, I have not been able to get any SQL based query or view to give results within an acceptable waiting time. I may not know how to use SQL well enough. But I understand that SQL must search the entire table, and does not use Rushmore to give quick results.

Rushmore works fast enough. I can set a filter on the item hits table and access the matching items.

Here is the rub: I want the list in chronological order, most recent on top. "Chronological" refers to the auction ending date/time. That is a field in the item table, but I am setting a filter on the item hits table.

I have experimented with using a non-structural CDX, indexing the item hits table on the related auction end date/time field in item table. This works, and works fast enough as long as I only want to display the list.

One big part of my program "reads" the ebay auction title (the one-line description) and compares words in the title to my "list" of things that I am tracking (tables for types, brands/manufacturers and model numbers/names). This is an automated process, employing techniques which I would loosely call "artificial intellegence" and "fuzzy logic". Sometimes the program gets it exactly right, sometimes not. So when I get the information for an item on the screen, I may want to fix (edit) the item hits.

Foxpro will not let me make most changes when the non-structural CDX is open. And Foxpro will not allow opening/closing CDX files while data buffering is on.

This is where I am "stuck".

So I have considered adding an auction end date/time field to the item hits table. This would be a repeat of the field in the items table -- NOT NORMALIZED!

What would YOU do?

Thanks,

Rick

Map

View

Click here to load this message in the networking platform