Problem - Level Extreme

Plateforme Level Extreme

Abonnement

Profil corporatif

Produits & Services

Support

Légal

English

Problem

Message

24/07/2000 09:05:43

Ed Rauh
Trumbull, Connecticut, États-Unis

24/07/2000 08:53:57

Sandy Schechter
Axium International
New York City, New York, États-Unis

Information générale

Forum:

Visual FoxPro

Catégorie:

Base de données, Tables, Vues, Index et syntaxe SQL

Titre:

Re: Problem

Divers

Thread ID:

00396098

Message ID:

00396101

Vues:

>I've been posed with a problem to solve and I would like to get some input on the solution - any ideas would be greatlu appreciated.
>
>Given the following sample table:
>
>ForiegnKey DataField
>========== ==========
>1 1
>1 2
>1 3
>2 2
>2 3
>2 4
>3 2
>3 4
>
>We assume that the Foreign Key refers to a table of customers, so the problem will refer to "customers,"
>and that the DataField is a product number. We wish to determine the number of "crossovers"
>between any particular value of the data field. In other words, how many customer bought both products 2
>and 4?. In this case, customers 2 and 3 each have records where the data field is 2 and 4, so the answer
>is two.
>

If the number of products is small enough so that it'd be legal to construct a table with each column representing a product and each row a customer, generating a cross-tab, if necessary using the GENXTAB.PRG that comes with older versions of VFP, or the pivot table wizard, could be used; you'd then pose your query against the resulting crosstab, selecting the customers with non-zero values for the products under consideration.

>The conventional approach is to do a SQL self-join, something like the following:
>
>SELE a.DataField DF1,b.DataField df2,COUNT(*) as "TheCount";
> FROM TheTable a JOIN TheTable b ON a.ForeignKey = b.ForeignKey;
> GROUP BY df1,df2
>
>Which yields:
>
>DF1 DF2 TheCount
>=== === ==========
>1 1 1
>1 2 1
>1 3 1
>2 1 1
>2 2 3
>2 3 2
>2 4 2
>3 1 1
>3 2 2
>3 3 2
>3 4 1
>4 2 2
>4 3 1
>4 4 2
>
>To further explain, we can now look at the fifth row and say, "3 customer bought product 2," or look at row
>6 and say "two customers who bought product 2 also bought product 3," or look at row three and say, "only
>one customer who bought product 1 also bought product 3."
>
>The problem is that, when the number of records in the table gets large, this can take days, even if it does
>not crash out Fox or NT.
>
>OBJECTIVE
>Find a way to get the same result as the SQL self-join with linear performance based on the number of records
>in the source table.
>
>ASSUMPTIONS
>1. The table is already sorted on Foreign Key + Data Field
>2. The values for Data Field are known in advance and are stored in another table, usually not more than
>100-200 values
>3. The table can have many millions of records, so creating interim tables by breaking up the SQL into
>two passes risks the 2G limit.
>4. For the doubters, yes the table is indexed and the self-join still takes > 24 hours when indexed.
>
>Thanks

EMail: EdR@edrauh.com
"See, the sun is going down..."
"No, the horizon is moving up!"
- Firesign Theater

NT and Win2K FAQ .. cWashington WSH/ADSI/WMI site
MS WSH site ........... WSH FAQ Site
Wrox Press .............. Win32 Scripting Journal
eSolutions Services, LLC

The Surgeon General has determined that prolonged exposure to the Windows Script Host may be addictive to laboratory mice and codemonkeys

Répondre

Fil

Voir

Click here to load this message in the networking platform