Removing double records, math problem

Plateforme Level Extreme

Abonnement

Profil corporatif

Produits & Services

Support

Légal

English

Removing double records, math problem

Message

15/06/2007 15:46:13

Dragan Nedeljkovich
Now officially retired
Zrenjanin, Serbia

15/06/2007 15:15:32

Lennert Blom
Hoevelaken, Pays-Bas

Information générale

Forum:

Visual FoxPro

Catégorie:

Base de données, Tables, Vues, Index et syntaxe SQL

Titre:

Re: Removing double records, math problem

Versions des environnements

Visual FoxPro:

VFP 9 SP1

OS:

Windows XP SP2

Network:

Windows 2003 Server

Database:

Visual FoxPro

Divers

Thread ID:

01233421

Message ID:

01233642

Vues:

>>That's a good idea. It makes sense to start from the table with the minimum number of required deletions. As I mentioned before, exact result, i.e. absolutely correct proportion is not reachable, so one should try to get the best randomization.
>
>I have the feeling there is a mathematically best solution. Because the way records will be deleted affects real business money, and because a solution like yours we programmed left us with differences from the optimal solution in the range of 3-7%, I'm afraid our customers will not be satisfied.
>
>My experience says I have to look in the direction of a recursive algorithm. That's fine, but my experience won't come with that algorithm. It reminds me slightly of the problem of 8 queens on a chess board, relatively easy to solve with recursion.

A recursive algorithm would go from one to the next, or from all to one, either adding one or subtracting one level at each step. I think Ed's solution (with some randomization) would do the trick.

So, in practical terms, you need a few cursors:

C1 counters per table, one record per table, to hold the number of candidates, number of records to delete and number of records to keep.
C2 with the keys of the duplicates, the table they come from, status (done deleting or not)
and one cursor per original table, with duplicates only.

Going through C2, you first look for something to keep from table1. If you've found it, delete it, mark as done in C2 for the whole set, decrease the numbers in C1 (for the tables from which it will be deleted) and increase the counter for kept records (for the table from which it will be accepted). Then go through C2 again, looking for a deletable record from table2 etc. Any table whose counter reaches a max (of records to keep or records to delete) is automatically triggering a "do the rest of the records for this table no matter what". See whether this gets you into a blind alley or not ;).

back to same old
the first online autobiography, unfinished by design
What, me reckless? I'm full of recks!
Balkans, eh? Count them.

Répondre

Fil

Voir

Click here to load this message in the networking platform