That's exactly what I do next. Run several SQL commands, from very strict to very loose matching criteria, adding a key to indicate which match step each record matched at. Sort the output from strict to loose criteria and decide where you need human intervention.
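A minimal sketch of that tiered approach, using Python's sqlite3 so it is runnable anywhere. The table name, column names, and the three pass definitions are all invented for illustration; substitute your own fields and criteria.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, first TEXT, mi TEXT, last TEXT,
                     addr TEXT, city TEXT, st TEXT, zip TEXT);
INSERT INTO people (first, mi, last, addr, city, st, zip) VALUES
 ('JOHN', 'L', 'SMITH', '1234 ANYWHERE LANE', 'HERESVILLE', 'PA', '16555'),
 ('JOHN', '',  'SMITH', '1234 ANYWHERE LN',   'HERESVILLE', 'PA', '16555-2345'),
 ('MARY', '',  'JONES', '9 OAK ST',           'ELSEWHERE',  'PA', '17000');
""")

# Match keys, from strict to loose.  These expressions are examples only.
passes = [
    ("1-strict", "last || first || mi || addr || city || st || zip"),
    ("2-medium", "last || first || substr(addr,1,6) || city || st"),
    ("3-loose",  "last || first || substr(zip,1,5)"),
]

matched = {}  # id -> label of the strictest pass it matched at
for label, key in passes:
    rows = conn.execute(
        f"SELECT id FROM people WHERE {key} IN "
        f"(SELECT {key} FROM people GROUP BY {key} HAVING COUNT(*) > 1)")
    for (rid,) in rows:
        # setdefault keeps the earliest (strictest) pass for each record
        matched.setdefault(rid, label)

# Review list, sorted strict to loose
for rid, label in sorted(matched.items(), key=lambda kv: kv[1]):
    print(label, rid)
```

The two JOHN SMITH records above slip past the strict pass (the middle initial and address differ) but surface at the medium pass, which is exactly the kind of record you hand to a human.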
>>Standardize all the geocode fields to fixed fields in a standard format: Apartment to APT, Lane to LN, Street to ST. Break the ZIP code into two pieces. Then do your record linkage.
>>
>>
>>>I inherited a project from someone who wrote a long, convoluted program to look for duplicates in a table. I'm wondering whether it could be done more easily with SQL. Here's an example of a problem we have frequently:
>>>
>>>Record 1
>>>------------------------
>>>JOHN
>>>L
>>>SMITH
>>>1234 ANYWHERE LANE
>>>HERESVILLE
>>>PA
>>>16555
>>>------------------------
>>>
>>>Record 2
>>>------------------------
>>>JOHN
>>>
>>>SMITH
>>>1234 ANYWHERE LN
>>>HERESVILLE
>>>PA
>>>16555-2345
>>>------------------------
>>>
>>>Records 1 and 2 are the same person, but a SELECT DISTINCT does not eliminate one of them because the middle initial, address line, and zip code are not identical. My question is this: is there a way with SQL to identify these two records, which are in the same file, as "possible" duplicates, which I could then display to a user to decide what to do with them?
>>>
>>>Thanks.
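The pre-cleaning step suggested above (normalize the street suffix, split the ZIP+4) can be sketched as a small Python helper. The suffix table here is a toy; a real one (e.g. the USPS abbreviation list) is much longer.

```python
# Toy suffix map for this sketch only; real standardization needs the
# full USPS abbreviation list.
SUFFIXES = {"LANE": "LN", "STREET": "ST", "AVENUE": "AVE", "APARTMENT": "APT"}

def standardize(addr, zipcode):
    """Uppercase, abbreviate street suffixes, and split ZIP+4 in two."""
    words = [SUFFIXES.get(w, w) for w in addr.upper().split()]
    zip5, _, plus4 = zipcode.partition("-")
    return " ".join(words), zip5, plus4

print(standardize("1234 Anywhere Lane", "16555"))
# -> ('1234 ANYWHERE LN', '16555', '')
print(standardize("1234 Anywhere Ln", "16555-2345"))
# -> ('1234 ANYWHERE LN', '16555', '2345')
```

After this pass, the two address lines in the example compare equal, and SELECT DISTINCT (or a GROUP BY) has a fighting chance.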
>
>Or do a partial match on the first portions of critical fields, then an eyeball approach. (You can still try to pre-clean the data first, as Raymond mentioned.)
>
>
>select ;
>    upper( left( lastname, 10 ) + ;
>        left( firstname, 10 ) + ;
>        left( address1, 20 ) + ;
>        left( city, 10 ) + ;
>        left( state, 2 ) ) as dupkey, ;
>    count(*) ;
>    from SrcTable ;
>    group by 1 ;
>    having count(*) > 1 ;
>    into cursor PossibleDups
>
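The quoted VFP SELECT can be approximated in portable SQL; here is a sketch using Python's sqlite3, where SUBSTR plays the role of LEFT(). The table and sample rows are invented, with the two Smith addresses already pre-cleaned so that the partial key actually collides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SrcTable (lastname TEXT, firstname TEXT, address1 TEXT,
                       city TEXT, state TEXT);
INSERT INTO SrcTable VALUES
 ('Smith', 'John', '1234 Anywhere Ln', 'Heresville', 'PA'),
 ('SMITH', 'JOHN', '1234 ANYWHERE LN', 'Heresville', 'PA'),
 ('Jones', 'Mary', '9 Oak St',         'Elsewhere',  'PA');
""")

# Group on the first portions of the critical fields, as in the VFP query.
dups = conn.execute("""
    SELECT UPPER(SUBSTR(lastname,  1, 10) || SUBSTR(firstname, 1, 10) ||
                 SUBSTR(address1,  1, 20) || SUBSTR(city,      1, 10) ||
                 SUBSTR(state,     1, 2)) AS dupkey,
           COUNT(*) AS n
    FROM SrcTable
    GROUP BY dupkey
    HAVING COUNT(*) > 1
""").fetchall()

for row in dups:
    print(row)  # each row is a possible-duplicate key for eyeballing
```

Note that this only works because the addresses were standardized first; without that pre-cleaning, 'LANE' vs 'LN' keeps the two Smith records apart even on a partial key.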
Some days it's not worth chewing through the leather straps ...