Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
Eliminating dups using multiple tables
Message
From
11/06/2006 12:52:09
 
General information
Forum:
Visual FoxPro
Category:
Databases,Tables, Views, Indexing and SQL syntax
Environment versions
Visual FoxPro:
VFP 9 SP1
OS:
Windows XP
Database:
Visual FoxPro
Miscellaneous
Thread ID:
01128331
Message ID:
01128337
Views:
23
>Hi everybody,
>
>I wrote the following query
>
>select count(*) as cntDups, ;
>Patients.dDOB, Patients.cSSN, Patients.cSex_code, ;
>Names.cf_Name, Names.cl_Name, Names.cM_Initial, ;
>Address.cStreet1, Address.cCity, ;
>nvl(cAREA_CODE,space(3)) as cArea_Code, ;
>nvl(cEXCHANGE,space(3)) as cExchange, ;
>nvl(cLAST_FOUR, space(4)) as cLast_Four ;
>from Patients inner join Names ;
>on Patients.cPatients_pk = Names.cPointer_fk ;
>inner join Address ;
>on Patients.cPatients_pk = Address.cPointer_fk ;
>left join Phones ;
>on Patients.cPatients_pk = Phones.cPointer_fk ;
>group by 2,3,4,5,6,7,8,9, 10, 11, 12 ;			
>having cntDups > 1 ;
>order by 6,5 ;
>into cursor curDupPatients
>
>which produced 80K records (without Phone part there were 21K records).
>
>Now I want to to somehow de-dup these patients. First of all, what would be a logical assumption to de-dup? Leave the Patients, that have more transaction records associated with them? Or leave the patients with the newest cEntered_Date? Say, I want to use the second criterion. How can I identify these Patients? May be I can change my original query to include cPatients_pk? But how would I use the latest cEntered_Date?
>
>Once I select the master patients_pk, I still need all other patients PK, so I can re-point transaction records to the master PK. I can delete records in Address, Names, Phones, that point to other PKs after I re-point Transaction records.
>
>Could you please suggest some ideas? Should I try to scan this curDupPatients and try to populate cPatients_pk cursor?
>
>Thanks in advance.
___________________________________
Hi Nadya,

The most logical to me is to de-dup on Social Security Number (and I assume that no two persons can have the same)

Once you have identified those you can alter the ParentKey of all levels just below your patient table (say this is level 2). If the PatientPk is on levels 3 or more, a trigger with cascaded update may help

The obsolete Patients could be deleted then

Just a thought


*** make a good backup first ***
Gregory
Previous
Next
Reply
Map
View

Click here to load this message in the networking platform