>Hi everyone,
>
>I have an application called De-Duper, which identifies duplicates in a table based on one field or a combination of a few fields. Once duplicates are identified, the user can review, combine, delete, or edit them, etc.
>
>My colleague, who was testing this application, made a suggestion: the application should identify complete duplicates (i.e. records in which almost all fields are identical), then delete all of the duplicate records except one.
>
>I have one solution in mind, but I'm not sure it's the best. Besides, it would not be Rushmore-optimizable, so it may take a long time.
>
>Ok, here is what I have in mind:
>
>The table has a RecNum identifier field.
>
>1) Create a comma-delimited string listing all fields that should be checked. This part is already done.
>
>2)
> lcStr='select *, count(*) as cnt_dups from BldMstr group by '+m.lcFieldList+ ;
> ' having cnt_dups>1 into cursor curCompleteDups'
> strtofile(m.lcStr, m.lcTempPrg)
> compile (m.lcTempPrg)
> do (m.lcTempPrg)
> delete file (m.lcTempPrg)
>
>3) Create a similar string:
> m.lcDelStr='field1=curCompleteDups.field1 and field2=curCompleteDups.field2'...
> lcStr='delete for '+m.lcDelStr+' and RecNum<>curCompleteDups.RecNum'
> Create a temp program and execute it.
>
> Do you think it's feasible? Or can you suggest a better approach?
>
> Thanks in advance.
To delete complete duplicate records,
a fast way is to SELECT ... group by field1, field2... fieldLast into table xxx
The result table then holds one row for each distinct combination of the grouped fields.
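
Here is a rough sketch of how that could look. It is not tested against your data. The two-field list is only a placeholder for the string you already build in step 1, MIN(RecNum) is my own choice for deciding which record of each group survives, and the final DELETE assumes your VFP version accepts a subquery in DELETE - SQL.

```foxpro
* Placeholder list; in De-Duper this is the string built in step 1.
LOCAL lcFieldList
lcFieldList = "field1, field2"

* One row per distinct combination of the checked fields.
* MIN(RecNum) picks the record to keep from each group (an assumption;
* MAX(RecNum) would work just as well).
* Macro substitution (&) runs the generated SQL without a temp program.
SELECT MIN(RecNum) AS RecNum, &lcFieldList ;
   FROM BldMstr ;
   GROUP BY &lcFieldList ;
   INTO TABLE xxx

* Optional: mark every other record in the original table for deletion.
DELETE FROM BldMstr ;
   WHERE RecNum NOT IN (SELECT RecNum FROM xxx)
```

Only the grouped fields and the aggregate appear in the select list because, with SET ENGINEBEHAVIOR 80 or higher, VFP rejects a GROUP BY query whose select list contains non-aggregate fields that are missing from the GROUP BY clause. That would also affect the `select *` in your step 2. Records deleted this way are only marked, so a PACK is still needed to remove them physically.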
The weak wait for chance, The strong bid for chance,
The clever notch up chance, but The merciful give you chance.