Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
Eliminating duplicates
Message
 
 
To
All
General information
Forum:
Visual FoxPro
Category:
Databases,Tables, Views, Indexing and SQL syntax
Title:
Eliminating duplicates
Miscellaneous
Thread ID:
00516123
Message ID:
00516123
Views:
53
Hi everyone,

I have an application called De-Duper, which identifies duplicates in a table based on one field or combination of few fields. Once duplicates are identified, user can review them, combine, delete, edit, etc.

My colleague, who was testing this application, made a suggestion: this application should identify complete duplicates (e.g. almost all fields should be identical), then delete all duplicate records except one.

I have one solution in mind, but not sure, it's the best. Besides, it would not be Rushmore optimizable, so may take a long time.

Ok, here is what I have in mind:

Table has a RecNum Identifier field.

1) Create a list of all fields, which should be checked (comma-delimited) in a string. This part is already done.

2)
lcStr='select *, count(*) as cnt_dups from BldMstr group by '+m.lcFieldList+ ; 
    ' having cnt_dups>1 into cursor curCompleteDups'
    strtofile(m.lcStr, m.lcTempPrg)
    compile (m.lcTempPrg)
    do (m.lcTempPrg)
    delete (m.lcTempPrg)

3)  create similar string 
    m.lcDelStr='field1=curCompleteDups.field1 and field2=curCompleteDups.field2"...
    lcStr='delete for '+m.lcDelStr+' and RecNum<>curCompleteDups.RecNum'  
    create a temp program and execute it.
Do you think, it's feassible? Or can you suggest a better approach?

Thanks in advance.
If it's not broken, fix it until it is.


My Blog
Next
Reply
Map
View

Click here to load this message in the networking platform