General information
Category:
Databases,Tables, Views, Indexing and SQL syntax
If you need to remove the entirely identical records then what is wrong with :
select * dist into newtable
then rename and reindex?
<<<<<<<<<<<<<<< your message below >>>>>>>>>
Hi everyone,
I have an application called De-Duper, which identifies duplicates in a table based on one field or combination of few fields. Once duplicates are identified, user can review them, combine, delete, edit, etc.
My colleague, who was testing this application, made a suggestion: this application should identify complete duplicates (e.g. almost all fields should be identical), then delete all duplicate records except one.
I have one solution in mind, but not sure, it's the best. Besides, it would not be Rushmore optimizable, so may take a long time.
Ok, here is what I have in mind:
Table has a RecNum Identifier field.
1) Create a list of all fields, which should be checked (comma-delimited) in a string. This part is already done.
2)
lcStr='select *, count(*) as cnt_dups from BldMstr group by '+m.lcFieldList+ ;
' having cnt_dups>1 into cursor curCompleteDups'
strtofile(m.lcStr, m.lcTempPrg)
compile (m.lcTempPrg)
do (m.lcTempPrg)
delete (m.lcTempPrg)
3) create similar string
m.lcDelStr='field1=curCompleteDups.field1 and field2=curCompleteDups.field2"...
lcStr='delete for '+m.lcDelStr+' and RecNum<>curCompleteDups.RecNum'
create a temp program and execute it.
Do you think, it's feassible? Or can you suggest a better approach?
Thanks in advance.
Previous
Next
Reply
View the map of this thread
View the map of this thread starting from this message only
View all messages of this thread
View all messages of this thread starting from this message only