Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
Updating records in FoxPro tables
Message
 
General information
Forum:
Visual FoxPro
Category:
Databases,Tables, Views, Indexing and SQL syntax
Miscellaneous
Thread ID:
00565010
Message ID:
00565321
Views:
16
>>>< SNIP >
>>>Nadya,
>>>
>>>I think you should write FAQ 'How to delete duplicate records'. Include all four ideas: Message #565099, Message #565029, Message #565097 and Message #565101 with explanation of there Pros and Cons.
>>
>>Ok, I'll do. BTW, another method would be:
>>
>>select distinct * from originaltable into myNewTable
>>delete originalTable
>>rename file NewTable to OriginalTable, if we're talking about free tables
>>or zap original and append.
>
>It'll only work for exact duplicates (all fields). Using GROUP BY instead of DISTINCT can fix it.
>
>>However, I believe, the algorithm, I finally described with your corrections is the optimum here.
>
>You'll have to do some bench marking to prove it :)
>
>I still think that all of them can be useful.

Hi Sergey,

I finally submitted a FAQ (after third attempt). Here is my test program and results. According to my tests, the "Unique" approach is the fastest one. However, it might be related to OS, CPU, etc.
********************************************************************
*  Description.......: TestDups - this program tests 4 different methods to eliminate duplicates from a table
*  Calling Samples...: 
*  Parameter List....: 
*  Created by........:  Nadya Nosonovsky 10/07/01 07:17:40 PM
*  Modified by.......: 
********************************************************************
clear
create table TestDups (KeyFld I, NameFld C(60), DateFld D)
index on KeyFld tag KeyFld
set order to
local I, KeyFld I, NameFld, DateFld, lnLower, lnUpper, lnSeconds, lKeyField, lnKeyID
lnLower = 1
lnUpper = 500

for I=1 to 200000
     DateFld = date()-mod(I,7)
     NameFld="Test"+transform(I)
     KeyFld = int((m.lnUpper - m.lnLower + 1) * rand( ) + m.lnLower)
     insert into TestDups from memvar
next

*browse
******* Method one - Ron Darling

*!*     - The table cannot have deleted records before this operation
*!*     - It requires to delete all records in the table that could be slow on the big table
*!*     - It requires creating a new index and on the big table it could take a long time
*!*     - It requires to recall almost all records in the table that could be time consuming on the big table also.
clear all
close all
lnSeconds = seconds()
set deleted off
use TestDups exclusive
delete all
index on KeyFld tag UniqueKey unique
recall all
set order to
?"Method 1:",seconds()-m.lnSeconds
recall all && First condition
set deleted on
*brow
************* Method two -  Sergey Berezniker
clear all
close all
lnSeconds = seconds()
use TestDups
set order to KeyFld
lKeyField = null
do while!eof()
     if m.lKeyField= KeyFld
          delete
     else
          lKeyField= KeyFld
     endif
     skip
enddo
?"Method 2:",seconds()-m.lnSeconds
recall all && First condition
******** Method three
clear all
close all
lnSeconds = seconds()
select KeyFld, count(*) as CntDups from TestDups group by 1 ;
     having CntDups > 1 ;
     into cursor curDupes
select TestDups
set order to KeyFld
select curDupes     
scan && curDupes
     lnKeyID = KeyFld
     select TestDups
     seek m.lnKeyID
     skip   && leave the first record
     scan while KeyFld = m.lnKeyID
          delete
     endscan
endscan
?"Method 3:",seconds()-m.lnSeconds
select TestDups
recall all && First condition
*********Method four
clear all
close all
lnSeconds = seconds()
select * from TestDups group by KeyFld into cursor curNoDupes nofilter
use in TestDups
use TestDups excl in 0
select TestDups
zap
append from (dbf('curNoDupes'))
?"Method 4:",seconds()-m.lnSeconds
For 50000 records the results are:
Method 1: 0.423
Method 2: 1.556
Method 3: 0.904
Method 4: 0.531

For 100000 records the results are:
Method 1: 0.800
Method 2: 3.707
Method 3: 1.803
Method 4: 2.277

For 200000 records the results are:
Method 1: 2.094
Method 2: 100.774 && Very strange
Method 3: 6.592
Method 4: 4.155
If it's not broken, fix it until it is.


My Blog
Previous
Next
Reply
Map
View

Click here to load this message in the networking platform