Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
How do I find duplicate records?
Message
 
General information
Forum:
Visual FoxPro
Category:
Databases,Tables, Views, Indexing and SQL syntax
Miscellaneous
Thread ID:
00431045
Message ID:
00431652
Views:
9
Hi Vlad,

This discussion gives me this idea:

1) select upper(...+...) as Situs from myTable group by 1 where between(date, mdate-10, mdate+10) having count(*)>1 into cursor curDups

update myTable set flag='D' where upper(...) in (select Situs from curDups) and between(date, mdate-10, mdate+10)

Seems to work (at least written here :)), though I have to check it.

Thanks a lot for your support.

>Hi Vlad,
>
>Once again: this update would not do what I want. Can you see? :)
>1) You select correct duplicate addresses in your subquery.
>Not let's imagine this situation: Address1 is the duplicate by my definition (it's duplicate and date within 7)
>
>Record #  Address  Date
>1         Address1 date=mdate-1000
>2         Address1 date=mdate-6
>3         Address1 date=mdate-5
>
>Records 2 and 3 should be marked, while record 1 should not, right?
>You update would mark all of them as duplicates.
>
>BTW, I do have index on address. See my other post with the whole program.
>
>>>You group by ID field! :) You will have then all records from the table :)
>>
>>Sure ;)
>>Than you will not be able to use this query even when you will remove ID field from grouping because ID will be returned incorrectly from subquery. Need another approach:
>>
>>Update MyTable
>> set MyTable.flag='D'
>> where (MyTable.ccode+MyTable.town+MyTable.street+str(MyTable.strnum,4)+...) IN
>>(Select (MT.ccode+MT.town+MT.street+str(MT.strnum,4)+...) as cAddress
>> from MyTable MT
>> where MT.MyDateField betwen m.ldDate - 10 AND m.ldDate + 10
>> group by cAddress
>> having count(*)>1)
>>
>>Will work quite slowly unless you have index by (ccode+town+street+str(strnum,4)+...)
>>
>>
>>
>>>
>>>>Hi!
>>>>
>>>>>Nope, this would not work. Do you want me to explain why or can figure it by yourself? :)
>>>>
>>>> Well, expression should be (MT.ccode+MT.town+MT.street+str(MT.strnum,4)+...), right? I did not tried such king of queries, so will be glad to know exactly what is wrong ;)
>>>>
>>>>>
>>>>>>Still the same idea ;))
>>>>>>
>>>>>>Update MyTable
>>>>>> set flag='D'
>>>>>> where MyTable.IDField IN
>>>>>>(Select MT.IDField
>>>>>> from MyTable MT
>>>>>> where MT.MyDateField betwen m.ldDate - 10 AND m.ldDate + 10
>>>>>> group by (ccode+town+street+str(strnum,4)+...), MT.IDField
>>>>>> having count(MT.IDField)>1)
>>>>>>
>>>>>>I hope this will work, I'm not sure if we can use exression in GROUP BY list.
>>>>>>
>>>>>>
>>>>>>>>Hi!
>>>>>>>>
>>>>>>>>The same idea. 'Where' will filter out records for count() function.
>>>>>>>>
>>>>>>>>Select IDField, count(IDField) as nCount
>>>>>>>> from MyTable
>>>>>>>> where MyDateField betwen m.ldDate - 10 AND m.ldDate + 10
>>>>>>>> group by IDField
>>>>>>>> having nCount>1
>>>>>>>
>>>>>>>Hi Vlad,
>>>>>>>
>>>>>>>Actually, I need to mark all duplicate records in myTable (set flag='D'), which have the same address (ccode+town+street+str(strnum,4)+...) and date within the range. How can I do this?
>>>>>>>
>>>>>>>My current solution is quite combersome, so I'm open for the suggestions.
>>>>>>>
>>>>>>>Thanks in advance.
>>>>>>>>
>>>>>>>>>>>I'm switching an index to a primary and am getting a duplicate index message. How can I find the duplicates amoung 1000's without
>>>>>>>>>>>looking through them all?
>>>>>>>>>>
>>>>>>>>>>Select IDField, count(IDField) as nCount
>>>>>>>>>> from MyTable
>>>>>>>>>> group by IDField
>>>>>>>>>> having nCount>1
>>>>>>>>>
>>>>>>>>>This question was asked so often, what it definitely should be in a FAQ.
>>>>>>>>>
>>>>>>>>>BTW, a little more complicated variation of this problem:
>>>>>>>>>Find all duplicates, which have the same Id field and date within date specified +-10.
If it's not broken, fix it until it is.


My Blog
Previous
Next
Reply
Map
View

Click here to load this message in the networking platform