Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
How do I find duplicate records?
Message
 
 
General information
Forum:
Visual FoxPro
Category:
Databases,Tables, Views, Indexing and SQL syntax
Miscellaneous
Thread ID:
00431045
Message ID:
00431643
Views:
12
Hi Vlad,

Once again: this update would not do what I want. Can you see? :)
1) You select correct duplicate addresses in your subquery.
Not let's imagine this situation: Address1 is the duplicate by my definition (it's duplicate and date within 7)
Record #  Address  Date  
1         Address1 date=mdate-1000
2         Address1 date=mdate-6
3         Address1 date=mdate-5
Records 2 and 3 should be marked, while record 1 should not, right?
You update would mark all of them as duplicates.

BTW, I do have index on address. See my other post with the whole program.

>>You group by ID field! :) You will have then all records from the table :)
>
>Sure ;)
>Than you will not be able to use this query even when you will remove ID field from grouping because ID will be returned incorrectly from subquery. Need another approach:
>
>Update MyTable
> set MyTable.flag='D'
> where (MyTable.ccode+MyTable.town+MyTable.street+str(MyTable.strnum,4)+...) IN
>(Select (MT.ccode+MT.town+MT.street+str(MT.strnum,4)+...) as cAddress
> from MyTable MT
> where MT.MyDateField betwen m.ldDate - 10 AND m.ldDate + 10
> group by cAddress
> having count(*)>1)
>
>Will work quite slowly unless you have index by (ccode+town+street+str(strnum,4)+...)
>
>
>
>>
>>>Hi!
>>>
>>>>Nope, this would not work. Do you want me to explain why or can figure it by yourself? :)
>>>
>>> Well, expression should be (MT.ccode+MT.town+MT.street+str(MT.strnum,4)+...), right? I did not tried such king of queries, so will be glad to know exactly what is wrong ;)
>>>
>>>>
>>>>>Still the same idea ;))
>>>>>
>>>>>Update MyTable
>>>>> set flag='D'
>>>>> where MyTable.IDField IN
>>>>>(Select MT.IDField
>>>>> from MyTable MT
>>>>> where MT.MyDateField betwen m.ldDate - 10 AND m.ldDate + 10
>>>>> group by (ccode+town+street+str(strnum,4)+...), MT.IDField
>>>>> having count(MT.IDField)>1)
>>>>>
>>>>>I hope this will work, I'm not sure if we can use exression in GROUP BY list.
>>>>>
>>>>>
>>>>>>>Hi!
>>>>>>>
>>>>>>>The same idea. 'Where' will filter out records for count() function.
>>>>>>>
>>>>>>>Select IDField, count(IDField) as nCount
>>>>>>> from MyTable
>>>>>>> where MyDateField betwen m.ldDate - 10 AND m.ldDate + 10
>>>>>>> group by IDField
>>>>>>> having nCount>1
>>>>>>
>>>>>>Hi Vlad,
>>>>>>
>>>>>>Actually, I need to mark all duplicate records in myTable (set flag='D'), which have the same address (ccode+town+street+str(strnum,4)+...) and date within the range. How can I do this?
>>>>>>
>>>>>>My current solution is quite combersome, so I'm open for the suggestions.
>>>>>>
>>>>>>Thanks in advance.
>>>>>>>
>>>>>>>>>>I'm switching an index to a primary and am getting a duplicate index message. How can I find the duplicates amoung 1000's without
>>>>>>>>>>looking through them all?
>>>>>>>>>
>>>>>>>>>Select IDField, count(IDField) as nCount
>>>>>>>>> from MyTable
>>>>>>>>> group by IDField
>>>>>>>>> having nCount>1
>>>>>>>>
>>>>>>>>This question was asked so often, what it definitely should be in a FAQ.
>>>>>>>>
>>>>>>>>BTW, a little more complicated variation of this problem:
>>>>>>>>Find all duplicates, which have the same Id field and date within date specified +-10.
If it's not broken, fix it until it is.


My Blog
Previous
Next
Reply
Map
View

Click here to load this message in the networking platform