>We are either looking for duplicates for data cleaning or matching entries on partially erroneous data.
>So those 4 names would get a high similarity score on company name, but if no other criterion hints at a match
>(phone number, address, contact person or similar), those 4 will be treated as different.
>
>Problematic cases are things like
>x 001 trust fund
>x 002 trust fund
>x 003 trust fund
>
>located at the same address with the same spokesperson: here we have to decide up front whether each separate
>corporate identity gets its own entry or all are bundled as one site (for mailing/contact scheduling, for instance).
>
>Hope that is clearer - if not, ask for specifics.
That's pretty much the conclusion we came to yesterday afternoon.
Thanks for the additional information.
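
To check that we read the rule the same way, here is a rough sketch of it in Python. It is only an illustration, not anyone's actual implementation: the field names (`name`, `phone`, `address`, `contact`), the 0.85 threshold, and the use of `difflib` for the name score are all assumptions on my part.

```python
from difflib import SequenceMatcher

# Assumed cutoff for "high similarity"; would need tuning on real data.
NAME_THRESHOLD = 0.85

# Hypothetical secondary criteria that can confirm a name match.
SUPPORTING_FIELDS = ("phone", "address", "contact")


def name_similarity(a, b):
    """Similarity of two company names in [0, 1], ignoring case and outer spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def supporting_fields(rec_a, rec_b):
    """Return the secondary fields that are filled in and identical in both records."""
    return [
        field
        for field in SUPPORTING_FIELDS
        if rec_a.get(field) and rec_a.get(field) == rec_b.get(field)
    ]


def is_probable_match(rec_a, rec_b):
    """A similar name alone is not enough; at least one other field must agree."""
    if name_similarity(rec_a["name"], rec_b["name"]) < NAME_THRESHOLD:
        return False
    return bool(supporting_fields(rec_a, rec_b))
```

With a rule like this, the "x 001 / x 002 / x 003 trust fund" records would be reported as matches as soon as they share an address or spokesperson, so the decision whether to merge them or keep one entry per corporate identity still has to be made up front, outside the matching step.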