General information
Category:
Coding, syntax and commands
Okay, I am tired just looking at your list :-)
This is plaintiff data, and relatively small (only about 50k records after 40 years in business). And we don't get the address, phone data etc. until the file is settled (years into the process) - not that I would have wanted to add that into the mix.
My guess is that it would have taken a LONG time to work all that code into your searches.
What is "edit distance"? It sounds useful (for what, I have no idea).
Albert
>Yes to all except for 3, 4 and 13.
>Not PHdBase, but 2 similar approaches
>Plus a few more, like calculating edit distance as they call it today ;-)
>Plus doing the same on address fields
>plus contact data (multiple phone, email...)
>Plus a rule engine which can be tweaked for certain profiles of data
>Plus a scoring system so you can order the "relevance" estimated for further tweaking.
>
>used to run regularly on 1 - 9 million data entries, linked across more than a couple of related tables.
>looking for duplicates, weeding/singling out family groups
>target marketing
>
>system grew over a few years ;-)
>
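
On the "edit distance" question: the term usually refers to Levenshtein distance, i.e. the smallest number of single-character inserts, deletes or substitutions needed to turn one string into another. For duplicate hunting, two names with a small distance ("Smith" / "Smyth") are candidates for being the same person. Below is a minimal Python sketch of the idea, not the code the quoted system used. The function names and the `similarity` score are assumptions made up for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    inserts, deletes or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the row short
    # previous[j] = distance between the first i-1 chars of a and first j of b
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning i chars of a into "" costs i deletes
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or free match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical 0..1 score (1.0 = identical after trimming and
    lowercasing), usable for ranking candidate duplicates."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    print(edit_distance("Smith", "Smyth"))
    print(similarity("Albert ", "albert"))
```

Comparing every record against every other record gets expensive fast, so on larger tables it is common to compare only within a pre-filtered group (same first letter, same postal code, and so on) and then sort the pairs by score.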