Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
Finding mismatched data
Message
From
03/05/2012 10:32:15
Dragan Nedeljkovich (Online)
Now officially retired
Zrenjanin, Serbia
 
 
To
03/05/2012 09:45:30
General information
Forum:
Microsoft SQL Server
Category:
SQL syntax
Environment versions
SQL Server:
SQL Server 2008
Miscellaneous
Thread ID:
01543096
Message ID:
01543108
Views:
30
>No, I don't want similarly spelled names to show as a match, I'm trying to show anything that does not match so that the user can manually fix it. Here's some sample data, cman is the database at Headquarters and cmanSO is the database at the South Office.
>
>cman Agents
>
>Name          ID
>Frank          1
>Fred           2
>Naomi          3
>Nadya          4
>Justin         5
>Jonathan       6
>
>cmanSO Agents
>
>Name          ID
>Frank          1
>Fred           2
>Natasha        3
>Naomy          4
>Nadya          5
>
>My logic so far is that I need to first of all find those agents that have the same ID but different name:
>
>
SELECT HQAgents.Name as HQName
>		,HQAgents.ID as HQID
>		,SOAgents.name as soname
>	from cman.dbo.Agents hqagents join cmanso.dbo.agents soagents
>		on hqagents.ID = soagents.ID and hqagents.Name <> soagents.Name
>	order by HQName, soname
>
>So I should get this output:
>HQName ID SOName
>Naomi 3 Natasha
>Nadya 4 Naomy
>Justin 5 Nadya
>
>Then those that have the same name but different ID:
>
>
SELECT HQAgents.Name as HQName
>		,HQAgents.ID as HQID
>		,SOAgents.id as soid
>	from cman.dbo.Agents hqagents join cmanso.dbo.agents soagents
>		on hqagents.ID <> soagents.ID and hqagents.Name = soagents.Name
>	order by HQName, soname
>
>Output
>
HQName    HQID   SOID
>Nadya         4        5
>
>Then those that exist in cman but not in cmanSO based on Name (Jonathan in this case) and those in cmanSO not in cman based on name.
>
>I then have to do something about manually cleaning up any misspellings and then programmatically build up a new Agents table and somehow sort out the related tables to use the new Agent ID.
>
>I hope that explains it a bit clearer, but it's still murky in my head as to what to do.

Been there, and still have some hair left. Don't even remember how each case was solved, probably differently each time. My got feeling is that you're on the right track. So, just a small suggestion for the next step:

List anything that's not 100% matched, and save the pairings in a table. When the next batch comes in, use these pairings to clean up what was already matched so far (for example, it was already established that Jonhy and Johnny are the same person, and you have that in the table), so show only the new mismatches.

Eventually you'd end up with "Johnny is called Jonhy at location A, and Jhony at location C"... a sort of dictionary of various forms of the same names, that translate into one that you are taking as right.

back to same old

the first online autobiography, unfinished by design
What, me reckless? I'm full of recks!
Balkans, eh? Count them.
Previous
Next
Reply
Map
View

Click here to load this message in the networking platform