Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
Primary and Candidate
Message
From
07/08/2001 12:25:16
Keith Payne
Technical Marketing Solutions
Florida, United States
 
 
To
03/08/2001 04:08:48
Walter Meester
HoogkarspelNetherlands
General information
Forum:
Visual FoxPro
Category:
Databases,Tables, Views, Indexing and SQL syntax
Miscellaneous
Thread ID:
00538812
Message ID:
00540770
Views:
15
>Hi keith,
>
>>>
>>>FromPC I, ToPC I, Distance I
>>>1614      1616    2101  (meters)
>>>1614      1617    4601
>>>1614      1618    6708
>>>....
>>>This table contained about 16 million records and is about 200 MB in size. What am I going to do with surrogates here ? If I would force surrogates the table would be split into two:
>>>
>>>
Table1:
>>>Pc_pk I, Pc_pstfk I, Pc_pstfk2 I, Distance I
>>>1        1           2            2101
>>>2        1           3            4601
>>>3        1           4            6708
>>>....
>>>
>>>table2:
>>>Pst_pk I, Pst_postalcode I
>>>1         1614
>>>2         1616
>>>3         1617
>>>4         1618
>>>....
>
>>Your example is not representative of a normalized database.
>The table more closely resembles the results of an aggregate query.
>
>But it is a table and cannot be contructed with a single query because the distance is not computable from any data in the system: in fact it is what I call a linking table. It links a distance to a combination of two postal codes. Such table is needed to determine the distance between two geographical seperated place troughout the netherlands. The table (in the first form) is perfectly normalized.
>
>>Most people would not include surrogate keys in output tables.
>
>I'm not sure what you mean by output tables. If your mean that leaf tables (tables without a child) have no need for surrogate keys, if some other key exist (In my example its the combination of FromPC and ToPc), you're right, and this exaclty the point i'm trying to make: Not every table needs a single integer surrogate key !!
>
>>Also, what is your viewpoint on partitioning a database with intelligent keys?
>
>I'm not sure what you mean by "Partitioning a database". I simply mean to tell that sometimes it makes more sense to use an intelligent key than a surrogate. For example a persons table:
>
>Pk I, Name, SSN, Sex, DayOfBirth, Postalcode, Street, housenumber, City, state, country
>
>
>To identify the record we use a surrogate primary key. No discussion about that.
>But imagine this table is used in a mortage application where different data applies for males and females (Males tend to die earlier). so then I've got to have a Sex table. O.K, Are you going to use a surrogate key here ?
>
>Imagine this table is going to be used in a horoscope application where the DayOfBirth is reffering to another table in which world wide events of that day are stored. Am I going to turn this field from date into a surrogate integer ?
>
>In the netherlands, If I have a complete postalcode table, I could extract the street and city automaticly. To normalize I should omit Street and City because this data can be computed with a join with the postalcode table. However, since my postalcode table does not contain info about foreign countries it might be wise not to do so, or else it will be impossible to enter the addres of these persons. This problem also arises when I try to turn the postalcode into a surrogate key.
>
>When not working with a postal codetable to look up streets and cities, You might want to use lookuptables for both cities and state. How are you going to define them:
>
>
Table State:
>PK I, code C(2), Fullname C(30)
>
>Table City:
>PK I, City C(30)
>OR
>
Table State
>Code (2), Fullname C(30)
>
>Table City:
>City C(30)
>
>Of course the same applies to country. I simply trying to tell:" Think if you need a surrogate or intelligent key in a particular situation". Many people here have an attitide of using surrogates everywhere while this might be complexing your application way more than use them when neccesary. If the described table (with all the describes childs) is written by a surrogate purist it looks like this:
>
>
>Table: Persons
>PK I, Name C(40), SSN C(15), Sex_FK I, DayOfBirth_FK I, PostalCode I, Street C(30), Housenumber C(7), City_Fk I, Country_FK I
>1   , Walter      10209348,   10456,   108756,          129487,       Majoraan,     37                13434,     23423
>Compare this with the mixed surrogate and intelligent key approach:
>
>PK I, Name C(40), SNN C(15), SEX C(1), DayOfBirth D, PostalCode C(7), Street C(30), Housenumber C(7), City C(25), Country C(25)
>1     Walter      10209348,  M,        07/07/1971,   1616 TP,         Majoraan,     37                Hoogkarspel,Netherlands
>
>We as programmers, are trying to make our programming code as readable as possible. This should also apply to our data: Don't use surrogate keys when not neccesary; it can make a mess of your database.
>
>Walter,

Walter,

I had written detailed responses to each of your points but I somehhow closed the browser without sending them! Unfortunately I have a lot to do today, so I can only re-write the conclusion:

We are all programmers, but I'd guess that less than 10% of us are experts at database design. We can all throw together a 3rd normal database fairly easily, but how many times do we go the extra mile to create a database that is as flexible as the code we write (5th normal form)? We design class libraries with layers of class definitions that we don't necessarily need right now, but our experience tells us we will need those layers in the future. It's the same with databases.

Ask yourself this question: When was the last time I updated a production database by adding tables without modifying the structure of any of the existing tables?

I know that this has rarely happened for me. I always end up adding a field to tie in the new tables somewhere. But a fully normalized database would not require any changes to the existing tables.

Sometimes a fully normalized database places unwanted strain on resources. Remember that you are supposed to design a fully normalized database FIRST, then remove the normalizations where the resource strain is highest. In this thread there was an inference that blindly using surrogate keys was a poor replacement for critical thinking. I'd say that using intelligent keys is a clear indication that a database is not fully normalized, and therefore has not been rigorously analyzed.

The only time it makes sense to not use a surrogate primary key is when we have not fully normalized the database.
Previous
Next
Reply
Map
View

Click here to load this message in the networking platform