Primary and Candidate - Level Extreme

Plateforme Level Extreme

Abonnement

Profil corporatif

Produits & Services

Support

Légal

English

Primary and Candidate

Message

02/08/2001 13:49:53

Walter Meester
HoogkarspelPays-Bas

02/08/2001 11:21:37

Hilmar Zonneveld
Independent Consultant
Cochabamba, Bolivie

Information générale

Forum:

Visual FoxPro

Catégorie:

Base de données, Tables, Vues, Index et syntaxe SQL

Titre:

Re: Primary and Candidate

Divers

Thread ID:

00538812

Message ID:

00538981

Vues:

Hi Hilmar,

>Oh, I forgot what I actually consider the main advantage. The user can change the SSN - or any other code, visible to the user, that identifies a record (examples: client number, product code), without propagating changes to other tables.

>And of course, there are disadvantages, as well. Programming can be more cumbersome in some cases. Codes visible to the user have to be looked up - this makes some things slower.

Imagine, you've got a table containing 10 to 20 codes which have relations to other tables. Joining 10 to 20 tables for a report is not exactly what I call fun. Sometimes it's better not to use surrogates. Esspecially when there are short codes (a few characters or numeric positions) that have meaning outside the database system (for example: The police uses codes to quickly communicate with eachother). Often, these codes make reports more readable. In a complex report I really like to see codes like WA, NY, CA in stead of Washington, New York or California.

>But I firmly believe that the advantages outweigh the disadvantages. I used to program without these "surrogate keys" a while ago. My previous boss told me "You are violating a key database concept". It took me a while to see the advantages, but now you can consider me a "convert".

Another big disadvantage is debugging, ad-hoc reporting, and quick hacking problems. If you've got some tables that contain a high number of surrogate keys these task are far more difficult than with intelligent keys. With surrogates its not easy to find your way trough integer related fields in lookup tables. Take one of your old projects and see how easy it is to debug and create ad-hoc reports from the commandline.

In distributed database systems, a single incrementing integer field is not a good idea. When changes are replicated from one to another database, conflicts can easly occur. There should be different numberranges to avoid problems here. Some use 16 byte GUIDs but they take a lot of space. However if changes are replicated from an unknown number of sites, it might be applicable.

When adding one database to another surrogates can make your life to a hell. You can't append one table to the other because primary key uniqeuness conflict can easly occur. Also the same entities (E.g. Article "BIKE"), have different primary keys and as a result the system sees them as two different articles or violates the candidate key.

The autonumbering mechanism we use in VFP is not foolproof either. Too often i've been bitten by the fact that my PK table grew out of sync (because of conversions, or joining external data etc.), causing uniqueness errors at the user. Because surrogate are not visible thus not controllable from the users site it can only be solved by a synchronizing routine which makes thing worse if a replicated table still holds a deleted record which has been erased by a pack of the source table.

Then there is of course also the matter of size. Just yesterday I recieved a postalcode - distance table on CDROM. It simply looks like this.

FromPC I, ToPC I, Distance I
1614      1616    2101  (meters)
1614      1617    4601
1614      1618    6708
....

This table contained about 16 million records and is about 200 MB in size. What am I going to do with surrogates here ? If I would force surrogates the table would be split into two:

Table1:
Pc_pk I, Pc_pstfk I, Pc_pstfk2 I, Distance I
1        1           2            2101
2        1           3            4601
3        1           4            6708
....

table2:
Pst_pk I, Pst_postalcode I
1         1614
2         1616
3         1617
4         1618
....

Not anyone here is going to convince me that this is better than the first. Not only grew my table with 61 MB, but I have to join tables to make it readable. Joining such large tables is not going to be fast in any way.

Don't take me wrong: surrogate keys have their advantages and I use them in the majority of my tables, but watch out to use them blindly. They can cause you trouble also.

Répondre

Fil

Voir

Click here to load this message in the networking platform