PK and deleted records - Level Extreme

Plateforme Level Extreme

Abonnement

Profil corporatif

Produits & Services

Support

Légal

English

PK and deleted records

Message

17/01/2002 03:56:10

Walter Meester
HoogkarspelPays-Bas

16/01/2002 16:02:57

Hilmar Zonneveld
Independent Consultant
Cochabamba, Bolivie

Information générale

Forum:

Visual FoxPro

Catégorie:

Base de données, Tables, Vues, Index et syntaxe SQL

Titre:

Re: PK and deleted records

Divers

Thread ID:

00605083

Message ID:

00605969

Vues:

Hilmar,

>I think they do. The index for the candidate key can easily be filtered - for the primary key, this is a greater problem, because the index in the PK is more important for Rushmore (especially for joining tables).

Then add an extra normal index on that primary key. If you add a surrogate you probable want to index that one also, so you won't save indexes if you apply surrogate keys. In fact If you require the candidate to be unique, you need a a filtered candidate key. If additionally you need to optimize queries using this candidate key you'll have one index more that using the intelligent key.

>In the candidate key, you might not even need an index.

If this was the case, they can solve the problem by having a normal index for the primary key at the first place.

Bottom line is, that using surrogates is just a total different concept than using intelligent keys. I can't believe people trow in the concept of surrogate keys as a solution to this problem, because:

1. In a existing project with lots of development hours, you cannot, I repeat, cannot move from intelligent to surrogate keys. It will have tremendous effect on the whole application:

a. All table structures have to be changed.
b. All views have to be redefined.
c. All queries have to be rebuild.
d. The whole application has to be scanned for refference to the PK.

The solution surrogates, how appropriate as it may be for a given problem, is in most cases not a valid one. It will result in tremendous efforts to get the application working again as opposed to applying the FOR NOT DELETED() filter which can be done in a matter of seconds or minutes without any changes in the application.

2. As said before, the intelligent key becomes a candidate. Most likely the candidate has to be checked for uniqueness with a (filtered on NOT DELETED()) candidate key. For rushmore optimization you might add another normal one. Ending up with more indexes as in the intelligent key approach.

3. Unlike some guru here want you to believe, surrogates are not the mythical silver bullet. They also come with problems. Performance is one of them because surrogates ussually require more joins. Integrity checking might be another one. Troubleshooting, testing and maintenance might be a thrid. If the domain says that I cannot not change an Itemno whenever the item has been sold to a client, with intelligent keys, I could implement this with RI; with surrogates I'll have to write my own checking routine, which in most cases tend to be buggy. Don't get me wrong. Surrogates have their advantages, they have disadvantages also.

4. If your database evolves over the time, it might be better to use intelligent keys only (however: keep in mind that intelligent keys might also be generated which value only has to identify the record, see message Thread #601821 Message #603370) because you might want to declare an existing column as a foreign key which corresponds to a PK in a new table (Example: Declare the postalcode column as a foreign key in a new postalcode table). If you want to apply surrogates here, you'll have to do a conversion and adapting your application.

One of the most stupid argument about surrogate keys is, that its advantage lies in the fact that it allows you to change the intelligent candidate key without breaking RI. This is unthoughfull because:

1. We've got a RI mechanism that is called cascading update. This allows you to change a primary key, and the database cascades the update troughout all tables. Though this is absolutely slower than a surrogate key approach, it has the advantages that you can easely reject the update with a RI restrict. (You cannot change an itemno if it has been sold already).

2. In many cases, you don't even want the primary key to be changed. For example: you don't want invoicenumbers to be changed. You don't want to change itemnumbers when you've distributed cataloques to clients. you don't want to change a debno, because it will mess up you already printed statistics.

Then of course there is the real popular phrase, "DATE said 'Intelligent keys' aren't". This phrase is about 15 to 20 years old (maybe even older). Is it possible that time closes in on this issue? Is it possible that this statement was done because of limitations of available DBMSs, hardware or even experience ? Is it possible that the definitions of the terms Intellingent, surrogate, auto generated have changed ? AFAIK Date does not build database systems and only speaks out of his believes. Joe Pesco OTOH, is an SQL guru publishing in a.o. DB magazine and is regarded as a real expert in solving practical SQL problems. He recognizes the disadvantages of surrogates. Unlike Date, I still read a new article of Joe pesco every month in DB/M (dutch DB magazine). I can't believe that some VFP prominent use this phrase to prove they are right. If president bush says "VFP is slow", and all developers and managers are saying: "We don't do VFP development anymore, because president bush says VFP is slow", how would you as VFP react ? AT THE LEAST PROVE YOUR STATEMENT !!!! this is an ultimate argument of a non-statment.

Third, people try to relate this problem to the relational theory, which would say "Each record should have a unique PK.". And because VFP tables can have deleted() records, their PK value should be unique too. This is the greates humbug i've ever heared. First of all the phrase should be "Each tuple should have a unique PK". further the Relational Theory states that Records and tuples are not the same. Records are product specific implementations of tuples. For VFP applies that tuples are implemented in non deleted records. So deleted() records do not fall into the domain of the relational theory because in most applications they are regarded as non-existant. As a result the PK from the relational model should be implemented on non-deleted() records (so they do not play a role in RI). Conclusion: a PK should have the FOR NOT DELETED() filter. The practical problem that they are not optimizable is sad, but does not have anything to do with the issue (In the VFP 8 wishlist i've submitted an ER for making filtered indexes optimizable because AFAIK there are no logical objections to it).

I simply can't believe, that people still believe in the surrogate only humbug, while it is so clear that there also are disadvantages. Over the years we've talked about this subject numerous times. Each time it was recognized that surrogates might not be the best in every case, still there are some people persisting this unbelievable stupid standpoint. I could easely draw a parrallell to the use of the DELETED() tag which should boost rushmore's performance. It took me a few weeks to convince some high rated VFP experts that this was a missconception. In this case, I seem to fight a fairly lonely battle about this unbelievable dumb standpoint.

Walter,

Répondre

Fil

Voir

Click here to load this message in the networking platform