Surrogate vs natural vs artificial, clustered vs non

Level Extreme platform

Subscription

Corporate profile

Products & Services

Support

Legal

Français

Surrogate vs natural vs artificial, clustered vs non

Message

From

05/09/2014 09:27:22

Mike Yearwood
Toronto, Ontario, Canada

04/09/2014 20:30:18

Walter Meester
HoogkarspelNetherlands

General information

Forum:

Microsoft SQL Server

Category:

Indexing

Title:

Re: Surrogate vs natural vs artificial, clustered vs non

Environment versions

SQL Server:

SQL Server 2014

Application:

Web

Miscellaneous

Thread ID:

01603549

Message ID:

01607081

Views:

>>>>>I'd take an non-clustered sequential integer or bigint as the PK index. If I was able to rewrite all applications I created, I'd do that. Clustered index could be bery usefull on certain queries, but I'd be very carefull when applying that.
>>>>
>>>>Even integers can be generated twice, so I never understood the argument against GUIDs might be generated twice. The primary key option would prevent the duplicate key in both cases, and you'd have to generate another one and re-attempt the insert. A difference which makes no difference is no difference.
>>>
>>>Agreed. My main argument against a GUID is that it takes up quite a bit more space. Its less efficient. I'd use a GUID when records are created from disconnected systems.
>>
>>Well. GUIDs would help during company mergers. The NewSequentialID() function in SQL Server produces a GUID that is serialized to fill the index data pages, making inserts very fast, and more efficient that regular GUIDs. It takes up more space, than an int, but potentially far less than primary key on several data columns. :)
>
>Depending on how its stored, it at least takes 16 bytes to store and if you choose characters at least double of that. Again, not advocating compound keys in any way, but you can have 4 x integer columns for the same amount of space. Personally, space and performance (the more space an index uses, the more performance is impacted on larger number of rows) is more important to me.
>
>I'd avoid GUIDs when not having to deal with distributed data sources and I'd probably will go a long, long way to avoid using them as a PK (as opposed to an AK) at all cost.
>
>Walter,

Here's part of why I'm raising the discussion. How much time is actually wasted discussing 16 bytes versus 4 (int) or 8 (bigint)? This is an insignificant detail in the grand scheme. I've seen systems abusing scalar udfs crawling on their knees and crashing. That's a bigger problem (especially when it was a college professor that built the system)

Do dry-wallers debate using screws one third shorter/longer? Int limits the number of possible combinations, bigint must be unacceptable because it's twice the size of Int! Boo hoo! Let's use tinyint! RAM and drive space are cheap. I do think it would be unwise to mix GUID + int + bigint in the same system. Why not just pick one and stop trying to decide this per system or per table?

Some people don't even put a primary key on every table! The argument that the GUID can generate duplicates is not even valid, because as a PK a duplicate will be prevented.

Jimmy Nilsson started the Comb guid and since then Microsoft added the NewSequentialID function. These IDs are specifically sorted to benefit SQL Server's indexer and efficiently store keys in the index pages without the page splitting of regular totally random guids. By the way, guids do not need to be totally random - they just need to be totally unique.

Do newsequentialid keys take four times as long to query as int? If it's only a tiny percentage longer, what are the savings to an organization to not debate this regularly, to build systems with a reliable standard, a set of code that supports that standard, instead of several possible standards?

Map

View

Click here to load this message in the networking platform