Level Extreme platform
Subscription
Corporate profile
Products & Services
Support
Legal
Français
Surrogate vs natural vs artificial, clustered vs non
Message
From
09/09/2014 06:34:22
Walter Meester
HoogkarspelNetherlands
 
 
To
06/09/2014 08:16:23
Dragan Nedeljkovich (Online)
Now officially retired
Zrenjanin, Serbia
General information
Forum:
Microsoft SQL Server
Category:
Indexing
Environment versions
SQL Server:
SQL Server 2014
Application:
Web
Miscellaneous
Thread ID:
01603549
Message ID:
01607206
Views:
41
>>>>>I also have to admit to a change I needed to make. I used to accept the default of clustering on the primary key. It seems the only need for a clustered key is when you absolutely need a set of records returned in a cluster, such as all the lineitems for a given invoice. The invoice foreign key should be clustered on the line items table, not the primary key on either invoice nor lineitems. Right?
>>>>
>>>>That is better if you're doing a lot of (demanding) queries on them... correct. There might be nuances to this rule, but in general I'd say that is correct.
>>>
>>>I'd rather have these clustered by item code. By invoice, you generally need only when you're totaling and printing it, which is just about once or twice per document. Calculating and reporting, which may be done monthly and/or annually or even several times a day (some apps require recalculating stock on every change) requires access to records of the same item code.
>>>
>>>IOW, yes, cluster on the most frequently used order - just saying that the document number is not the most likely candidate for it.
>>
>>Indeed, it depends on the queries you want to optimize the most. Whether that is an invoiceno, invoicedate, itemcode or something else, is a decision that has to be made by the developer or DBA
>
>And I'd like to be more precise in what I meant, why document number is a bad choice. You need it only when editing/saving a document, and when printing it out. And that's when you don't really need the speed. Even the slowest method will still be faster than the printer and far faster than the data entry. And whatever slowness there may be, it's experienced by one user, the one manipulating the document, and they are well aware that saving it may take some time. Even office type software may take its time to save one document, even when it's a single file (well, seemingly - it's about 20 xml files in subfolders, zipped). However, about every user working with line items will like to have the line items available immediately - be it just viewing the history of an item (purchases, sales, stock) or waiting to pull the totals for a period (which used to be immediate with dbfs 20 years ago - don't see why it wouldn't be today).
>
>The definition of "most frequently used order" needs also to take not only the number of times it is used, but also the number of records read, IOW the total I/O.

Choosing a clustered index should always be done with the motivation to save I/O in the overall scheme of things. You'll have to determine where you need the optimal performance, and indeed that seldomly is at retrieving data for a single invoice / document but almost always in datamining functions. IOW, if you've got queries that perform a lot of I/O you might want to increase the performance in making the parts that retrieve ranges of values be able to use a clustered index.
Previous
Reply
Map
View

Click here to load this message in the networking platform