Bypassing the 16 field limit in index
Message
From: Mike Yearwood (Toronto, Ontario, Canada), 03/02/2014 11:15:20
To: Walter Meester (Hoogkarspel, Netherlands), 01/02/2014 17:09:58
General information
Forum: Microsoft SQL Server
Category: Indexing
Environment versions
SQL Server: SQL Server 2008
Application: Web
Miscellaneous
Thread ID: 01592711
Message ID: 01593005
Views: 37
>Mike, I think you misunderstand one important issue of compound indexes.

Walter, I am not misunderstanding anything. Michel is trying to allow his users to query any combination of 24 columns. Michel is finding that some composite indexes are being used when he runs certain queries. I am trying to point out that SQL Server will use a composite index even when it may not be the best choice, and even when that means a full index scan.

>
>Only the first column of a compound index can be used directly for an index seek. Therefore it matters a great deal which column comes first in the compound index.
>The other columns can only be used in seeks in combination with ALL the columns ahead of them in the index.
>In other cases, those columns can only be used in full index scans, which are much more I/O intensive. That is why execution plans often disregard the compound index when the search column is not the first column in the index.
>
>If the column to search on is frequently used in joins, it might be worth making that column the clustered index. Since the leaf nodes of the clustered index are the actual data rows of the table, it does not take much extra work to retrieve any other column from the same table. If the database is not heavily fragmented, it also has the benefit that the rows can be retrieved from traditional HDDs more efficiently, because all the rows for a particular key value are physically grouped together and the spinning disk can read them sequentially.
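To make that concrete with the testtable example further down (this is only a sketch, and remember a table can have only one clustered index):

-- Make field2 the clustered index; its leaf level is the data rows themselves,
-- so retrieving the other columns needs no key lookup.
CREATE CLUSTERED INDEX [cxfield2] ON [dbo].[testtable] ([field2])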
>
>
>>>>In VFP I would have indexed the columns the user might query on. Fox would build the bitmap for each column in the WHERE, and then AND or OR them as the WHERE clause conditions stated. Have you tried making single indexes per column without including other columns (not covered)? SQL Server does something similar to VFP: it does an index seek on each column referenced in the WHERE, then merge joins those results, then does a key lookup to get the record data, if you have a clustered primary key.
>
>Whether the clustered key is primary or not is irrelevant. Even if the clustered index is not unique, it would do the same; the key is simply padded with an additional unique identifier. Even with heap tables (tables without a clustered index) it still does a lookup (a RID lookup) that refers to the actual row location.

Thanks for clearing that up.

>
>>Each of the fields has its own single index. Those do not have included fields. There are also indexes that combine various fields.
>>Your users can decide to query on any combination of up to two dozen columns? My preferred approach is to build dynamic SQL and allow the user to query any way they want. SQL will build all the execution plans and cache them based on the queries your users construct. Take the user's query and sort the columns alphabetically. That will increase the chance of reusing an existing execution plan.
>
>Why would sorting the columns alphabetically help in reusing the execution plan? SQL Server can only reuse an execution plan if the query text is exactly the same. SQL Server keeps a CRC/hash of the query text to quickly determine whether a query is the same as one it has seen before.
>With parameterized queries, the parameter values are excluded from the comparison.

Yes. You're right. I knew that. I remembered sorting them in alpha order but that may just have been a debugging aid.
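For what it's worth, parameterizing the dynamic SQL is the more reliable way to get plan reuse. A minimal sketch using sp_executesql (the column types here are just assumed):

DECLARE @sql nvarchar(max)
SET @sql = N'SELECT * FROM [dbo].[testtable] WHERE field1 = @p1 AND field2 = @p2'
-- The parameter values are not part of the hashed statement text,
-- so repeated calls with the same text can reuse the cached plan.
EXEC sp_executesql @sql, N'@p1 varchar(50), @p2 varchar(50)', @p1 = 'text1', @p2 = 'text2'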

>
>>My understanding is to use INCLUDE to let SQL return column data from the index without accessing the table, covering the query. That is going to be complex if you have 2 dozen indexes.
>
>>If your user says field1 = 'text1' and field2 = 'text2' then you only need simple indexes on field1 and field2. You do not need a composite index. Having the composite index allows SQL to use it. In this example
>>
>>CREATE NONCLUSTERED INDEX [xfield2field1] ON [dbo].[testtable] ([field2], [field1])
>>CREATE NONCLUSTERED INDEX [xfield1] ON [dbo].[testtable] ([field1])
>>CREATE NONCLUSTERED INDEX [xfield2] ON [dbo].[testtable] ([field2])
>>
>>With this query
>>
>>SELECT * FROM [dbo].[testtable] where field2='text2'
>>
>>the execution plan shows that it used index xfield2field1. I suppose SQL uses the first index it can find. After deleting the xfield2field1 index, the above SQL used index xfield2, which is what I would have preferred. It's basically how Fox optimizes. I do not see any large benefit to it using xfield2field1.
>
>The situation would be different if you execute a query like
>
>SELECT * FROM TestTable WHERE Field2 = 'Test' AND Field1 = 'Me'
>
>This is because in the scenario of single-field indexes, SQL Server might decide (depending on the internal statistics and the uniqueness of either field) to use two indexes, having to merge them before arriving at the result.
>
>In the case of a compound index, it only has to process the one index and will find the result much quicker as well (less I/O).

But how can anyone create all the compound indexes needed to handle every combination of 24 columns? It's possible by creating a script, but that seems insane. That's what I'm trying to tell Michel. I said I do not see any benefit to it using xfield2field1, and you confirmed that it would end up doing a full index scan. So again, single-column indexes look like the way to go.
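Just to put a rough number on it: even ignoring column order, covering every possible combination of 24 columns would take 2^24 - 1 indexes.

-- Count of non-empty column combinations for 24 columns (order ignored)
SELECT POWER(2, 24) - 1 AS combinations  -- 16,777,215

Nobody is going to create and maintain that many indexes, script or no script.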

>
>
>>It also can use xfield2field1 to handle this query.
>>
>>select * from [dbo].[testtable] where field1='text1'
>>
>>If you do not have the simple index on field1 it will advise you to add it. Why? Are composite indexes not good enough?
>>
>>There may be a little benefit in using one composite index to handle several queries. How long would it take SQL to search a composite index of 16 keys to find that the 10th key is the one to use?
>>It must have to bypass 14 of the keys to find the two keys needed below.
>
>Do not worry about the time to find the key. That is immaterial. It's the usability of the key. SQL Server will have to process the whole index (all index rows) to find the value you are looking for. Taking the example above:
>
>SELECT * FROM TestTable WHERE Field1 = 'me'
>
>Since Field1 is in the second position of the index, SQL Server has to scan the whole index from top to bottom to find all instances of 'me'. That is far less efficient than a single index on Field1, where a single seek or a limited range scan can be used to get all rows.
>
>In short: the usability of a field within an index is determined by whether it is the first column of a compound index. Only the first column can be seeked on independently of the other fields in the index, because of the way the index is sorted.
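One easy way to see that for yourself, assuming the three indexes from the earlier example are still in place, is to compare the logical reads with and without the compound index forced:

SET STATISTICS IO ON
-- can seek on the single-column index xfield1
SELECT * FROM [dbo].[testtable] WHERE field1 = 'text1'
-- forces a scan of the compound index, because field1 is not its leading column
SELECT * FROM [dbo].[testtable] WITH (INDEX([xfield2field1])) WHERE field1 = 'text1'
SET STATISTICS IO OFF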
>
>
>>select * from [dbo].[testtable] where field1='text1' and field16='text16'
>>Is one composite index of 16 keys better than 16 simple (non-composite) indexes? In your case, you have two dozen keys. So that means you'd need at least 2 composite indexes (16 keys in one and 8 in the other). Are two composite indexes better than 2 dozen simple indexes? My *guess* would be to add only the 2 dozen simple indexes, avoid composite keys and not INCLUDE any other columns.
>
>It is not that simple. Compound indexes can be very helpful in cases where they fully cover the query. SQL Server then never has to touch the table. Take the example above:
>
>SELECT Field1, Field2 FROM TestTable WHERE Field2 = 'Test'
>
>With the proposed compound index on (field2, field1) the query is now fully covered by the compound index. SQL Server will not touch the table at all and will return the result in the most efficient way possible in SQL Server.

But to handle 24 columns, Michel would have to create every permutation of the columns as separate indexes.
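As an aside, the covering effect does not require putting field1 in the key at all; an INCLUDE column does the same job (the index name here is just made up):

CREATE NONCLUSTERED INDEX [xfield2_inc1] ON [dbo].[testtable] ([field2]) INCLUDE ([field1])

The index is still only seekable on field2, but the query above can be answered without touching the table.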

>
>
>However, stretching it to 24 fields sounds too far-fetched to me, since it will take up much more disk space, with its own set of problems. There must be better ways of optimizing the query in that case.

So you agree that trying to build composite indexes to handle all the permutations of 24 columns is a bad idea?

>
>>If you have 2 composite indexes and your user queries field1='text1' and field24='text24'
>>
>>SQL will have to exclude 22 keys across 2 composite indexes. It will have to merge join the results. Can that be any faster than using indexes on field1 and field24 and merge joining the results? The latter sounds simpler to me. What about 3 indexes? Is that only one merge join?
>>
>>The 24 indexes will do exactly what I expect. For 2 columns it will do 2 index seeks and 1 merge join; for 3 it will do three index seeks and two merge joins. I have not done any timing tests. Maybe 2 fully covering composite indexes are better? Can SQL update two composite indexes faster than 24 simple indexes? That may be a benefit too. It would make a nice article contrasting 2 fully covering composite indexes versus 24 non-covering simple indexes versus 24 fully covering simple indexes.
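If anyone wants to run that comparison, a minimal sketch of the single-column-index side would be something like this (reusing the earlier testtable names; xfield1 already exists in the example above, and field24 is assumed to exist):

CREATE NONCLUSTERED INDEX [xfield24] ON [dbo].[testtable] ([field24])

SELECT * FROM [dbo].[testtable] WHERE field1 = 'text1' AND field24 = 'text24'

With single-column indexes on both fields I would expect two index seeks, a merge join of the two result sets, and then key lookups (or RID lookups on a heap) for the remaining columns.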