A database challenge
From: Walter Meester, Hoogkarspel, Netherlands, 04/03/2008 01:46:12
To: 03/03/2008 15:23:16
Forum: Visual FoxPro
Category: Databases, Tables, Views, Indexes and SQL syntax; Miscellaneous
Thread ID: 01295541
Message ID: 01298352
Views: 29
>First of all, I have not tested this on SQL5, so I can only deduce from what I know of the optimizer.
>
>Walter, the point is that twice you thought table scans would be done - neither time was that correct. OK, now you're saying you're basing this on what you know of the optimizer (and from a prior version of SQL Server, correct?)

Hmmmm, can you point out where?? AFAIK, I said table scan or index scan (which depends on the indexes available).

>Now, both of our solutions wind up doing index scans in many instances, depending on the data. For example, a not-rare user lookup might be to find someone in NY who lives on "something Island" and is on a Blvd:
>
>
>select FirstName, LastName, Address, City, State, Zip
>     from Customers where state= 'NY' and city like '%ISLAND%'  AND ADDRESS LIKE '%BLVD%'
>
>In this case, the execution plan (I'm sure you won't be shocked by this) does an index scan on city, an index seek on state, a sort, a key lookup, and a filter.

>(As I'm sure you know, there are other instances where an index scan would occur, that's just one).

In the case above a whole index scan is the best you can achieve, because you're doing a wildcard search within the field (the '%' at the beginning). When you do a
City LIKE 'ISLAND%'
note that there is no wildcard at the beginning, so the optimizer can scan or seek a very small range of the index (skipping the unnecessary entries). With a million records or more this means significantly less I/O and CPU (you cannot trust the profiler to report the number of reads, because cached reads won't show up unless you restart SQL Server each time; duration should be reliable if you run it locally, because duration includes the network I/O time).
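To make the difference concrete, here is a minimal sketch (assuming a Customers table with a nonclustered index on City; the column list is just illustrative):

-- leading wildcard: the predicate cannot be turned into an index range, so the whole index is scanned
SELECT FirstName, LastName, City FROM Customers WHERE City LIKE '%ISLAND%'
-- no leading wildcard: the optimizer can turn this into a range seek (roughly City >= 'ISLAND' AND City < 'ISLANE') and only touches a small slice of the index
SELECT FirstName, LastName, City FROM Customers WHERE City LIKE 'ISLAND%'

Compare the two in the execution plan and you'll see the scan versus the seek.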

>The following is a "hunch", I won't state it with certainty - I think your approach may be a tad faster than mine when the core filter result is fairly small, whereas mine might be a tad faster as the core result sets increase. But it's just a guess.

Normally that is up to the optimizer. The optimizer might decide to do an index scan rather than an index seek when the statistics indicate low selectivity. You are enforcing an index scan, whereas in my approach the optimizer has a choice: if it thinks an index scan is better than an index seek, it will do one. The optimizer of SQL Server is quite smart in this respect. You could of course still force one or the other with index or join hints, but I would not recommend it.
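Just to illustrate what I mean by hints (a sketch only, the index name IX_Customers_City is made up):

-- let the optimizer pick a seek or a scan based on the statistics
SELECT FirstName, LastName FROM Customers WHERE City LIKE 'NEW%'
-- force a particular index with a table hint (possible, but the statistics usually know better)
SELECT FirstName, LastName FROM Customers WITH (INDEX (IX_Customers_City)) WHERE City LIKE 'NEW%'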

>I'd not do it like that. If the Columns do not contain null values, you can do the following:
>
>SET @City = ISNULL(@city, '%')
>SET @Address = ISNULL(@Address, '%')
>SELECT * FROM Clients WHERE City LIKE @City AND Address LIKE @Address
>


>I actually like that - it didn't appear to impact the execution plan, but I like that for readability (and yes, THERE I agree with readability).

Whether it alters the execution plan depends on something called parameter sniffing. In my tests it did impact the plan a few times and afterwards it did not, which again points to parameter sniffing. Just test the following and look at the execution plan:
DECLARE @City VARCHAR(50), @Address VARCHAR(50)  -- declarations added so the snippet runs; assuming VARCHAR, use the columns' actual types
SET @City = 'NEW%'
SET @Address = '%'
SELECT * FROM Clients WHERE City LIKE @City AND Address LIKE @Address
versus
SELECT * FROM Clients WHERE City LIKE 'NEW%' AND Address LIKE '%'
In SQL2000 the first would do an index scan, a lookup, and a filter to get the result. The second skips the lookup and the filter, as it totally disregards the filter on Address.

I'm guessing parameter sniffing does the trick here, because a certain stored procedure I've got behaves exactly as I described.
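If you want to see the sniffing for yourself, wrap the query in a procedure (a sketch only; the procedure name and parameter types are made up):

CREATE PROCEDURE SearchClients
    @City VARCHAR(50) = NULL,
    @Address VARCHAR(50) = NULL
AS
    SET @City = ISNULL(@City, '%')
    SET @Address = ISNULL(@Address, '%')
    -- the plan is compiled for the parameter values of the first call and reused for later calls
    SELECT * FROM Clients WHERE City LIKE @City AND Address LIKE @Address
GO

EXEC SearchClients @City = 'NEW%'
EXEC SearchClients @Address = '%BLVD%'

Depending on which call comes first (and what is already in the plan cache) you can get quite different plans for the same procedure. Creating the procedure WITH RECOMPILE gives every call a fresh plan, at the cost of a compile each time.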

>It would depend on a large number of factors. With the trick I gave you above there would almost be no difference at all, because the database would execute the exact same execution plan (because of eliminating conditions).
>
>There wasn't a difference to begin with. Again, I like the code change for readability, but there was no real performance difference to begin with.

In the case of using a '%' at the beginning of a search expression (unless it is just '%') there would be no difference: an index scan is forced anyway.
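One more caveat about the ISNULL trick, and the reason the columns must not contain NULLs: LIKE '%' does not match a NULL value, so

SELECT * FROM Clients WHERE Address LIKE '%'

silently drops the rows where Address is NULL and is therefore not the same as a plain SELECT * FROM Clients.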