Going to post for a few days on Ranking....this was added in SQL 2005, though to this day some people are using it for the first time..
create table TestScores ( StudentName varchar(50), Score int)
go
insert into testScores values ('John', 100)
insert into testScores values ('Mary', 100)
insert into testScores values ('Sam', 95)
insert into testScores values ('Mark', 95)
insert into testScores values ('Jane', 91)
insert into testScores values ('Alex', 89)
And we want to produce the attached result, which generates three different ranking numbers:
select StudentName, Score,
row_number() over (order by Score desc) as RowNum,
rank() over (order by Score desc) as RankNum,
dense_rank() over (order by Score desc) as DenseRankNum
from TestScores
From the result set, note the column called RowNum - this is generated using the function row_number(). Row_Number generates a relative position based on the supplied order definition.
row_number() over (order by Score desc) as RowNum
Row_number won't account for ties.....so since John and Mary both have 100, it generates values of 1 and 2, based on the order of the heap or the clustered index
It doesn't have great analytic value - but it's good for generating an index value for paging result sets.
Then we have the standard rank() function. RANK() has similar syntax - and this one accounts for ties. So both John and Mary (both with 100) are "ranked" as #1. However, the next person(s) (Sam, and also Mark) are ranked #3 for their score of 95. So there's no "#2". RANK() will leave gaps following ties. Maybe the business wants that kind of value....or maybe they want to close any gaps.
In that case, you use DENSE_RANK()...which also accounts for ties, but closes gaps. So in Dense_rank(), John and Mary are #1, Sam and Mark are #2, etc. So DENSE_RANK places more emphasis on the fact that 95 is the second highest unique score.
Tomorrow I'll talk about filtering on RANKed values. and also ranking within groups. I