>>Do you have nulls in either of these fields in these tables?
>
>No
Hi Michel,
I must admit I asked this question in another place since I found this question to be very interesting. Here is an answer from the great SQL Server expert George Mastros:
------------------------------------------------------------------------------------------------------------------------------------------------------
What you are describing is called a semi-join. There's plenty of information on google about it.
Here's one:
http://blogs.msdn.com/craigfr/archive/2006/07/19/671712.aspx Basically, if you don't need to use any of the data from the 'semi-joined' table, but want to return rows from the main table where row exists in the semi-joined table, it will perform better. Think about it this way, when you join tables, there can be multiple matching rows (think customers -> orders). A customer can have many orders. If you join the tables, SQL Server must match up all the columns from both tables that match the ON clause for the join. By using a semi join (your exists statement), SQL simply needs to check if the any row exists. If it does, then return that row. If not, it's filtered.
There biggest advantage to Exists is that it doesn't return any rows and can stop looking as soon as it finds a single match.
The performance improvement with exists (over a real join) varies depending on your data. If you have a one-to-one relationship, you will probably not notice much improvement in performance. However, if there are thousands of rows in the child table for each row in the parent table, the performance improvements can be huge because SQL doesn't need to match up all the rows, just check for the existence of a single row.
If it's not broken, fix it until it is.
My Blog