Results 1 to 11 of 11

Thread: LEFT JOIN versus INNER JOIN

  1. #1

    Thread Starter
    VB Addict Pradeep1210's Avatar
    Join Date
    Apr 2004
    Location
    Inside the CPU...
    Posts
    6,614

    LEFT JOIN versus INNER JOIN

    In situations where both would produce same result (e.g. foreign keys on not-nullable fields etc.), which one should be preferred performance wise - INNER JOIN or LEFT/RIGHT JOIN?
    Pradeep, Microsoft MVP (Visual Basic)
    Please appreciate posts that have helped you by clicking icon on the left of the post.
    "A problem well stated is a problem half solved." — Charles F. Kettering

    Read articles on My Blog101 LINQ SamplesJSON ValidatorXML Schema Validator"How Do I" videos on MSDNVB.NET and C# ComparisonGood Coding PracticesVBForums Reputation SaverString EnumSuper Simple Tetris Game


    (2010-2013)
    NB: I do not answer coding questions via PM. If you want my help, then make a post and PM me it's link. If I can help, trust me I will...

  2. #2
    PowerPoster abhijit's Avatar
    Join Date
    Jun 1999
    Location
    Chit Chat Forum.
    Posts
    3,228

    Re: LEFT JOIN versus INNER JOIN

    Interesting question. I suppose a OUTER join would slow things compared to a INNER JOIN.
    Everything that has a computer in will fail. Everything in your life, from a watch to a car to, you know, a radio, to an iPhone, it will fail if it has a computer in it. They should kill the people who made those things.- 'Woz'
    save a blobFileStreamDataTable To Text Filemy blog

  3. #3

    Thread Starter
    VB Addict Pradeep1210's Avatar
    Join Date
    Apr 2004
    Location
    Inside the CPU...
    Posts
    6,614

    Re: LEFT JOIN versus INNER JOIN

    Not sure, but I think inner join should be slower because it has additional overhead of removing unmatched rows, which left/right outer joins don't have.
    Pradeep, Microsoft MVP (Visual Basic)
    Please appreciate posts that have helped you by clicking icon on the left of the post.
    "A problem well stated is a problem half solved." — Charles F. Kettering

    Read articles on My Blog101 LINQ SamplesJSON ValidatorXML Schema Validator"How Do I" videos on MSDNVB.NET and C# ComparisonGood Coding PracticesVBForums Reputation SaverString EnumSuper Simple Tetris Game


    (2010-2013)
    NB: I do not answer coding questions via PM. If you want my help, then make a post and PM me it's link. If I can help, trust me I will...

  4. #4
    A SQL Server fool GaryMazzone's Avatar
    Join Date
    Aug 2005
    Location
    Dover,NH
    Posts
    7,493

    Re: LEFT JOIN versus INNER JOIN

    I believe as abhijit does. The outer join performs slower the the inner join.
    Sometimes the Programmer
    Sometimes the DBA

    Mazz1

  5. #5

    Thread Starter
    VB Addict Pradeep1210's Avatar
    Join Date
    Apr 2004
    Location
    Inside the CPU...
    Posts
    6,614

    Re: LEFT JOIN versus INNER JOIN

    I tried to compare the two types like this:

    1. Created two tables:
    sql Code:
    1. USE [tempdb]
    2. GO
    3. SET ANSI_NULLS ON
    4. GO
    5. SET QUOTED_IDENTIFIER ON
    6. GO
    7. CREATE TABLE [dbo].[Table1](
    8.     [Column1] [int] NOT NULL,
    9.  CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED
    10. (
    11.     [Column1] ASC
    12. )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    13. ) ON [PRIMARY]
    14. GO
    15.  
    16. CREATE TABLE [dbo].[Table2](
    17.     [Column1] [int] NOT NULL
    18. ) ON [PRIMARY]
    19. GO
    2. Inserted some random records in both tables. Table1 is filled with numbers from 1 to 1000000, and table2 has random records selected from table1.
    sql Code:
    1. truncate table table1
    2. truncate table table2
    3.  
    4. declare @i int
    5. set @i = 1
    6. while @i < 1000000
    7. begin
    8.     insert into table1 values(@i)
    9.     set @i =@i + 1
    10. end
    11. select count(*) from table1
    12. go
    13.  
    14. insert into table2 select column1 from table1 where column1%3=0 or column1%7=0
    15. select count(*) from table2
    16. go
    17.  
    18. insert into table2 select column1 from table1 where column1%3=0 or column1%7=0
    19. select count(*) from table2
    20. go
    21.  
    22. insert into table2 select column1 from table1 where column1%5=0 or column1%11=0
    23. select count(*) from table2
    24. go
    25.  
    26. insert into table2 select column1 from table1 where column1%4=2
    27. select count(*) from table2
    28. go
    3. Created a stored procedure to benchmark results:
    sql Code:
    1. drop procedure BenchMarkPerformance
    2. GO
    3.  
    4. Create Procedure BenchMarkPerformance
    5. AS
    6. SET NOCOUNT ON
    7. DECLARE @Query1 nvarchar(2048)
    8. DECLARE @Query2 nvarchar(2048)
    9.  
    10. DECLARE @started datetime
    11. DECLARE @SampleSize bigint
    12. DECLARE @NumTime varchar(100)
    13. DECLARE @StrTime varchar(100)
    14. DECLARE @counter int
    15.  
    16. set @Query1 = 'select table1.column1 from table2 left  join table1 on table2.column1 = table1.column1'
    17. set @Query2 = 'select table1.column1 from table2 inner join table1 on table2.column1 = table1.column1'
    18. set @counter = 0;
    19.  
    20. -- ********** SAMPLE SIZE **********
    21. select @Samplesize = count(*) from table2;
    22. Print 'Sample Size : table2 = ' + cast(@SampleSize as varchar);
    23. select @Samplesize = count(*) from table1;
    24. Print 'Sample Size : table1 = ' + cast(@SampleSize as varchar);
    25. --***********************************
    26.  
    27. while @counter < 10
    28. begin
    29.     set @counter = @counter + 1;
    30.     dbcc freeproccache WITH NO_INFOMSGS ;
    31.     dbcc dropcleanbuffers WITH NO_INFOMSGS ;
    32.  
    33.     -- ********** STRING SEARCH ********
    34.     set @started = getdate();
    35.     EXECUTE sp_executesql @Query1;
    36.     set @StrTime = '    LEFT  JOIN: ' + CONVERT(VARCHAR(26), getdate() - @started, 114);
    37.     --***********************************
    38.  
    39.     -- ********** NUMERIC SEARCH ********
    40.     set @started = getdate();
    41.     EXECUTE sp_executesql @Query2;
    42.     set @NumTime = '    INNER JOIN: ' + CONVERT(VARCHAR(26), getdate() - @started, 114);
    43.     --***********************************
    44.  
    45.     print Cast(@counter as varchar) +  @StrTime + @NumTime;
    46.     waitfor delay '00:00:01'
    47. end
    4. Finally executed this stored procedure.
    sql Code:
    1. exec BenchMarkPerformance
    5. Results:
    Code:
    Sample Size : table2 = 1379869
    Sample Size : table1 = 999999
    1    LEFT  JOIN: 00:00:14:643    INNER JOIN: 00:00:13:483
    2    LEFT  JOIN: 00:00:14:280    INNER JOIN: 00:00:13:487
    3    LEFT  JOIN: 00:00:13:560    INNER JOIN: 00:00:14:017
    4    LEFT  JOIN: 00:00:13:330    INNER JOIN: 00:00:13:467
    5    LEFT  JOIN: 00:00:13:453    INNER JOIN: 00:00:14:453
    6    LEFT  JOIN: 00:00:13:390    INNER JOIN: 00:00:13:907
    7    LEFT  JOIN: 00:00:13:407    INNER JOIN: 00:00:13:500
    8    LEFT  JOIN: 00:00:13:403    INNER JOIN: 00:00:13:470
    9    LEFT  JOIN: 00:00:14:390    INNER JOIN: 00:00:13:547
    10    LEFT  JOIN: 00:00:13:423    INNER JOIN: 00:00:13:467
    So from above it seems like there is no difference in performance of the two and the whole exercise was a waste.
    Pradeep, Microsoft MVP (Visual Basic)
    Please appreciate posts that have helped you by clicking icon on the left of the post.
    "A problem well stated is a problem half solved." — Charles F. Kettering

    Read articles on My Blog101 LINQ SamplesJSON ValidatorXML Schema Validator"How Do I" videos on MSDNVB.NET and C# ComparisonGood Coding PracticesVBForums Reputation SaverString EnumSuper Simple Tetris Game


    (2010-2013)
    NB: I do not answer coding questions via PM. If you want my help, then make a post and PM me it's link. If I can help, trust me I will...

  6. #6
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,687

    Re: LEFT JOIN versus INNER JOIN

    In general, INNER JOINS are faster, as they can eliminate results as soon as a mismatch is found based on the join. At least that's been my experience.

    To answer the original question though, if you want to ALWAYS get the same results, then I'd go with the INNER JOIN .... BUT if you think you might want those mismatched records when they do occur, then go with the LEFT... the query should be based on the desired results... and users being an interesting lot, will find a way to hoark up the data and produce a mismatched result when you may (left join) or may not (inner join) want it.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  7. #7
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: LEFT JOIN versus INNER JOIN

    Syntax of query influences tuning but in the end it is the query execution plan that determines performance.

    Your sample/test queries are too simple and return the same result. Think of it this way, in a 5-table join if you intend to get a 2&#37; subset then you are better off eliminating as much of the data at the start of query execution plan rather than initially doing LEFT joins with the INNER join and WHERE criteria being applied to the end. You will only waste effort/resources (CPU cycles, temp storage, memory, etc) on comparisons/data that will be thrown away or not included in the final result so do the INNER join as early as possible.

    Same goes if it was a manual paper process.... if you were asked to retrieve related documents from 5 piles then you will eliminate as much of them first (best if they are sorted or indexed) or extract only relevant documents from the piles. And finally you staple together the remaining documents that are relevant to your task; least amount of effort required on your part. You will avoid checking each and every document as much as possible.

    But it also depends on how data is to be returned, or the columns to be returned.... sometimes you have no choice but to do a LEFT join to show that data from right-hand side of relationship does not exist (columns from right hand side needed) or for aggregation purposes (more common in data warehouse).
    Last edited by leinad31; Dec 28th, 2009 at 07:28 PM.

  8. #8
    PowerPoster abhijit's Avatar
    Join Date
    Jun 1999
    Location
    Chit Chat Forum.
    Posts
    3,228

    Re: LEFT JOIN versus INNER JOIN

    Pradeep,
    I think your exercise should have checked the execution plan instead of the timing of the query results.

    The execution plan will give you a more accurate picture of which operation will result in maximum cost.
    Everything that has a computer in will fail. Everything in your life, from a watch to a car to, you know, a radio, to an iPhone, it will fail if it has a computer in it. They should kill the people who made those things.- 'Woz'
    save a blobFileStreamDataTable To Text Filemy blog

  9. #9

    Thread Starter
    VB Addict Pradeep1210's Avatar
    Join Date
    Apr 2004
    Location
    Inside the CPU...
    Posts
    6,614

    Re: LEFT JOIN versus INNER JOIN

    @tg, leinad
    Yes, inner joins are definitely faster. But there comes many situations in our programs where the backend database supports null fields but is not allowed from the front end application. So in such cases, a minor bug may also cause a null value to be inserted there causing inner join to eleminate the rows in the output result where as the outer join seems like a safe bet. But I'm looking at performance implications in such cases.

    @abhijit
    I had compared the execution plans but forgot to mention there. The execution plan for both cases is exactly same. They start varying only when the SELECT clause is altered. So it seems like sql server does some smart manipulations and finds out that both would produce exactly same results. This is also evident from the result of experiment. They match neck to neck.
    Pradeep, Microsoft MVP (Visual Basic)
    Please appreciate posts that have helped you by clicking icon on the left of the post.
    "A problem well stated is a problem half solved." — Charles F. Kettering

    Read articles on My Blog101 LINQ SamplesJSON ValidatorXML Schema Validator"How Do I" videos on MSDNVB.NET and C# ComparisonGood Coding PracticesVBForums Reputation SaverString EnumSuper Simple Tetris Game


    (2010-2013)
    NB: I do not answer coding questions via PM. If you want my help, then make a post and PM me it's link. If I can help, trust me I will...

  10. #10
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,687

    Re: LEFT JOIN versus INNER JOIN

    well, if the inner join eliminates data that you require, then the inner join isn't an option. If that's the case, then you need the outer join. Which is what I was trying to point out (and I'll admit that I'm not entirely sure that's how it came out.)

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  11. #11
    PowerPoster
    Join Date
    Nov 2002
    Location
    Manila
    Posts
    7,629

    Re: LEFT JOIN versus INNER JOIN

    Quote Originally Posted by Pradeep1210 View Post
    @tg, leinad
    Yes, inner joins are definitely faster. But there comes many situations in our programs where the backend database supports null fields but is not allowed from the front end application. So in such cases, a minor bug may also cause a null value to be inserted there causing inner join to eleminate the rows in the output result where as the outer join seems like a safe bet. But I'm looking at performance implications in such cases.

    @abhijit
    I had compared the execution plans but forgot to mention there. The execution plan for both cases is exactly same. They start varying only when the SELECT clause is altered. So it seems like sql server does some smart manipulations and finds out that both would produce exactly same results. This is also evident from the result of experiment. They match neck to neck.
    Yes, that is often the case/problem with legacy databases... ideal solution would be to correct the data architecture but no one is willing to risk their career and touch the database due to undocumented dependencies It seems you have no choice but to address the shortcomings of the database design via SQL and accept the performance penalties whatever they may be.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width