Results 1 to 11 of 11

Thread: [RESOLVED] Join versus Where Exists

  1. #1

    Thread Starter
    Fanatic Member InvisibleDuncan's Avatar
    Join Date
    May 2001
    Location
    Eating jam.
    Posts
    819

    Resolved [RESOLVED] Join versus Where Exists

    I want to get data out of a single table, restricted to where an ID field exists on another table. The ID may exist more than once on the second table, so doing a join would cause me to get duplicate values from the first table. However, it's also possible that there would be records on the first table where the data I retrieve is identical, and I would want each of these to be retrieved - so I can't use Distinct.

    I can do it easily enough with this:
    Code:
    Select      A.Field1
              , A.Field2
              , A.Etc
    From        tblAgent                A
    Where Exists (Select 1 from tblTransaction T where T.FirmID = A.ParentFirmID)
    ... but I worry that this would be very inefficient.

    Is there a neater way to do it? A particular type of join, perhaps?

    Thanks...
    Indecisiveness is the key to flexibility.

    www.mangojacks.com

  2. #2
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: Join versus Where Exists

    Exists is perfect for this need.

    You can SELECT * in this case as the WHERE part of an EXISTS is all that is ever evaluated. Don't go thinking that SELECT 1 is faster!

    If this was a stored procedure and if SPEED was a factor, you could build a "distinct list of PARENTFIRMID's" into a TEMP TABLE and then SELECT and JOIN from that.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  3. #3
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,687

    Re: Join versus Where Exists

    What DBMS are you using?

    I'd probably start by getting the DISTINCT IDs from T first... then either use an inner join to the sub-query or use in an IN clause ... IF SQL Server 2008 or newer, I'd poss the DISTINCT ID into a CTE (Common Table Expression) then join to it. Although if it's this simple, the CTE is probably overkill.

    Code:
    select
      A.Field1, A.Field2, A.Etc
    from tblAgent A
    inner join (select distinct FirmID from tblTransaction T)
      on A.ParentFirmID = T.FirmID
    Code:
    select
      A.Field1, A.Field2, A.Etc
    from tblAgent A
    where A.FirmID in (select distinct FirmID from tblTransaction T)
    Code:
    ; with T as (
        select distinct FirmID from tblTransaction
    )
    select
      A.Field1, A.Field2, A.Etc
    from tblAgent A
    inner join T
      on A.ParentFirmID = T.FirmID
    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  4. #4

    Thread Starter
    Fanatic Member InvisibleDuncan's Avatar
    Join Date
    May 2001
    Location
    Eating jam.
    Posts
    819

    Re: Join versus Where Exists

    szlamany - Our coding standards say to never use SELECT *, so I tend to stick with that even if it doesn't matter.

    It's SQL 2008 R2, and it's to be used for a view that will only be used as part of an SSIS package. Speed may be a factor, so I think I might go for the CTE. I just wasn't sure whether that (or a temp table) would necessarily be faster, but presumably the IF EXISTS would run for every record...?
    Indecisiveness is the key to flexibility.

    www.mangojacks.com

  5. #5
    Super Moderator FunkyDexter's Avatar
    Join Date
    Apr 2005
    Location
    An obscure body in the SK system. The inhabitants call it Earth
    Posts
    7,957

    Re: Join versus Where Exists

    Our coding standards say to never use SELECT *
    This is probably one of the rare cases you should break the coding standards. It's a good standard if the * is actually going to be evaulated but in the case it isn't so the standard is non-sensical. I don't think it will actually make any difference because I don't believe your 1 will get evaluated either but I'm not 100% sure on that. At best your select 1 will perform the same as Select * but at worst it will be worse.
    The best argument against democracy is a five minute conversation with the average voter - Winston Churchill

    Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd

  6. #6
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: Join versus Where Exists

    The fact it might run for every record scared me - thus the suggestion of a TEMP TABLE to join from instead.

    You can always check out the actual execution plan - see how that addresses the EXISTS.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  7. #7
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,687

    Re: Join versus Where Exists

    My personal experience has taught me:
    CTEs tend to be faster than sub queries
    Sometimes a temp table or a table variable can be faster than a CTE... I havent' found what the tipping point is, but it probably has to do with the number of CTEs (I had a bout a dozen of them) and their complexity, and the fact that some of the CTEs fed into other CTEs which fed into... you get the idea.
    Above all, starting off with minimal data then joining to other tables, is the most effective performance enhancer I've seen with SQL. The natural instinct is to start with everything, then whittle it down... but if you can invert that logic, you can squeeze out some interesting time savings.

    Code:
    ; with T as (
        select distinct FirmID from tblTransaction
    )
    select
      A.Field1, A.Field2, A.Etc
    from T
    inner join tblAgent A
      on A.ParentFirmID = T.FirmID
    Since T has the least amount of rows/data in it, when you then join to tblAgent, assuming ParentFirmID is indexed, you can get some huge performance gains since it will only then include the rows that actually match. if FirmID in tblTransaction is indexed, then your CTE should result in just an index scan, speeding things up a tad more, and if it's clustered... it doesn't get much more better than that.... AH... actually there is one more tweak, I'm a little surprised szlamany didn't mention it - change the DISCTING to a GROUP BY:
    Code:
    ; with T as (
        select FirmID from tblTransaction gorup by FirmID
    )
    select
      A.Field1, A.Field2, A.Etc
    from T
    inner join tblAgent A
      on A.ParentFirmID = T.FirmID
    The net effect is the same, how it arrives at it will be different. I think at that point you will have squeezed as much performance out of that as you can.


    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  8. #8

    Thread Starter
    Fanatic Member InvisibleDuncan's Avatar
    Join Date
    May 2001
    Location
    Eating jam.
    Posts
    819

    Re: Join versus Where Exists

    Guys - thank you. The execution plan was fairly non-committal, probably because there's no actual data in the Agent table anymore (someone helpfully dropped and recreated it), but it did look as if the EXISTS would have a detrimental effect. Alternating between SELECT * and SELECT 1 didn't seem to make any difference at all.

    I've elected to go with the CTE, and swap the DISTINCT for a GROUP BY as suggested by TG.

    Thanks for your help, everyone - much appreciated.
    Indecisiveness is the key to flexibility.

    www.mangojacks.com

  9. #9

    Thread Starter
    Fanatic Member InvisibleDuncan's Avatar
    Join Date
    May 2001
    Location
    Eating jam.
    Posts
    819

    Re: Join versus Where Exists

    Hmmm - I'm not sure whether I've successfully rated your posts. I just get the message that "you must spread some reputation around before rating this post" or words to that effect. I got the same message for each of you.
    Indecisiveness is the key to flexibility.

    www.mangojacks.com

  10. #10
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: Join versus Where Exists

    Quote Originally Posted by techgnome View Post
    actually there is one more tweak, I'm a little surprised szlamany didn't mention it - change the DISCTING to a GROUP BY:
    I do usually beat up DISTINCT - have for years...

    Current info on the interweb seems to indicate that DISTINCT and GROUP BY now reduces down to the same execution plan. In the old days (SQL 2000-ish) GROUP BY was done during RESULT SET building and DISTINCT was done after RESULT SET building - thus my dislike.

    I've also watched junior programmers of mine use DISTINCT to get around phantom rows created by JOIN's to one-to-many links. That's just poor coding in general!

    btw - spread rep around means you need to give to others - we are blocked due to the fact that you've given to us without doing others as well. Keeps people from repeat rep'ing each other I guess...

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  11. #11

    Thread Starter
    Fanatic Member InvisibleDuncan's Avatar
    Join Date
    May 2001
    Location
    Eating jam.
    Posts
    819

    Re: Join versus Where Exists

    Quote Originally Posted by szlamany View Post
    btw - spread rep around means you need to give to others - we are blocked due to the fact that you've given to us without doing others as well. Keeps people from repeat rep'ing each other I guess...
    That's a little mean if it's always the same bunch of people who help you out.
    Indecisiveness is the key to flexibility.

    www.mangojacks.com

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width