dcsimg
Results 1 to 25 of 25

Thread: [RESOLVED] Year Slowdown

  1. #1

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    32,622

    Resolved [RESOLVED] Year Slowdown

    I have an issue that has me a bit puzzled. I have a very complicated query of a pretty large database. There are some 700,000 records being returned by the query, which may or may not be right. However, this was all driven by a desire to see just one year of the whole query. There is a single date field that is relevant, so I tried this:
    Code:
    WHERE YEAR(theDateField) = 2018
    This timed out. I thought that was pretty odd, since the whole query without the WHERE clause does not time out. However, 2018 is a convenient year, so I changed the WHERE clause to:
    Code:
    WHERE theDateField > '1/1/2018'
    This also timed out. Interestingly, somebody put an index on theDateField at some time in the past, so that doesn't appear to be relevant.

    I then tried testing whether a simpler WHERE clause would cause trouble, so I tried this:
    Code:
    WHERE someOtherIntegerField > 1000
    That was snappy.

    So, the question is: Why is restricting on a date field so horribly slow, and what can I do about it?

    I can think of one solution, but it's quite the hack. I could build up a table that is nothing but the Year and the PK, then join that table into the query such that I wouldn't be seeing the date at all, and would only be seeing the year. I'm pretty sure that would be quick, but kind of horrible.
    My usual boring signature: Nothing

  2. #2
    PowerPoster TysonLPrice's Avatar
    Join Date
    Sep 2002
    Location
    Columbus, Ohio
    Posts
    3,162

    Re: Year Slowdown

    Just guessing but the first example requires a function to run. The second requires translating a string to a date. Try casting '1/1/2018' to a date variable first and then select...just curious.
    Please remember next time...elections matter!

  3. #3
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    1,500

    Re: Year Slowdown

    Another approach:
    Since a Date/Time-Stamp is basically a Double (or Integer), have you tried something like

    WHERE theDateField > 1514764800 (UNIX_TIMESTAMP for '1/1/2018')

    or any other way casting theDateField to/from Double/UNIX_TIMESTAMP etc.
    One System to rule them all, One IDE to find them,
    One Code to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    For health reasons i try to avoid reading unformatted Code

  4. #4
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    31,895

    Re: Year Slowdown

    SQL Server, right? Have you looked at the Execution Plan to see what's going on?

    Is it a Date field, or a DateTime field? It shouldn't matter, but I thought I'd ask.


    Also, while not related to the problem, and because I'm feeling a bit pedantic this morning, you do know that this:
    Code:
    WHERE theDateField > '1/1/2018'
    Will actually exclude a day's worth of data, right? Not that it's related to the issue, but the pedantic desire, like the coffee, is strong today for some reason.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  5. #5
    Frenzied Member ChrisE's Avatar
    Join Date
    Jun 2017
    Location
    Frankfurt
    Posts
    1,368

    Re: Year Slowdown

    Hi,

    why not find the first PK in the Year 2018
    and use the PK for the Query ...WHERE PK >201111

    regards
    Chris
    to hunt a species to extinction is not logical !
    since 2010 the number of Tigers are rising again in 2016 - 3900 were counted. with Baby Callas it's 3901, my wife and I had 2-3 months the privilege of raising a Baby Tiger.

  6. #6

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    32,622

    Re: Year Slowdown

    Yeah, that would exclude a day, but it doesn't matter. The data in question is only collected between roughly March and October, so I can be pretty lazy about the date. There may be data from outside that range, but very little, and certainly none in January.

    I also need to remember to say what database I'm working with. It is SQL Server, I just never remember to mention that. I don't remember whether it's Date or DateTime, but I expect that it is probably the latter. I could look it up, just don't feel like it at the moment.

    @Tyson: I don't really know what SQL Server does with a string literal. Hopefully, the string literal is converted just one time to a date for the purpose of the WHERE clause, in which case it should make no difference whether I converted the string to a date, or let SQL Server do it. I don't know whether or not that is true, I just kind of assumed that it was.

    @zvoni: That's worth a try. I wasn't sure how SQL Server held a date variable internally, or whether or not that wouldn't just create the same cast as the string literal (or at least a pretty similar one).
    My usual boring signature: Nothing

  7. #7
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    31,895

    Re: Year Slowdown

    Yeah, SQL Server will do implicit conversions from string to dateTime w/o issue - as long as it's in a proper format. There isn't a date delimiter like there is in Access with the #, so using a string is pretty much the only way to do it. It's also fairly quick. So that shouldn't be an issue. Also SQL Server does store datTime as a decimal internally... just like most DBMSs.... BUT if I remember right, it uses a different EPOCH than Unix or other similarly based systems... so caution.
    Still curious what the execution plan looks like though.

    It's also possible that it's not even the where clause that's the problem, but something else, it's just that you happened to uncover it with the change in the where.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  8. #8
    Frenzied Member
    Join Date
    Jun 2014
    Posts
    1,072

    Re: Year Slowdown

    Code:
    WHERE theDateField > '1/1/2018'
    Will actually exclude a day's worth of data
    maybe yes, maybe no
    2018-01-01 00:00:01 is later than 2018-01-01
    do not put off till tomorrow what you can put off forever

  9. #9

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    32,622

    Re: Year Slowdown

    I'll look at the execution plan tomorrow, or the next day (why do so many things all hit the table at the same time?). The reason I so strongly suspect the WHERE clause is because the query runs so fast without it, and it runs even faster if the where clause is on some integer field (as it should, since it would be getting fewer records).

    I'm also going to try something that I'd otherwise think is stupid: Create a view that has two fields: The Year and the PK on the main table. I'll then link that view into the view I'm working on. Normally, that should do no good, though possibly no harm. If that were to actually improve the performance, that would be interesting and suggestive. Horrid, too, but interesting.
    My usual boring signature: Nothing

  10. #10
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    1,500
    One System to rule them all, One IDE to find them,
    One Code to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    For health reasons i try to avoid reading unformatted Code

  11. #11
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    CT
    Posts
    17,770

    Re: Year Slowdown

    That index on the date column might be the issue. Indexes on values that are NOT OFTEN different are sometime useless.

    Use a HINT to skip the index - of course while viewing the ACTUAL execution plan.

    https://docs.microsoft.com/en-us/sql...ql-server-2017

    WITH INDEX(0)

    WITH FORCESCAN

    WITH FORCESEEK

    These should allow you see the query run different ways.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  12. #12
    Fanatic Member wqweto's Avatar
    Join Date
    May 2011
    Posts
    808

    Re: Year Slowdown

    Quote Originally Posted by Shaggy Hiker View Post
    So, the question is: Why is restricting on a date field so horribly slow, and what can I do about it?
    When you are selecting more that 2-5% of the rows in the base table db engine will never user a non-clustered index to do the data access as this is going to be more inefficient that scanning the clustered index completely and discarding the rows that don't match the predicate. Non-clustered indexes are rarely covering so the engine has to do something called "key lookup" to get any data from columns not part of the index and this means for each row satisfying search condition to perform a seek in the clustered key to locate the actual base table row.

    There is nothing horribly slow with any data-type (except maybe GUIDs). Your second query is fast as it's not returning 700k rows, has a NC index covering the search condition with high enough selectivity for the db engine to be able to figure out based on statistics that less than X% of the total rows are going to be fetched so the number of key lookups in the clustred index are not going to be slower that a simple complete clustered index scan.

    Using WHERE YEAR(theDateField) = 2018 is truly horrible -- just google for SARGable quieries/expression. Using DATEPART or any other (built-in) function makes the expression opaque for index access optimizations. Replacing YEAR(theDateField) = 2018 with theDateField BETWEEN '20180101' AND '20181231' must be part of Programming SQL Server 101.

    cheers,
    </wqw>

  13. #13
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    CT
    Posts
    17,770

    Re: Year Slowdown

    Quote Originally Posted by wqweto View Post
    ...Replacing YEAR(theDateField) = 2018 with theDateField BETWEEN '20180101' AND '20181231' must be part of Programming SQL Server 101.
    BETWEEN is good stuff - I use it constantly.

    Also - good stuff in the info you posted.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  14. #14
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    31,895

    Re: Year Slowdown

    That's why I was interested in the execution plan... it would have revealed that. I know I've done some similar types of queries with dates and run into similar performance issues.... unfortunately I'm not at that job any more so I didn't have access to the code base where I could have looked up those cases and seen what we did, but I think we probably used BETWEEN as well, since 90% of the time we're running reports or data selections on time slices.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  15. #15
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    1,500

    Re: Year Slowdown

    Not to sound stupid:
    But isn't BETWEEN a shortcut for MyField>=aLowValue AND MyField<=aHighValue?
    Shaggy already did the WHERE theDateField>'1/1/2018' (Notice the String-Literal)
    One System to rule them all, One IDE to find them,
    One Code to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    For health reasons i try to avoid reading unformatted Code

  16. #16
    Super Moderator FunkyDexter's Avatar
    Join Date
    Apr 2005
    Location
    An obscure body in the SK system. The inhabitants call it Earth
    Posts
    6,996

    Re: Year Slowdown

    My guess is that Tyson had it right in post 2:-

    Year(ADate) is non-sargable because it includes a function
    ADate>'01/01/2018' probably doesn't convert 01/01/2018 to a date, it converts ADate to a string so, again, is non-sargable. (It's also risky in the date format it will use - doesn't really matter in this case but implicit date conversions are a bad habit worth getting out of)

    You would probably get better results with:-
    Code:
    Where ADate > Convert(Date, '01/01/2018', 101) --or 103 if you want dd/MM/yyyy
    as this will always be comparing a date to a date.

    Another possibility is that another index is already being favoured due to the rest of your where clause, meaning that the Date element is being filtered by a scan within the results of the rest of the Where.

    E.g. If you have:-
    Code:
    Where Status = 1
    And Date > '01/01/2018'
    and there is already an index on Status, a second index on Date might not be used. Instead it will use the Status Index to do an initial then scan those results to filter by the date. In theory SQLServer should use the more selective index but it's not guaranteed if e.g. the stats are barfed or this is in a sproc with a pre-cached execution plan.

    To resolve this, ensure the stats are up to date and if it's a sproc put add a With Recompile and run it to flush the cache (you can then remove it again). If you're not concerned about space you could then also add an index that contains both fields with the most selective (probably Date) taking priority.

    Other than that it's a case of checking the execution plan and seeing if there are indexes you'd expect to be used but aren't.
    You can depend upon the Americans to do the right thing. But only after they have exhausted every other possibility - Winston Churchill

    Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd

  17. #17

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    32,622

    Re: Year Slowdown

    Well, I'm back at it. Changing the WHERE to this:
    Code:
    WHERE theDateField > CONVERT(Date, '01/01/2018', 101)
    did nothing noticeable. I then removed the index on theDateField, cause I always was suspicious about that. That had no impact, either. Meanwhile, I took a vacation last weekend (plus a couple days on either side) during which every single program I have written in the last 10 years all had problems at the same time. Now THAT'S awesome!! A few of them have been tracked down to services I was calling, but didn't write. Another wasn't a bug, just somebody forgot a step, and the rest are not yet classified. There's one program left (written 11 years ago), and my last decade will have crashed in one vacation. How cool is that?

    On to the next thing.
    My usual boring signature: Nothing

  18. #18

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    32,622

    Re: Year Slowdown

    Doggone it. I didn't check things well enough.

    I killed off the index on the date field in question. I didn't think I really cared what it was, because the name suggested that it was just an index on the date field. If I had created the index, that's all it would have been. Performance didn't improve when I removed the index, which didn't surprise me all that much. I then figured I'd put the index back and start looking into the execution plan. Putting the index back fixed the issue. The query was blazingly fast with that index put back in. Now I'm wondering what that index really was.
    My usual boring signature: Nothing

  19. #19
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    31,895

    Re: Year Slowdown

    It's possible the index was outdated and simply needed to be updated. Dropping it and re-creating it re-built the statistics for it, resulting in the improved performance.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  20. #20

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    32,622

    Re: Year Slowdown

    That must be the case. I went to a backup to see what the old index actually was. Turns out, it was identical to the new one, so it must have just been pretty well out of date.
    My usual boring signature: Nothing

  21. #21
    Fanatic Member wqweto's Avatar
    Join Date
    May 2011
    Posts
    808

    Re: [RESOLVED] Year Slowdown

    On MSSQL it's very rare to actually see a disabled index. Index/column *stats* though become stale on a regular basis and the built-in stats auto-update (and auto-create) has mixed results coping with the problem.

    cheers,
    </wqw>

  22. #22
    Frenzied Member
    Join Date
    Jun 2014
    Posts
    1,072

    Re: [RESOLVED] Year Slowdown

    Quote Originally Posted by wqweto View Post
    theDateField BETWEEN '20180101' AND '20181231'
    may (or not) exclude a day's worth of data
    do not put off till tomorrow what you can put off forever

  23. #23
    Fanatic Member wqweto's Avatar
    Join Date
    May 2011
    Posts
    808

    Re: [RESOLVED] Year Slowdown

    Quote Originally Posted by IkkeEnGij View Post
    may (or not) exclude a day's worth of data
    Fair warning, there might be time portion to 20181231 that gets omitted if using DATETIME data-type instead of the newer DATE data-type that reduces date-only storage from 8 to 3 bytes (huge gain if dates are part of your tables' clustered indexes).

    Proven again, unambiguous DDL is valuable even in most trivial db discussions :-))

    cheers,
    </wqw>

  24. #24
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    CT
    Posts
    17,770

    Re: [RESOLVED] Year Slowdown

    Quote Originally Posted by wqweto View Post
    newer DATE data-type that reduces date-only storage from 8 to 3 bytes (huge gain if dates are part of your tables' clustered indexes).
    If MS SQL still uses 8000 byte blocks for things like index storage, then 3 vs. 8 bytes gets you 2600 vs. 1000 keys in a block. That might seem like a big deal - it's not. The way index LEAVES work you build layers that get you to the final attack point you want in your index - you will only ever reach one LEAF layer. And at that point you already know your key either will or will not exist in this block.

    SIZE of KEY only nominally affects search time. The number of hits on a B-TREE is a constant based on the total number of keys and the number of LAYERS of nodes that are in place.

    I say nominally just due to the fact that you might have to visit more NON-LEAF layer blocks, but that really is because that particular index needs another NON-LEAF layer due to index size - this should be caught by SQL during the life of the index.

    It's been since 1980 that I've been concerned about key sizes in indexes.
    Last edited by szlamany; Oct 27th, 2018 at 07:38 AM.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  25. #25
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    CT
    Posts
    17,770

    Re: [RESOLVED] Year Slowdown

    Quote Originally Posted by IkkeEnGij View Post
    may (or not) exclude a day's worth of data
    It is the job of the coder to make sure that they NEVER, EVER, EVER put a time into a DATETIME field that is used for DATE only purposes.

    That is a requirement of MS SQL for decades now.

    If you have a START DATE or END DATE field - or a BIRTH DATE field - and you accidentally allow your UI to somehow put a time into that field - that is bad on you only.

    And if you have a field that stores time then you need to have in your tool-belt the knowledge of how to write a WHERE clause to get dates BETWEEN "x" and "y".

    Quote Originally Posted by wqweto View Post
    Fair warning, there might be time portion to 20181231 that gets omitted if using DATETIME data-type instead of the newer DATE data-type
    IMO, it is more important to limit the number of data types I utilize in my clients applications. That's probably because of the DBA hat that I wear as advocate for my clients production data. Adding a new data type - like the DATE only one that newer versions of MS SQL support, would affect a whole lot areas in STORED PROCEDURES and how backend code moves parameters from the UI world all the way through binding with the actual SPROC. I've got backend code for reporting that is 20 years old, backend code for creating EXCEL files. Clients still running old versions of MS SQL. Clients still running old VB6 executables. Lots to consider.
    Last edited by szlamany; Oct 27th, 2018 at 07:42 AM.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Featured


Click Here to Expand Forum to Full Width