Results 1 to 14 of 14

Thread: MSSQL update table with million of rows.

  1. #1

    Thread Starter
    King of sapila
    Join Date
    Oct 2006
    Location
    Greece
    Posts
    6,763

    MSSQL update table with million of rows.

    Hi.
    MSSQL.
    I need to update a table with million of rows.
    I was thinking to do a pass on rowbatch at a time, so i found this script on stackoverf.

    Code:
    DECLARE @Rows INT,
            @BatchSize INT; -- keep below 5000 to be safe
        
    SET @BatchSize = 2000;
    
    SET @Rows = @BatchSize; -- initialize just to enter the loop
    
    BEGIN TRY    
      WHILE (@Rows = @BatchSize)
      BEGIN
          UPDATE TOP (@BatchSize) tab
          SET    tab.Value = 'abc1'
          FROM  TableName tab
          WHERE tab.Parameter1 = 'abc'
          AND   tab.Parameter2 = 123
          AND   tab.Value <> 'abc1' COLLATE Latin1_General_100_BIN2;
          -- Use a binary Collation (ending in _BIN2, not _BIN) to make sure
          -- that you don't skip differences that compare the same due to
          -- insensitivity of case, accent, etc, or linguistic equivalence.
    
          SET @Rows = @@ROWCOUNT;
      END;
    END TRY
    BEGIN CATCH
      RAISERROR(stuff);
      RETURN;
    END CATCH;
    Probably skip collate and the update is only one column.
    I will search on maybe disabling the index if any , update and enable but I wanted to ask if this is a viable solution or if there is anything better?

    Edit. Forgot to write that the updated rows would be at most a couple of hundred so I'm not sure if a batch makes a difference or not. Does the batch get ready the rows that will be searched for the update per pass of it just batch the number of rows to be updated to fit the BatchSize and the do the next rows found according to the BatchSize ?


    Thanks.
    Last edited by sapator; Feb 15th, 2024 at 10:27 AM.
    ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
    πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·

  2. #2
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    5,268

    Re: MSSQL update table with million of rows.

    Huh?
    if your table has million of rows, but the update should only affect a few hundreds, why don't you prefilter the basedata?

    use a cte to prefilter, then update tab with an inner join to the cte
    Last edited by Zvoni; Feb 16th, 2024 at 02:19 AM.
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  3. #3

    Thread Starter
    King of sapila
    Join Date
    Oct 2006
    Location
    Greece
    Posts
    6,763

    Re: MSSQL update table with million of rows.

    Ye it's little blur for me.
    Should I add an index to "pre filter" should that impact performance?
    Also what do you mean use a CTE? Do a select on the CTE and then and update on it?
    The above example is viable? Is it more on doing update to actually million of rows rather than a distinct set of found rows?
    ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
    πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·

  4. #4
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,687

    Re: MSSQL update table with million of rows.

    Yeah, it probably works ... I've never used a top in an update like that ... In the past when I've wanted to batch things, I'd put the IDs into a temp table, then use something like "where id in (select top 10000 id from @tempTbl)" and then remove ids from the temp table, or somehow mark them as processed...

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  5. #5
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    6,192

    Re: MSSQL update table with million of rows.

    Update milions or update hundreds? Anyway, the volume does not seem that mich until you start speaking of billions.

    Try running this from a single UPDATE statement and if it does not finish in a couple of seconds give it couple of minutes - you can always cancel it midflight and the RDBMS will rollback the modifications.

  6. #6
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    5,268

    Re: MSSQL update table with million of rows.

    Quote Originally Posted by sapator View Post
    Ye it's little blur for me.
    Should I add an index to "pre filter" should that impact performance?
    Also what do you mean use a CTE? Do a select on the CTE and then and update on it?
    The above example is viable? Is it more on doing update to actually million of rows rather than a distinct set of found rows?
    Something like this (MS-SQL-Syntax not withstanding. This is for SQLite)
    Aircode
    Code:
    WITH
       CTE AS (SELECT ID FROM tab 
                   WHERE tab.Param1='abc' AND tab.Param2=123 AND 
                   tab.Value<>'abc1' /*COLLATE Latin1_General_100_BIN2*/)
    UPDATE tab 
    SET tab.Value='abc1' 
    FROM tab
    INNER JOIN CTE ON CTE.ID=tab.ID
    or tg's way with IN-Operator (i admit, i don't like the IN-Operator used this way)
    Code:
    UPDATE tab
    SET tab.Value='abc1' 
    WHERE tab.ID IN 
      (SELECT ID FROM tab
       WHERE tab.Param1='abc' AND tab.Param2=123 AND 
       tab.Value<>'abc1' /*COLLATE Latin1_General_100_BIN2*/)
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  7. #7

    Thread Starter
    King of sapila
    Join Date
    Oct 2006
    Location
    Greece
    Posts
    6,763

    Re: MSSQL update table with million of rows.

    Thanks.
    Well i still got some thoughts about the initial query, if it actually doing update to actually million of rows rather than a distinct set of found rows, meaning does it worth running at a couple of hundred of update into a million or rows or only when the rows exceed hundreds?
    Also, philosophically speaking, I'm not sure if cancelling an update will rollback the rows on MSSQL, maybe on a being tran? Or maybe it will make it worse as the tran will stay open? But cancelling when running I'm not sure at all that rollbacks.
    ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
    πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·

  8. #8
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    6,192

    Re: MSSQL update table with million of rows.

    Quote Originally Posted by sapator View Post
    Also, philosophically speaking, I'm not sure if cancelling an update will rollback the rows on MSSQL, maybe on a being tran? Or maybe it will make it worse as the tran will stay open? But cancelling when running I'm not sure at all that rollbacks.
    Practically every DML statement in MSSQL is wrapped into its own (implicit) transaction. Conceptually think of an UPDATE/INSERT/DELETE like being a batch of

    BEGIN TRAN

    UPDATE ...

    COMMIT

    For the last commit statement there is even an option SET IMPLICIT TRANSACTIONS which controls if it's executed or skipped. When skipped every DML opens a transaction which stay open until you send explicit COMMIT (or ROLLBACK) which is the default mode of operation in Oracle.

    Anyway, conceptually the mini-transaction on each UPDATE/INSERT/DELETE means that canceling the DML (e.g. pressing stop button in SSMS) rolls all the modifications back. It's not possible to have half the table updated and the rest of the modification getting canceled by mistake.

    Another consequence is that if you have triggers on the base table these are executed within the scope of this mini-transaction and can effectively rollback the modification which happened before them being executed.

    Another consequence is that if you use manual transaction management (BEGIN TRAN/COMMIT) within base table triggers the @@TRANCOUNT will reach 2 as there is a separate mini-transaction happening.

    Another consequence is that if in ADO/ADO.NET you execute BeginTran method on the connection it actually sends SET IMPLICIT TRANSACTIONS ON statement and the transaction opening is delayed until a DML is executed i.e. if after BeginTran you execute only SELECTs no transaction will be opened and as a result your solution will be more scalable at DB level :-))

    cheers,
    </wqw>

  9. #9

    Thread Starter
    King of sapila
    Join Date
    Oct 2006
    Location
    Greece
    Posts
    6,763

    Re: MSSQL update table with million of rows.

    By what you say and read it is true. By from what I saw, maybe not 100% .
    We had a long query running and cancelled and found out that some rows changed. But I'm thinking it might either did not get the command as it was run for a remote server or cancel did actually fail. Also maybe, maybe we closed the GUI as it was stuck but that is a maybe.
    So anyhow, yes you are correct but I have seen at least one fail and that has stuck to my mind since I never rely on cancel on update and insert. There might first be a long scream "OH SHT!" even if I hit cancel. So I try to tran everything and on inserts and updates check 1-2-3-4-5-6-7-8-9 and 10 times before I go on.
    ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
    πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·

  10. #10

    Thread Starter
    King of sapila
    Join Date
    Oct 2006
    Location
    Greece
    Posts
    6,763

    Re: MSSQL update table with million of rows.

    Also now that I think of it.... I also saw that on a delete. It did not rollback.
    ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
    πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·

  11. #11
    PowerPoster wqweto's Avatar
    Join Date
    May 2011
    Location
    Sofia, Bulgaria
    Posts
    6,192

    Re: MSSQL update table with million of rows.

    Quote Originally Posted by sapator View Post
    Also now that I think of it.... I also saw that on a delete. It did not rollback.
    It always does unless the statement is completed before Stop button is clicked/handled.

    What I often find useful is to wrap DML in

    BEGIN TRAN

    UPDATE ...

    SELECT ... -- to check modifications

    ROLLBACK
    --COMMIT

    When I'm satisfied with result I just comment the ROLLBACK out and uncomment the COMMIT. This way even if SSMS crashes or connection is lost without explicit COMMIT the transaction will sooner or later get ROLLED BACK.

    Sooner is when the connection is dropped (SSMS crashes), later is when the server is restarted (MSSQL service crashed) exactly when db's transaction log is recovered.

    cheers,
    </wqw>

  12. #12

    Thread Starter
    King of sapila
    Join Date
    Oct 2006
    Location
    Greece
    Posts
    6,763

    Re: MSSQL update table with million of rows.

    After some incidents in the past I always use Tran.
    I even go further putting begin tran and update on the same line as if I accidentally select a specific portion of code and don't select the begin tran it will update the transaction without rollback,
    ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
    πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·

  13. #13

    Re: MSSQL update table with million of rows.

    Have you ever encountered issues with SQL Server Management Studio (SSMS) not rolling back transactions properly, especially during updates or deletes?

  14. #14
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    40,109

    Re: MSSQL update table with million of rows.

    Is that question related to this thread, or is it a general question about SQL Server?
    My usual boring signature: Nothing

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width