|
-
Feb 15th, 2024, 09:39 AM
#1
MSSQL update table with million of rows.
Hi.
MSSQL.
I need to update a table with million of rows.
I was thinking to do a pass on rowbatch at a time, so i found this script on stackoverf.
Code:
DECLARE @Rows INT,
@BatchSize INT; -- keep below 5000 to be safe
SET @BatchSize = 2000;
SET @Rows = @BatchSize; -- initialize just to enter the loop
BEGIN TRY
WHILE (@Rows = @BatchSize)
BEGIN
UPDATE TOP (@BatchSize) tab
SET tab.Value = 'abc1'
FROM TableName tab
WHERE tab.Parameter1 = 'abc'
AND tab.Parameter2 = 123
AND tab.Value <> 'abc1' COLLATE Latin1_General_100_BIN2;
-- Use a binary Collation (ending in _BIN2, not _BIN) to make sure
-- that you don't skip differences that compare the same due to
-- insensitivity of case, accent, etc, or linguistic equivalence.
SET @Rows = @@ROWCOUNT;
END;
END TRY
BEGIN CATCH
RAISERROR(stuff);
RETURN;
END CATCH;
Probably skip collate and the update is only one column.
I will search on maybe disabling the index if any , update and enable but I wanted to ask if this is a viable solution or if there is anything better?
Edit. Forgot to write that the updated rows would be at most a couple of hundred so I'm not sure if a batch makes a difference or not. Does the batch get ready the rows that will be searched for the update per pass of it just batch the number of rows to be updated to fit the BatchSize and the do the next rows found according to the BatchSize ?
Thanks.
Last edited by sapator; Feb 15th, 2024 at 10:27 AM.
ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·
-
Feb 15th, 2024, 02:47 PM
#2
Re: MSSQL update table with million of rows.
Huh?
if your table has million of rows, but the update should only affect a few hundreds, why don't you prefilter the basedata?
use a cte to prefilter, then update tab with an inner join to the cte
Last edited by Zvoni; Feb 16th, 2024 at 02:19 AM.
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad
-
Feb 15th, 2024, 04:45 PM
#3
Re: MSSQL update table with million of rows.
Ye it's little blur for me.
Should I add an index to "pre filter" should that impact performance?
Also what do you mean use a CTE? Do a select on the CTE and then and update on it?
The above example is viable? Is it more on doing update to actually million of rows rather than a distinct set of found rows?
ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·
-
Feb 15th, 2024, 04:52 PM
#4
Re: MSSQL update table with million of rows.
Yeah, it probably works ... I've never used a top in an update like that ... In the past when I've wanted to batch things, I'd put the IDs into a temp table, then use something like "where id in (select top 10000 id from @tempTbl)" and then remove ids from the temp table, or somehow mark them as processed...
-tg
-
Feb 15th, 2024, 05:08 PM
#5
Re: MSSQL update table with million of rows.
Update milions or update hundreds? Anyway, the volume does not seem that mich until you start speaking of billions.
Try running this from a single UPDATE statement and if it does not finish in a couple of seconds give it couple of minutes - you can always cancel it midflight and the RDBMS will rollback the modifications.
-
Feb 16th, 2024, 02:24 AM
#6
Re: MSSQL update table with million of rows.
 Originally Posted by sapator
Ye it's little blur for me.
Should I add an index to "pre filter" should that impact performance?
Also what do you mean use a CTE? Do a select on the CTE and then and update on it?
The above example is viable? Is it more on doing update to actually million of rows rather than a distinct set of found rows?
Something like this (MS-SQL-Syntax not withstanding. This is for SQLite)
Aircode
Code:
WITH
CTE AS (SELECT ID FROM tab
WHERE tab.Param1='abc' AND tab.Param2=123 AND
tab.Value<>'abc1' /*COLLATE Latin1_General_100_BIN2*/)
UPDATE tab
SET tab.Value='abc1'
FROM tab
INNER JOIN CTE ON CTE.ID=tab.ID
or tg's way with IN-Operator (i admit, i don't like the IN-Operator used this way)
Code:
UPDATE tab
SET tab.Value='abc1'
WHERE tab.ID IN
(SELECT ID FROM tab
WHERE tab.Param1='abc' AND tab.Param2=123 AND
tab.Value<>'abc1' /*COLLATE Latin1_General_100_BIN2*/)
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad
-
Feb 16th, 2024, 03:14 AM
#7
Re: MSSQL update table with million of rows.
Thanks.
Well i still got some thoughts about the initial query, if it actually doing update to actually million of rows rather than a distinct set of found rows, meaning does it worth running at a couple of hundred of update into a million or rows or only when the rows exceed hundreds?
Also, philosophically speaking, I'm not sure if cancelling an update will rollback the rows on MSSQL, maybe on a being tran? Or maybe it will make it worse as the tran will stay open? But cancelling when running I'm not sure at all that rollbacks.
ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·
-
Feb 16th, 2024, 03:44 AM
#8
Re: MSSQL update table with million of rows.
 Originally Posted by sapator
Also, philosophically speaking, I'm not sure if cancelling an update will rollback the rows on MSSQL, maybe on a being tran? Or maybe it will make it worse as the tran will stay open? But cancelling when running I'm not sure at all that rollbacks.
Practically every DML statement in MSSQL is wrapped into its own (implicit) transaction. Conceptually think of an UPDATE/INSERT/DELETE like being a batch of
BEGIN TRAN
UPDATE ...
COMMIT
For the last commit statement there is even an option SET IMPLICIT TRANSACTIONS which controls if it's executed or skipped. When skipped every DML opens a transaction which stay open until you send explicit COMMIT (or ROLLBACK) which is the default mode of operation in Oracle.
Anyway, conceptually the mini-transaction on each UPDATE/INSERT/DELETE means that canceling the DML (e.g. pressing stop button in SSMS) rolls all the modifications back. It's not possible to have half the table updated and the rest of the modification getting canceled by mistake.
Another consequence is that if you have triggers on the base table these are executed within the scope of this mini-transaction and can effectively rollback the modification which happened before them being executed.
Another consequence is that if you use manual transaction management (BEGIN TRAN/COMMIT) within base table triggers the @@TRANCOUNT will reach 2 as there is a separate mini-transaction happening.
Another consequence is that if in ADO/ADO.NET you execute BeginTran method on the connection it actually sends SET IMPLICIT TRANSACTIONS ON statement and the transaction opening is delayed until a DML is executed i.e. if after BeginTran you execute only SELECTs no transaction will be opened and as a result your solution will be more scalable at DB level :-))
cheers,
</wqw>
Last edited by wqweto; Feb 16th, 2024 at 03:49 AM.
-
Feb 16th, 2024, 04:27 AM
#9
Re: MSSQL update table with million of rows.
By what you say and read it is true. By from what I saw, maybe not 100% .
We had a long query running and cancelled and found out that some rows changed. But I'm thinking it might either did not get the command as it was run for a remote server or cancel did actually fail. Also maybe, maybe we closed the GUI as it was stuck but that is a maybe.
So anyhow, yes you are correct but I have seen at least one fail and that has stuck to my mind since I never rely on cancel on update and insert. There might first be a long scream "OH SHT!" even if I hit cancel. So I try to tran everything and on inserts and updates check 1-2-3-4-5-6-7-8-9 and 10 times before I go on.
ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·
-
Feb 16th, 2024, 04:30 AM
#10
Re: MSSQL update table with million of rows.
Also now that I think of it.... I also saw that on a delete. It did not rollback.
ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·
-
Feb 16th, 2024, 07:56 AM
#11
Re: MSSQL update table with million of rows.
 Originally Posted by sapator
Also now that I think of it.... I also saw that on a delete. It did not rollback.
It always does unless the statement is completed before Stop button is clicked/handled.
What I often find useful is to wrap DML in
BEGIN TRAN
UPDATE ...
SELECT ... -- to check modifications
ROLLBACK
--COMMIT
When I'm satisfied with result I just comment the ROLLBACK out and uncomment the COMMIT. This way even if SSMS crashes or connection is lost without explicit COMMIT the transaction will sooner or later get ROLLED BACK.
Sooner is when the connection is dropped (SSMS crashes), later is when the server is restarted (MSSQL service crashed) exactly when db's transaction log is recovered.
cheers,
</wqw>
-
Feb 16th, 2024, 08:23 AM
#12
Re: MSSQL update table with million of rows.
After some incidents in the past I always use Tran.
I even go further putting begin tran and update on the same line as if I accidentally select a specific portion of code and don't select the begin tran it will update the transaction without rollback,
ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ
πλάγχθη, ἐπεὶ Τροίης ἱερὸν πτολίεθρον ἔπερσεν·
-
Mar 6th, 2024, 09:57 AM
#13
New Member
Re: MSSQL update table with million of rows.
Have you ever encountered issues with SQL Server Management Studio (SSMS) not rolling back transactions properly, especially during updates or deletes?
-
Mar 6th, 2024, 11:57 AM
#14
Re: MSSQL update table with million of rows.
Is that question related to this thread, or is it a general question about SQL Server?
My usual boring signature: Nothing
 
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|