[RESOLVED] Delete Query Slow and Optimization In SQL Server
Hi All,
I have this delete query called in a C# that will delete records based on DataID. DataID is GUID in the DB. The fact is it is super slow (30 mins-40 mins)
I manage to improve it using batch delete concept. Took only 9 mins. to finish.
Code:
Query.AppendLine(" DECLARE @Rowcount INT ");
Query.AppendLine(" SET @Rowcount = 1");
Query.AppendLine(" WHILE @Rowcount > 0 ");
Query.AppendLine(" BEGIN");
Query.AppendLine(" DELETE Top (20) FROM [dbo].[tblAttachments] WHERE");
Query.AppendLine(String.Format(" DataID Not In({0}) ", DataIDList));
Query.AppendLine(" SET @Rowcount = @@ROWCOUNT");
Query.AppendLine(" END");
Is there a way to improve the 2nd query(batch delete) to lessen the time of delete?
Other Details:
* DB is MSSQL
* Added Index to DataID
* Number of Rows: 2,687
* Expected Rows after delete: 394
* The table has a binary field that stores the attachments. (huge attachments)
* DataIDList is a string builder variable with 68 GUID DataID.
'67A69794-F41B-4DDE-8C87-B09585ED5BBB','2285E0B3-0FD7-46FC-A768-66EB03BCF2F7','6E2FFDDC-9999-4680-B064-A1CE4A81DFB4', and so on....
An obscure body in the SK system. The inhabitants call it Earth
Posts
7,957
Re: Delete Query Slow and Optimization In SQL Server
I'm guessing the problem is that "In" Where clauses are non-sargable. Do you have an index on DataID? If so, check the execution plan and you'll probably see the index is not being used. I'm willing to bet you'll see a full table scan or possibly a full scan on the clustered index.
There would be a few possible solutions to that and selecting the right one will be very situational but I think my first punt would be something like this:-
At the moment you're using a NOT IN so you know the values you wan to keep. Instead, can you get at the values you don't want to keep - the ones you want to delete?
Put those into a temp table with an index on it
Delete from the primary table based on a join:-
Code:
Delete A
From #IDsToDelete D
Join tblAttachments A
on D.DataID = A.DataID
This will be able to use the indexes on both tables and should be fast.
The issue is going to be how to get those ids into a temp table in the first place. First I'd try the simple approach:-
Code:
Select DataID
Into #IDsToDelete
From tblAttachments
Where DataID not in ({0})
I'm not really expecting that to perform well though. If the Where clause didn't perform in your original delete statement I doubt it'll perform well for a Select Into either. so the next thing I'd try is inserting those "Keep" values into a temp, table of their own and using an Except to populate #IDsToDelete. I'm not sure what that will look like in your VB code but the sql syntax you're aiming to replicate is a multiple insert:-
Code:
Insert into #IDsToKeep
Values (1),(2),(3)…
This syntax lets you insert several rows from a set of values as a single statement.
You then populate #IdsToDelete based on an Except:-
Code:
Select DataID
Into #IDsToDelete
From tblAttachments
Except
Select DataID
From #IDsToKeep
I think that'll probably be the fastest way of achieving it.
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
Re: [RESOLVED] Delete Query Slow and Optimization In SQL Server
I would have beat FD to the punch here - but the forum required me to break my post into individual Lego blocks and post them as pieces - forcing you to build the Millennial Falcon from memory - argh!
Don't step on a block in the dark - ouch!
*** Read the sticky in the DB forum about how to get your question answered quickly!! ***
Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".