PDA

Click to See Complete Forum and Search --> : [SQL] Elminating duplicate rows from a table


mendhak
Mar 13th, 2006, 03:59 AM
How to remove all duplicate rows from a SQL Server table


DECLARE @iErrorVar int,

@vchFirstname INT,

@iReturnCode int,

@vchAddress1 varchar(100),

@iCount int,

@chCount char(3),

@nvchCommand nvarchar(4000)


DECLARE DelDupe CURSOR FOR

SELECT COUNT(*) AS Amount,

liCompanyId

FROM CompanyDetails

GROUP BY liCompanyId

HAVING COUNT(*) > 1


OPEN DelDupe

FETCH NEXT FROM DelDupe INTO @iCount,

@vchFirstname

WHILE (@@fetch_status = 0)

BEGIN



SELECT @iCount = @iCount - 1

SELECT @chCount = CONVERT(char(3),@iCount)

--Now we can build our dynamic ROWCOUNT and DELETE statement:

SELECT @nvchCommand = N'SET ROWCOUNT ' + @chCount + ' DELETE FROM CompanyDetails WHERE liCompanyId = ' + CONVERT(VARCHAR(20),@vchFirstname)

EXEC sp_executesql @nvchCommand

FETCH NEXT FROM DelDupe INTO @iCount,

@vchFirstName

END

CLOSE DelDupe

DEALLOCATE DelDupe




Substitute 'CompanyDetails' with your table name and liCompanyId with the primary key field of that table.

Shuja Ali
Mar 13th, 2006, 04:36 AM
How about using a simpler version Delete From
TABLE1
Where
PKFIELD Not In (
Select
Min(PKFIELD)
From
Table1
Group By
COLUMN1) Replace TABLE1 with the table name, PKFIELD with the primary Key field and Column1 with the column you are checking.

mendhak
Mar 13th, 2006, 09:51 AM
Won't that end up deleting nothing, since the duplicate rows have the same pkfield anyways? :confused:

techgnome
Mar 13th, 2006, 10:08 AM
Then it's not a true pkfield is it? The pkfield should be enforcing the uniqueness of the row.... that's what makes it the pkey..... or so I thought...

-tg

Shuja Ali
Mar 14th, 2006, 01:32 AM
Won't that end up deleting nothing, since the duplicate rows have the same pkfield anyways? :confused:Then that Primary Key field will not be a true PK Field.

If you have duplicate records with different primary key then this query will delete those records based on the criteria given in the Group By Clause. :)

CVMichael
Jan 24th, 2009, 12:20 AM
This is how I do it:

DELETE t
FROM TABLE1 AS t
INNER JOIN (
SELECT MIN(PK_ID) AS MinPK_ID, MyDupData
FROM TABLE1
GROUP BY MyDupData
HAVING Count(*) > 1
) AS d ON t.MyDupData = d.MyDupData
WHERE d.MinPK_ID <> t.PK_ID

I think a join is much faster than a cursor, or "not in", though I did not test your solutions yet, but from experience I know that cursors are slow, and also "not in"