How can I remove duplicate rows? ~ PHP Freelancer

Saturday, April 6, 2013

How can I remove duplicate rows?

How can I remove duplicate rows?

Some Days ago , was searching for the best way to remove duplicate rows from a fairly large table (i.e. 300,000+ rows).
The rows of course will not be perfect duplicates because of the existence of the RowID identity field. and i found two good solutions.

Lets say a Table

MyTable
-----------
RowID int not null identity(1,1) primary key,
Col1 varchar(20) not null,
Col2 varchar(2048) not null,
Col3 tinyint not null

Solution 1 :

--Assuming no nulls, you GROUP BY the unique columns, and SELECT the MIN (or MAX) 
--RowId as the row to keep. Then, just delete everything that didn't have a row id:


DELETE MyTable 
FROM MyTable
LEFT OUTER JOIN (
   SELECT MIN(RowId) as RowId, Col1, Col2, Col3 
   FROM MyTable 
   GROUP BY Col1, Col2, Col3
) as KeepRows ON
   MyTable.RowId = KeepRows.RowId
WHERE
   KeepRows.RowId IS NULL

--Use
CONVERT(uniqueidentifier, MIN(CONVERT(char(36), MyGuidColumn))) 
--instead of MIN(RowId) if you have a GUID instead of an integer

Solution 2 :

delete from dupes
from MyTable dupes, MyTable fullTable
where dupes.dupField = fullTable.dupField 
  and dupes.secondDupField  = fullTable.secondDupField 
  and dupes.uniqueField > fullTable.uniqueField

Hope you have enjoyed to read this post,
let me know if you have more suitable suggestions.

you might be more interested in MSSQL, database, SQL, TIPS AND TRICS, MySQL

Good Day [/ Night] ! Happy Google + ing !

PHP Freelancer

Saturday, April 6, 2013

How can I remove duplicate rows?

0 comments:

Post a Comment

About Bhavin

Google+

Popular Posts

Recent Posts

Popular

About Me

Blog Archive

Top Stories

Contact Me

Pages

Total Pageviews

Categories Cloud

Recommended Slide

Follow Me @ FB

Get Daily Updates

Translate

About The Author

Follow Me On : Google & FaceBook

Search This Blog

@Stack