Saturday, April 6, 2013

How can I remove duplicate rows?

How can I remove duplicate rows?

How can I remove duplicate rows?

Some Days ago , was searching for the best way to remove duplicate rows from a fairly large table (i.e. 300,000+ rows).

The rows of course will not be perfect duplicates because of the existence of the RowID identity field. and i found two good solutions.

Lets say a Table

RowID int not null identity(1,1) primary key,
Col1 varchar(20) not null,
Col2 varchar(2048) not null,
Col3 tinyint not null

Solution 1 :

--Assuming no nulls, you GROUP BY the unique columns, and SELECT the MIN (or MAX) 
--RowId as the row to keep. Then, just delete everything that didn't have a row id:

FROM MyTable
   SELECT MIN(RowId) as RowId, Col1, Col2, Col3 
   FROM MyTable 
   GROUP BY Col1, Col2, Col3
) as KeepRows ON
   MyTable.RowId = KeepRows.RowId
   KeepRows.RowId IS NULL

CONVERT(uniqueidentifier, MIN(CONVERT(char(36), MyGuidColumn))) 
--instead of MIN(RowId) if you have a GUID instead of an integer

Solution 2 :

delete from dupes
from MyTable dupes, MyTable fullTable
where dupes.dupField = fullTable.dupField 
  and dupes.secondDupField  = fullTable.secondDupField 
  and dupes.uniqueField > fullTable.uniqueField

Hope you have enjoyed to read this post,
let me know if you have more suitable suggestions.

you might be more interested in MSSQL, database, SQL, TIPS AND TRICSMySQL

Good Day [/ Night] ! Happy Google + ing ! 



Post a Comment

Any Questions or Suggestions ?


Professional & Experienced Freelance Developer From India, Technologist, Software Engineer, internet marketer and Open Sources Developer with experience in Finance, Telecoms and the Media. Contact Me for freelancing projects.

Enter your email address:

Delivered by FeedBurner

PHP Freelancer India - Google+