Statistics

Total Posts: 34
This Year: 0
This Month: 0
This Week: 0
Comments: 174


RSS 2.0   SocialTwist Tell-a-Friend


Admin

Sign In

Navigation


Recent Posts


On this page....

How to eliminate duplicate records from a database table?

Archives

 Full Archives By Category
 2007 Calendar View
<December 2008>
SunMonTueWedThuFriSat
30123456
78910111213
14151617181920
21222324252627
28293031123
45678910

Categories

CDOSYS (1) Classic ASP (10) Command Line (2) Databases (16) Excel (1) HTML (1) IIS (10) Indexing Service (1) Internet Explorer (7) Media Streaming (1) MS.Net (2) SQA (7) SQL Server (16) Windows OS (2)

Blogroll - Fav Blogs


Acknowledgments

DasBlog Theme Design by: Tom Watts
E-mail: Send mail to the author(s)
Theme Image by: dreamLogic

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.

Technology Blog

Author: Akash Heranjal

Description of the problem:

A database table by name MyTable has 3 data fields MyID, MyName, MyAddress.
• MyID is Identity and primary key (PK) with Int data type
• MyName is a VarChar(200) data type
• MyAddress is TEXT data type

Accidently, the PK and Identity is removed and there are duplicate records in the table. Now we need to introduce the PK back on MyID by cleaning the database table. There are 10,000,000 records in the table and we really do not know which record has got duplicated and how many times. The focus is to eliminate duplicate records from the database table and introduce the PK back on MyID field.

Solutions/Approach:

• Introduce an additional column by name RecID and make it Identity column
• Run the below query
Delete from MyTable Where RecID Not In
(Select Min(RecID) from MyTable Group By MyID)


General Syntax:
Delete from [TableName] Where [NewIdentityCol] Not In
(Select Min([NewIdentityCol]) from [TableName] Group By [ColNameHavingDuplicate])


• Remove the new column RecID
• Introduce Identity and PK back on MyID field