Programming Blog: MSSQL Remove duplicate Rows

Wednesday, 5 August 2020

MSSQL Remove duplicate Rows

Select the duplicate key values into a holding table. For example:

    SELECT col1, col2, col3=count(*)
    INTO holdkey
    FROM t1
    GROUP BY col1, col2
    HAVING count(*) > 1

Select the duplicate rows into a holding table, eliminating duplicates in the process. 
For example:
       SELECT DISTINCT t1.* 
       INTO holddups
       FROM t1, holdkey 
       WHERE t1.col1 = holdkey.col1 
       AND t1.col2 = holdkey.col2

At this point, the holddups table should have unique PKs, however, this will not be the case if t1 had        
duplicate PKs, yet unique rows (as in the SSN example above). Verify that each key in holddups is unique,
and that you do not have duplicate keys, yet unique rows. If so, you must stop here and reconcile 
which of the rows you wish to keep for a given duplicate key value. For example, the query:
       SELECT col1, col2, count(*)
       FROM holddups
       GROUP BY col1, col2
should return a count of 1 for each row. If yes, proceed to step 5 below. 
If no, you have duplicate keys, yet unique rows, and need to decide which 
rows to save. This will usually entail either discarding a row, or creating
a new unique key value for this row. Take one of these two steps for each 
such duplicate PK in the holddups table.

Delete the duplicate rows from the original table. For example:
     DELETE t1
     FROM t1, holdkey
     WHERE t1.col1 = holdkey.col1
     AND t1.col2 = holdkey.col2

Put the unique rows back in the original table. For example:
     INSERT t1 SELECT * FROM holddups

Another Way:

delete from table1

WHERE id IN (

SELECT t1.id

FROM table1 t1, table1 t2

WHERE t1.name = t2.name AND t1.id > t2.id

)

Programming Blog

Wednesday, 5 August 2020

MSSQL Remove duplicate Rows

No comments:

Post a Comment

How to add a right-click context menu option in Windows Explorer

Popular Posts

Translate