Wednesday 5 August 2020

MSSQL Remove duplicate Rows

  1. Select the duplicate key values into a holding table. For example:

        SELECT col1, col2, col3=count(*)
        INTO holdkey
        FROM t1
        GROUP BY col1, col2
        HAVING count(*) > 1
  2. Select the duplicate rows into a holding table, eliminating duplicates in the process. 
    For example:
           SELECT DISTINCT t1.* 
           INTO holddups
           FROM t1, holdkey 
           WHERE t1.col1 = holdkey.col1 
           AND t1.col2 = holdkey.col2
  3. At this point, the holddups table should have unique PKs, however, this will not be the case if t1 had        
    duplicate PKs, yet unique rows (as in the SSN example above). Verify that each key in holddups is unique,
    and that you do not have duplicate keys, yet unique rows. If so, you must stop here and reconcile 
    which of the rows you wish to keep for a given duplicate key value. For example, the query:
           SELECT col1, col2, count(*)
           FROM holddups
           GROUP BY col1, col2
    should return a count of 1 for each row. If yes, proceed to step 5 below. 
    If no, you have duplicate keys, yet unique rows, and need to decide which 
    rows to save. This will usually entail either discarding a row, or creating
    a new unique key value for this row. Take one of these two steps for each 
    such duplicate PK in the holddups table.
  4. Delete the duplicate rows from the original table. For example:
         DELETE t1
         FROM t1, holdkey
         WHERE t1.col1 = holdkey.col1
         AND t1.col2 = holdkey.col2  
  5. Put the unique rows back in the original table. For example:
         INSERT t1 SELECT * FROM holddups
Another Way:
delete from table1
FROM table1 t1, table1 t2

No comments:

Post a Comment

How to delete Local_lvm storage and resize local storage to use full disk space?

1) Delete the Local_lvm from othe proxmox interface . "datacenter" > Storage. Select the local_lvm storage and click "Remo...