code to delete the row if the cells on specific column are unique

1k Views Asked by At

What I am trying to achieve is to create a vba code to completely delete the rows if the value in column C (Id) is unique. So in example below the rows 6 and 7 would be deleted since the 111115 and 111116 are not showing up more than once in this column C. Any help is welcome! Thanks a lot.

enter image description here

Code so far: (but not working yet)

Sub delete_not_duplicates()

Dim i As Integer, j As Integer, toDel As Boolean, theNum As Integer
i = 2

Do While Cells(i, 3).Value <> ""
    toDel = True
    theNum = Cells(i, 3).Value
    Do While Cells(j, 3).Value <> ""
        If  i <> j and Cells(j, 3) == theNum Then
            toDel = False
    Loop
    If toDel == true Then
       Rows(i).Delete
    Else
    i = i + 1
    End If
Loop


End Sub
2

There are 2 best solutions below

3
On BEST ANSWER

The general approach to do do this in a reasonable fast way is to

  1. Get your data into a Variant Array
  2. Loop the array, identifying unique values
  3. Build a range reference to rows to be deleted, but defer the deletion
  4. After the loop, delete all rows in one go

Sub demo()
    Dim rDel As Range, rng As Range
    Dim dat As Variant
    Dim i As Long, cnt As Long
    Dim TestCol As Long

    ' Avoid magic numbers
    TestCol = 3 ' Column C

    ' Reference the correct sheet
    With ActiveSheet
        ' Get data range
        Set rng = .Range(.Cells(1, TestCol), .Cells(.Rows.Count, TestCol).End(xlUp))

        ' Get data as a Variant Array to speed things up
        dat = rng.Value

        ' Loop the Variant Array
        For i = 2 To UBound(dat, 1)
            ' Is value unique?
            cnt = Application.CountIfs(rng, dat(i, 1))
            If cnt = 1 Then

                ' If so, add to delete range
                If rDel Is Nothing Then
                    Set rDel = .Cells(i, TestCol)
                Else
                    Set rDel = Union(rDel, .Cells(i, TestCol))
                End If
            End If
        Next
    End With

    ' Do the delete
    If Not rDel Is Nothing Then
        rDel.EntireRow.Delete
    End If
End Sub
1
On

I think the most efficient way would be:

  1. Initialize an empty HashSet< Integer> (or whatever generic type you want) which will represent all the unique entries of C (id), let's name it uniqueIdSet

  2. Iterate through the 2D array



    if(uniqueIdSet.contains(id)){
          //if the id was already seen before, it means it's not unique
          uniqueIdSet.remove(id);
    }
    else{
          //we haven't seen this id yet, add it to the unique set
          uniqueIdSet.add(id);
    }

  1. Iterate through the original array again and do:


    if(uniqueSet.contains(id)){
          //if the id is unique, remove it from the array.
          array.remove(currentRow);
    }

Depending on your implementation, you might not be able to remove from the array as you iterate through it. A way around it is initializing a copy of the original array and remove the respective row from there.