Duplicate Row Removal Solution

L

lists

I have an issue where I'm trying to remove duplicate rows from a table
(leaving one so that it is unique) and then another issue where I'm
trying to remove all duplicate rows in a table such that there are no
rows containing that data left.

The spread sheet consists of rows like this:

Col 1 Col 2 Col 3 Col 4
Stuff Stuff Something 1
Stuff Stuff Something 1
Stuff Stuff Stuff 0
Stuff Stuff Nada 1
Stuff Stuff Nada 1
Stuff Stuff Nada 1
Stuff Stuff Stuff 0
Stuff Stuff Stew 0

In this case Column 3 is the one that needs testing to see if they are
dups, all other columns are irrelavent. Basically what I was thinking
was, in terms of an algorithm:

Mark all duplicates with 1 whether they are above or below (thus
needing the OR statement):
=(if below == above, 1 OR if above == below, 1) Then delete all rows
with 1 in that Column 4.

The standard if that marks all but last duplicate could be:
"=if(A2=A1, 1, 0)" After this things really start getting fuzzy as
I'm not familiar enough with programmatically working with Excel.

I'm at a loss as to how to do this in Excel or if there would be an
even better way of doing this. I would appreciate it much if anyone
can demonstrate how this would be accomplished.
 
P

Patrick Molloy

method is easy enough. you need to sort by the column
that you test first. maybe you'll need to reset to the
original order following your cleanup.

1) add a column and number each row
2) sort by the column to be testes
3) remove duplicates
4) restore the order

so add a standard code module and copy this:

Option Explicit
Sub Test()
Remove_Dupes 3
End Sub
private Sub Remove_Dupes(testcol As Long)
Dim col As Long
Dim lastrow As Long
Dim thisrow As Long

' get the last column, then
' add the row numbers
col = Range("A1").End(xlToRight).Column + 1
' get the last row
lastrow = Range("A1").End(xlDown).Row

' add a column fro the original row order
With Range(Cells(1, col), Cells(lastrow, col))
.Formula = "=Row()"
.Value = .Value
End With

' sort the table by the test column
With Range(Cells(1, 1), Cells(lastrow, col))

.Sort Cells(1, testcol)
' remove duplicate
For thisrow = lastrow To 2 Step -1

If Cells(thisrow, testcol).Value = _
Cells(thisrow - 1, testcol).Value Then
Rows(thisrow).Delete
End If
Next

'restore whats left to the original order
.Sort Cells(1, col)

End With
End Sub

HTH
Patrick Molloy
Microsoft Excel MVP
 
L

lists

Excellent help. Thank you so much. That works like a charm. Now, I
do apologize for my lack of knowledge in VB and Excel in this vein,
but I would like to carry this a little further. Duplication removal
is perfect, however what would I do if I need to not only remove the
duplicates, but the originals as well such that the original and all
duplicates (of that data column since we are not comparing entire row
contents but just that one cell) of that cell in all rows would be
removed?

Thanks again!!!
 
I

Inabus

Got a question,
I have imported this code myself and it works great however it doesn
do "exactly" what I want.

I have the 3 columns Date, IP and Hostname. Your macro will only remov
a duplicate if the Date, IP and Hostname are all duplicated.

The problem I face, however, is that I want to remove the duplicate ro
based on the Hostname column not all 3.

Is that possible using your code??
 
P

...Patrick

This maybe ...
try on a sample file

Sub remDoublons()
Dim Mcell, cell, i
Dim Lastr
Lastr = Range("C65000").End(xlUp).Row
For i = Lastr To 2 Step -1
Mcell = Cells(i, 3).Value
If Mcell = Cells(i - 1, 3) Then
Cells(i, 3).EntireRow.Delete
End If
Next
End Sub
 
I

Inabus

Found another macro after a bit of searching that does exactly as
want. So therefore ignore this!

Ta,
Pau
 
Top