How to delete duplicates?

K

kevs12

Version: 2004 Operating System: Mac OS X 10.6 (Snow Leopard) Processor: Intel I have a Bulk email software that can easily identify duplicates in a file.
So I know this can be done
What I'd like to do -- is this possible? (and maybe all within Excel)

Find duplicate emails and then delete the dupes out?

(of course I wonder which of the two Excel would delete -- maybe it would do that randomally? ) thanks!
 
P

Phillip Jones, C.E.T.

If you have a Column or row set up as emails then do a Sort on the row
or column who's heading is emails choose the whole sheet go to Data >
Sort then choose the heading email. hit sort and it if you don't choose
everything the it will just sort the row or column so named and leave
the rest positioned as they are.
 
K

kevs12

Thanks Phil,
But I already know how to sort a column.
This post is about identifying quickly the duplicates and then having Excel possible delete the duplicates.
 
J

John McGhie

Hi Kevs:

Excel has this function built-in. Read the instructions carefully: it's a
simple process, but like changing a film in a camera, you have to do it
exactly right or it doesn't work.

There are two ways to do it: this is the simplest one:

http://office.microsoft.com/en-us/excel/HA010346261033.aspx

I prefer this one:
http://support.microsoft.com/kb/262277

Hope this helps


Version: 2004 Operating System: Mac OS X 10.6 (Snow Leopard) Processor: Intel
I have a Bulk email software that can easily identify duplicates in a file.
So I know this can be done
What I'd like to do -- is this possible? (and maybe all within Excel)

Find duplicate emails and then delete the dupes out?

(of course I wonder which of the two Excel would delete -- maybe it would do
that randomally? ) thanks!

--

The email below is my business email -- Please do not email me about forum
matters unless I ask you to; or unless you intend to pay!

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410 | mailto:[email protected]
 
K

kevs12

Thanks John!
Why do you prefer the 2nd one. If looks more convoluted, having to select range of records -- which I actually not sure I know how to do.
 
K

kevs12

On second issue, it could be I forget to click "email me when replies are posted". If you forget, it there a way to check that without having to post again? thanks
 
K

kevs12

John,
I'm trying you seceond one and it's working good. thanks!
Big problem though.
If I have a column which has blank fields (which many of mine do), then the range thing does not seem to work, you get, "the extract range has a missing or illegal field name"
-----
For instance most of my workbooks have a column called notes.
Some cells say, "spoke to Fred, call back on 4/10", and many other cells are empty. What to do? I can only get rid of dupes if if and every cell in the workbook is filled? thanks!
 
K

kevs12

Also, John, the current solution is good, but it blends everything.
what if I have list #1, then get list #2. But I want to keep them as seperate list, but the new one #2 I just got -- I just want to extract from it whatever is in list #1?
 
J

John McGhie

Because of your second question, which is "How do you prepare two lists, the
second of which includes some of the first".

If you filter the list in place, then you have to copy and paste it
somewhere else.

If you filter the list to a different location, you can then filter it
again, with different criteria, and place it somewhere else.

Sorry, I have no idea how to work with the web server version of this forum,
I don't use it.

Cheers

Thanks John!
Why do you prefer the 2nd one. If looks more convoluted, having to select
range of records -- which I actually not sure I know how to do.

--

The email below is my business email -- Please do not email me about forum
matters unless I ask you to; or unless you intend to pay!

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410 | mailto:[email protected]
 
K

kevs12

John,
thanks
1) not clear on how to filter a list, without combining it with the list you are comparing it to. You link was great, but that procedure takes everything, combines all good emails and spits it out as one list. How do you take a new list, cross reference the duplicates with a previously bought list, get those dupes out of the new list that were in the old list, yet leave the new list intact without bringing in non duplicates from the old list.
2) problem of selecting range: Is seems if you have a list of 5000 names, and you forgot to get a dozen last names, you cannot select the range! Excel says there is a missing or illegal field. That seems nuts? no? Any ideas on that -- thanks!
 
J

John McGhie

Hi Kevs:

OK, you're going to have to do this the hard way :)

It's a completely different procedure using MATCH.

First construct your two lists, let's put them in columns A and C

Name the first list OldSuspects and the new list UsualSuspects.

Now in the column to the right of OldSuspects you add a forumula and fill
down: =IF(ISNA(MATCH(A1,UsualSuspects,0))," ","DUP")

And in column to the right of UsualSuspects you add a formula and fill
down: =IF(ISNA(MATCH(C3,OldSuspects,0))," ","DUP")

That will add "DUP" to any cell where the email address to the left matches
any address in the other list.

Add a conditional format to set the cell pattern to a bright colour wherever
"DUP" appears.

Hope this helps

John,
thanks
1) not clear on how to filter a list, without combining it with the list you
are comparing it to. You link was great, but that procedure takes everything,
combines all good emails and spits it out as one list. How do you take a new
list, cross reference the duplicates with a previously bought list, get those
dupes out of the new list that were in the old list, yet leave the new list
intact without bringing in non duplicates from the old list.
2) problem of selecting range: Is seems if you have a list of 5000 names, and
you forgot to get a dozen last names, you cannot select the range! Excel says
there is a missing or illegal field. That seems nuts? no? Any ideas on that --
thanks!

This email is my business email -- Please do not email me about forum
matters unless you intend to pay!
 
K

kevs14

John,
I don't think that will work, because my list has first name, last name, and an email.

And what about selecting ranges? I don't get it. I could have a list of a thousands names and emails, but if I'm missing a few first or last names, then excel wigs out and says "missing or illegal fields"and can't select the range??
Am I correct in that for selecting a range every single cell has to have something in it? Such a pity if this is true because in the real world a workbook may have 5-10 columns many of which as missing names or notes etc. And if this is the case one cannot select a range and filter for duplicates.
 
J

John McGhie

Hi Kevs:

You don't think which one will work? I sent you three methods. Please
don't make me do all the work, or I will get sick of the game and go watch
TV :)

They all work, I tested them.

You only care if one email appears in the other list: MATCH will tell you
that. You do not care whether the names appear or not.

You can change your column assignments to fit your data. You only look in
the column that contains the email addresses. :)

With a little more work, you can add the ADDRESS function to the formula to
tell you WHERE the duplicate is in the other list:

=IF(ISNA(MATCH(A1,UsualSuspects,0)),"",ADDRESS(MATCH(A1,UsualSuspects,0),4,F
ALSE,4,))

No, you are not correct: if you use the methods I sent, it doesn't matter
whether a given cell has content or not. However, when you are returning a
result to another function, you need to be aware of the possibility that a
return may be blank and handle it appropriately if it is.

Cheers


John,
I don't think that will work, because my list has first name, last name, and
an email.

And what about selecting ranges? I don't get it. I could have a list of a
thousands names and emails, but if I'm missing a few first or last names, then
excel wigs out and says "missing or illegal fields"and can't select the
range??
Am I correct in that for selecting a range every single cell has to have
something in it? Such a pity if this is true because in the real world a
workbook may have 5-10 columns many of which as missing names or notes etc.
And if this is the case one cannot select a range and filter for duplicates.

--

The email below is my business email -- Please do not email me about forum
matters unless I ask you to; or unless you intend to pay!

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410 | mailto:[email protected]
 
K

kevs14

John, too convoluted, you replies work, but I would need the DVD or an in house tranier to show me, to convoluted for the layman to understand via this forum.
Way too hard to follow.
Going to start a new thread on the range issue which I still don't get.
 
K

kevs14

PS, I did learn the way to filter combining everything, but not how to keep two list seperate. thanks.
 
K

kevs14

Guys, interesting side story:
I asked a photo agency to send me 350 images. (I sent them an excel sheet)
They sent 349.
I could easily generate an excel file of the 349.
Is there a way to do a filtration to find which one the did not send? thanks!
 
J

John McGhie

Yes. Use the MATCH function.

Create a worksheet with your first list in one column, and the second list
in the other.

Let's say that Column A contains the first list that you sent them, and
Column B contains the second list, that they sent you.

In Column C, cell C1, paste the following formula:
=IF(ISNA(MATCH(A1,B:B,0)),"Missing","")

Fill down.

All of column C will be blank except the one beside the photo in Column A
that is not in Column B, which will say "Missing".

Cheers

Start a new Thread so we keep the questions separate.


Guys, interesting side story:
I asked a photo agency to send me 350 images. (I sent them an excel sheet)
They sent 349.
I could easily generate an excel file of the 349.
Is there a way to do a filtration to find which one the did not send? thanks!

--

The email below is my business email -- Please do not email me about forum
matters unless I ask you to; or unless you intend to pay!

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410 | mailto:[email protected]
 
K

kevs14

John, thanks, will try to create new threads.
Tried, did not work, column C -- just opposite,
dragged formula down and from top on down, every cell says missing.
 
J

John McGhie

Well, I have not seen the data, so I can't comment.

You have the file names you sent in Column A? And the file names THEY sent
in Column B IN THE SAME FORMAT?

Check in Column C. The formula should say "=IF(ISNA(MATCH(A1,..." in Cell
C1, =IF(ISNA(MATCH(A2,..." in Cell C2, "=IF(ISNA(MATCH(A3, ..." in Cell C3
etc all the way down. If it is saying "A1" in all of them, you have
"copied" down instead of "fill" down.

The formula is doing a "This is equal to that" match: It searches the whole
of column B for the file name it finds on this row in Column A: it must
match exactly.

It sounds as though the file names do not match character-for-character.

Cheers

John, thanks, will try to create new threads.
Tried, did not work, column C -- just opposite,
dragged formula down and from top on down, every cell says missing.

This email is my business email -- Please do not email me about forum
matters unless you intend to pay!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top