Complex Formula to Find Doubles

S

Spike9458

Hi All,

I have been working with Excel 2003 and Chip Pearsons website to get some
of the formula help I need. I got a lot done I think, but here is where I am
at. I have two spreadsheets with contact data in them. One list is heavily
populated with email addresses, the other is not. I need to be able to
combine the two lists into one, and get rid of the duplicate entries where
each has the same email address. In the instance where a contact in one list
has a mailing address and an email address, and only mailing address in the
other, I need to be able to get rid of the one with no email address.

For instance:

Sheet1
ColA Column B
Column C
1 Joe Smith 123 Barnyard Rd Anytown, ST, 12345 (e-mail address removed)
2 Jane Doe
3 Tom Jones 444 Anystreet, Anytown, ST, 12345 (e-mail address removed)

Sheet2
1 Joe Smith 123 Barnyard Rd Anytown, ST, 12345 (e-mail address removed)
2 Jane Doe 234 Main St Anytown, ST, 12345 (e-mail address removed)
3 Tom Jones 444 Anystreet, Anytown, ST, 12345

Note that in Sheet1, Jane Doe has not street address or email address, and
in Sheet 2 Tom Jones has no email address.



After combining the two sheets into one - Sheet3 below, how do I weed out
the Jane Doe from row2, and the Tom Jones from row6?

Sheet3
ColA Column B
Column C
1 Joe Smith 123 Barnyard Rd Anytown, ST, 12345 (e-mail address removed)
2 Jane Doe
3 Tom Jones 444 Anystreet, Anytown, ST, 12345 (e-mail address removed)
4 Joe Smith 123 Barnyard Rd Anytown, ST, 12345 (e-mail address removed)
5 Jane Doe 234 Main St Anytown, ST, 12345 (e-mail address removed)
6 Tom Jones 444 Anystreet, Anytown, ST, 12345

One of my spreadsheets has 6500 entries, and the other has 4400, so you can
understand the need to somewhat automate this.

Thanks in advance,

--Jim
 
M

Max

Try this on a *spare* copy of your file ..

In Sheet3,

Insert a new header row

Put in D2: =AND(A2<>"",OR(B2="",C2=""))
Copy D2 down

Then do a Data > Filter > Autofilter on col D,
select TRUE from the droplist in D1
This will filter out the lines you want to get rid of

Select all the filtered lines, do a right-click > Delete
 
S

Spike9458

Hi Max,

Thanks, I will try that. There is one more test I need to apply to the
list.

Per my last post, I should have realized before and apologize for the
inconvenience, but the two spreadsheets contain data that is about 6 months
apart in age. There is a column with expiration dates of each contacts
license, and I need to keep the more recent contact information if there is
a duplicate. So the additional column would be Column D (there are a lot of
other columns with various other data, but these are the key columns.

Sheet3
ColA Column B
Column C Column D
1 Joe Smith 123 Barnyard Rd Anytown, ST, 12345 (e-mail address removed)
2007-12-06
2 Jane Doe
2005-08-26
3 Tom Jones 444 Anystreet, Anytown, ST, 12345 (e-mail address removed)
2007-04-20
4 Joe Smith 123 Barnyard Rd Anytown, ST, 12345 (e-mail address removed)
2005-12-06
5 Jane Doe 234 Main St Anytown, ST, 12345 (e-mail address removed)
2007-08-26
6 Tom Jones 444 Anystreet, Anytown, ST, 12345
2005-04-20

My thoughts are to create a work column, and concatenate the name with the
email address to create unique entries. Jane Doe would be in the list once
with and once without the email address. Then copy the values to another
column, and delete the first work column. Then do a sort of the entire sheet
(about 10800 rows) with the first criteria being the column with names and
emails combined, secondary would be the dates. Not sure if it would work, or
if a formula in a filter or something might be more effective.

Thank-you for your thoughts and ideas.

--Jim


: After line:
: > Select all the filtered lines, do a right-click > Delete
:
: Remove the autofilter
: --
: Max
: Singapore
: http://savefile.com/projects/236895
: xdemechanik
: ---
:
:
 
M

Max

Tinker with this ..

Sample construct at:
http://cjoint.com/?bEdUHoeDob
Spike9458_wks_1.xls

In a copy of Sheet3,

Assume source data in cols A to D, from row1 down, viz.:
Name, Add, Email, Contract Date

Put
In K1: =IF(C1="","",D1-ROW()/10^10)
In L1: =INDEX(A:A,MATCH(LARGE($K:$K,ROW()),$K:$K,0))
Copy L1 to O1

In P1: =IF(ISERROR(L1),"",IF(COUNTIF($L$1:L1,L1)>1,"",ROW()))
In Q1:
=IF(ISERROR(SMALL($P:$P,ROW())),"",
INDEX(L:L,MATCH(SMALL($P:$P,ROW()),$P:$P,0)))
Copy Q1 to T1

Select K1:T1, fill down till the last row of source data
Format col T as dates to taste

Cols Q to T would return the desired results,
i.e. only the most recent unique names with email ids,
all neatly bunched at the top.

If necessary, select cols K to T,
do an "in-place" : Copy > Paste special > Values > OK
to kill the formulas / freeze the values
 
M

Max

Select K1:T1, fill down till the last row of source data

As a precaution, and for improved performance, we could set the calc mode to
Manual before proceeding with the above fill down. Click Tools > Options >
Calculation tab, check "Manual" > OK. Then do the fill. Press F9 to
re-calc, and when calculations are complete, kill all the formulas, then
re-set the calc mode back to "Automatic".
 
S

Spike9458

Hi Max,

WOW, this is great! I will post back once I get it into my spreadsheet.

--Jim

:> Select K1:T1, fill down till the last row of source data
:
: As a precaution, and for improved performance, we could set the calc mode
to
: Manual before proceeding with the above fill down. Click Tools > Options: Calculation tab, check "Manual" > OK. Then do the fill. Press F9 to
: re-calc, and when calculations are complete, kill all the formulas, then
: re-set the calc mode back to "Automatic".
: --
: Max
: Singapore
: http://savefile.com/projects/236895
: xdemechanik
: ---
:
:
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top