Finding Duplicate Values

S

Scott Halper

I have the current dataset:

Month Fund Region Name
Jan A E SH
Jan B W FW
Feb C S WR
Feb C E SH
Feb C E SH

I have written the following formula to return the unique values in
the name column:

=SUM(IF(FREQUENCY(IF('Data Sheet'!$A$3:$A
$79=(CONCATENATE(CHOOSE(MONTH(TODAY()),"Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"),"-",YEAR(TODAY()))),IF('Data
Sheet'!$D$3:$D$79="FUND",IF('Data Sheet'!$P$3:$P$79="Region",IF('Data
Sheet'!$J$3:$J$79<>"",MATCH("~"&'Data Sheet'!$J$3:$J$79,'Data Sheet'!$J
$3:$J$79&"",0))))),ROW('Data Sheet'!$J$3:$J$79)-ROW('Data Sheet'!$J
$3)+1),1))

I want to have a formula that will return me the duplicate "Names" per
Month per Fund per Region. Basically its taking the formula above and
instead of it using frequency for unique values it would use some
function for duplicate values.

Thanks for the help.

Scott
 
B

Bernie Deitrick

Scott,

Whatever formula you could possibly come up with would be wickedly complicated. Better to use a
helper column of formulas, and use that with a filter to get the duplicated values. The only
question is whether you want to flag all of the duplicates or just the second or later repeats.

For example, with your example table in A1:D6, this formula in E2, copied to E3:E6:

=IF(SUMPRODUCT(($A$2:$A$6=A2)*($D$2:$D$6=D2))>1, "Duplicate","Unique")

will flag all duplicates, and

=IF(SUMPRODUCT(($A$2:$A2=A2)*($D$2:$D2=D2))>1, "Duplicate","Unique")

will flag only the second and on occurences.

HTH,
Bernie
MS Excel MVP
 
D

Domenic

Let R2 contain...

=CONCATENATE(CHOOSE(MONTH(TODAY()),"Jan","Feb","Mar","Apr","May","Jun","J
ul","Aug","Sep","Oct","Nov","Dec"),"-",YEAR(TODAY()))

Let S2 contain the fund of interest, such as C

Let T2 contain the region of interest, such as E

Then try the following formulas...

U2:

=SUM(IF(FREQUENCY(IF('Data Sheet'!$A$3:$A$79=$R$2,IF('Data
Sheet'!$D$3:$D$79=$S$2,IF('Data Sheet'!$P$3:$P$79=$T$2,IF('Data
Sheet'!$J$3:$J$79<>"",MATCH("~"&'Data Sheet'!$J$3:$J$79,'Data
Sheet'!$J$3:$J$79&"",0))))),ROW('Data Sheet'!$J$3:$J$79)-ROW('Data
Sheet'!$J$3)+1),1))

....confirmed with CONTROL+SHIFT+ENTER

V2, copied down:

=IF(ROWS($V$2:V2)<=$U$2,INDEX($J$3:$J$79,SMALL(IF(FREQUENCY(IF(('Data
Sheet'!$A$3:$A$79=$R$2)*('Data Sheet'!$D$3:$D$79=$S$2)*('Data
Sheet'!$P$3:$P$79=$T$2),IF('Data Sheet'!$J$3:$J$79<>"",MATCH("~"&'Data
Sheet'!$J$3:$J$79,'Data Sheet'!$J$3:$J$79&"",0))),ROW('Data
Sheet'!$J$3:$J$79)-ROW('Data Sheet'!$J$3)+1),ROW('Data
Sheet'!$J$3:$J$79)-ROW('Data Sheet'!$J$3)+1),ROWS($V$2:V2))),"")

....confirmed with CONTROL+SHIFT+ENTER.

Hope this helps!
 
S

Scott Halper

I tried Bernie's solutions, however, I get a column that shows if the
"Name" appears more than once it displays the word Duplicate, however,
I need to now count the amount of Duplicate Names that show up. I
think I can use an array for it, but just not sure how to set it up.

My new dataset:

Month Fund Region Name Occurance
Jan A E SH Unique
Jan B W FW Unique
Feb C S WR Unique
Feb C E SH Duplicate
Feb C E SH Duplicate

The result I'm looking for is for Feb, Fund C, Region E, that was one
duplicate name (SH).
 
B

Bernie Deitrick

Scott,

Change the formula in Cell E2 to

=IF(SUMPRODUCT(($A$2:$A2=A2)*($D$2:$D2=D2))=2, 1,0)

and put a header "Duplicates" into cell E1.

Then select the entire table, use Data / Pivot Table, and drag "Month", "Fund", "Region" to the row
field area, and drag "Duplicates" to the data field area. Set "Duplicates" to Sum, and you will get
a table of counts of duplicate names.

HTH,
Bernie
MS Excel MVP
 
B

Bernie Deitrick

Sorry, I forgot to mention that if you want to count the third and fourth duplicates as well, change
the =2 in the formula to >1

HTH,
Bernie
MS Excel MVP
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top