Count unique records?

K

Ken Johnson

Hi AntBlabby,
I agree, but there are times when you have to use fancy functions,
namely when the end user is not an Excel expert and would not be able
to use the advanced filter.
Ken Johnson
 
J

JE McGimpsey

I don't have time to do fancy functions, if I don't have to. The way
I'd do it is use the Advance Filter. I'd highlight the column, go to
Advance Filter, check the box for unique records, and have the results
go to some other column. Then, I'd highlight the results and the
number of rows is the number of unique entries. Not too elegant, but
it works really fast.

Unfortunately, it also needs to be repeated any time the data changes.

And if you wanted to use that number in subsequent formulae, you'd need
to add a COUNT() function...

But it's a fine method for a one-off...
 
J

Jason O

Hi Jason,
That formula never fails to stir the possoms.
I retired for the evening straight after posting my reply and missed
all the action.
JE was kept pretty busy while I was snoozing!
It's good to get JE's opinions on the workings of that formula. I'm
sure this thread will remain bookmarked on my computer for quite some
time.

I first saw the formula last Sept when Bob Phillips replied to a
similar request.
My immediate reaction was disbelief and confusion so I started a new
thread
"Can anybody explain to me how Bob Phillips' unique record counter
work?"
in microsoft.public.excel.programming

To save you going to the trouble of doing a search I've pasted Bob's
responses below.
Bob's website, xldynamic, has heaps of other SUMPRODUCT examples worth
looking at.

Ken,
Spookily, Richard Buttrey asked the same question this morning.
Here is my reply re-posted

Let's start by defining the range A1:A20 to talk specifics.
Bob,John,Bob,Bob,John,John,Bob,Bill,Bill,Max
or data in just A1:A10
The basic formula to count unique items is
=SUMPRODUCT(1/COUNTIF($A$1:$A$10,$A$1:$A$10))
The COUNTIF($A$1:$A$10,$A$1:$A$10) part of the formula builds an array
of
the number of occurrences of each item, in this
case{4;3;4;4;3;3;4;2;2;1}.
As can be seen, each occurrence of the repeated value is counted, so
there
are four occurrences of Bob in the array. There will always be the same

number of occurrences of value as the count of that value, unless two
or
more items are repeated the same number of times, in which case it will
be
some multiple of that count.
Thus the item that is repeated 4 times has 4 instances of that count,
dividing 1 by the count of 4, gives 0.25 4 times. The full array of
values
is
{0.25;0.333333333333333;0.25;0.25;0.333333333333333;0.333333333333333;0.25;0

.5;0.5;1}.

The item that repeats 4 times sums to 1. The item that repeats 3 times
also
sums to 1. It should be clear from this that every value works in the
same
way and sums to 1. In other words, 1 is returned for every unique item.
The
sum of these values becomes the count of unique items.
As our test range is A1:A20, and some of the items in A1:A20 are blank,

extending this formula to A1:A20 would return a #DIV/0! Error.
The reason for the error is blank cells in the full range A1:A20. Each
blank
cell returns a 0 value from the COUNTIF formula, which gives the
#DIV/0!
Error when divided into 1.
The solution to this is to force it to count the empty cells as well,
and
not return a zero. Adding &"" to the end of the COUNTIF formula forces
a
count of the blanks.
This addition on its own removes the #DIV/0! error, but will cause the
blanks to be counted as a unique item. A further addition to the
formula
resolves this by testing for those blanks. Instead of dividing the
array of
counts into 1 each time, adding the test creates an array of TRUE/FALSE

values to be divided by the equivalent element in the counts array.
Each
blank will resolve to FALSE in the dividend array, and the count of the

blanks in the divisor array. The result of this will be 0, so the
blanks do
not get counted.
--
HTH
Bob Phillips

(Bob,
Thanks for that, it's going to take me a while to work through it. It's
easily the most amazing bit of excel wizardry I've ever seen. Is it
your own invention?
Ken Johnson)


I wish I could lay some claim to that but I cannot. It is doubtful that
anyone can claim these things absolutely as so many contribute to their
evolution. I first saw it in a more simplistic form

=SUMPRODUCT(1/COUNTIF(A1:A20,A1:A20)

and it has evolved over the years to what I showed that day.

There are better ones out there IMO, although that does have the appeal
of
brevity. I saw one from Domenic a few weeks back that sums in
non-contiguous
cells that I really liked

=SUMPRODUCT(--ISNUMBER(MATCH(COLUMN(C1:N1)-COLUMN(C1)+1,{1,3,6,8,12},0)),--(
C1:N1>=1))

This sums C1,E1,H1,J1,N1 only. I had a solution to this problem, but
not as
elegant.

Here is one of mine for extracting the numbers from a delimited string
and
sum them. So if A1 contains 1,12,123 it will get the 1 and 123 out and
sum
them

=SUMPRODUCT(--MID(A1,FIND("~",SUBSTITUTE(B1&A1&B1,B1,"~",ROW(INDIRECT("1:"&L
EN(A1)-LEN(SUBSTITUTE(A1,B1,""))+1)))),
FIND("~",SUBSTITUTE(B1&A1&B1,B1,"~",ROW(INDIRECT("2:"&LEN(A1)-LEN(SUBSTITUTE
(A1,B1,""))+2))))-
FIND("~",SUBSTITUTE(B1&A1&B1,B1,"~",ROW(INDIRECT("1:"&LEN(A1)-LEN(SUBSTITUTE
(A1,B1,""))+1))))-1))

If you think that all this is useful, take a look at
http://www.xldynamic.com/source/xld.SUMPRODUCT.html

--
HTH

Bob Phillips



That should keep you busy for a while Jason.

Ken Johnson

Ken,

Many thanks for that - I'm gonna sit down with a hard copy today sometime
and get my head around it. When it was first posted I just sat looking at
it, bemused :) It also sheds some light on one of Domenic's previous posted
formulas, which was to count the no. of unique records across two columns:

=SUMPRODUCT(--($A$2:$A$100=D2),--($B$2:$B$100=E2))

Say column A contains car manufacturer, and column B car model. This counts
the number of instances of the combination of manufacturer & model specified
in D2 and E2. (Although the -- are puzzling me:)

Thanks again!

Kind regards

Jay
___
 
K

Ken Johnson

Hi Jason,

I'm slowly making greater use of SUMPRODUCT but I think it'll be a long
time before I become confident about using it. I see it used to solve
so many problems I get the feeling it's important to keep on trying.
The double unary or -- turns Trues to 1s and Falses to 0s. Without them
the formula just returns 0.

Ken Johnson
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top