Mode or frequency

L

Laffin

I have a large data set. Within the column I have determined the mode. How
do I find the second most common number, third most common, etc.?

240
240
240
240
240
240
240
240
288
288
288
300
 
D

Domenic

Assuming that A2:A10 contains your data, here are two options...

[Option 1]

B1:

=MODE(A2:A10)

B2, copied down:

=MODE(IF(COUNTIF(B$2:B2,A$2:A$10)=0,A$2:A$10))

....confirmed with CONTROL+SHIFT+ENTER, not just ENTER.

[Option 2]

This option will generate a Top N list (Top 1, Top 3, Top 5, etc.),
where you choose N, and which will take into consideration ties for Nth
place...

B2, copied down:

=IF(ISNA(MATCH(A2,A$1:A1,0)),COUNTIF(A2:A$10,A2),"")

C2, copied down:

=IF(N(B2),RANK(B2,B$2:B$10)+COUNTIF(B$2:B2,B2)-1,"")

D1: enter 3, indicating you want a Top 3 list

*Enter the Top N list of interest

E1:

=MAX(IF(B2:B10=INDEX(B2:B10,MATCH(D1,C2:C10,0)),C2:C10))-D1

....confirmed with CONTROL+SHIFT+ENTER, not just ENTER.

F2, copied down:

=IF(ROWS(F$2:F2)<=$D$1+$E$1,INDEX(A$2:A$10,MATCH(ROWS(F$2:F2),$C$2:$C$10,
0)),"")

Hope this helps!
 
B

Bernie Deitrick

If your numbers are in column A, with a heading in cell A1, then enter this formula in Cell B2, and
copy down to match column A:

=IF(COUNTIF($A$1:A2,A2)=1,COUNTIF(A:A,A2)+ROW()/100000,"")

Then in C2, enter this formula, and copy down for as many rows as modes you want:

=INDEX(A:A,MATCH(LARGE(B:B,ROW()-1),B:B,FALSE))


HTH,
Bernie
MS Excel MVP
 
R

Ron Rosenfeld

I have a large data set. Within the column I have determined the mode. How
do I find the second most common number, third most common, etc.?

240
240
240
240
240
240
240
240
288
288
288
300

I'm sure someone will come up with a native worksheet function method.
However, I find the functions in Longre's free morefunc.xll add-in (available
from http://xcell05.free.fr/ quite useful, and this is no exception.

The following **array** formula, which makes use of the morefunc UNIQUEVALUES
function, should do what you require.

To enter an **array** formula, hold down <ctrl><shift> while hitting <enter>.
XL will place braces {...} around the formula:

=INDEX(rng,MATCH(LARGE(UNIQUEVALUES(
COUNTIF(rng,rng)),H2),COUNTIF(rng,rng),0))

H2 in the above contains a number which represents the frequency:

1: most common
2: 2nd most common
etc.


--ron
 
H

Harlan Grove

Ron Rosenfeld said:
=INDEX(rng,MATCH(LARGE(UNIQUEVALUES(COUNTIF(rng,rng)),
H2),COUNTIF(rng,rng),0))
....

What about multiple modes? If A1:A6 contained {1;2;1;3;2;3}, MODE(A1:A6)
would return 1, the first mode, but 2 and 3 would also be modes. Your
UNIQUEVALUES call would return {2;"";"";"";"";"";""}.
 
R

Ron Rosenfeld

...

What about multiple modes? If A1:A6 contained {1;2;1;3;2;3}, MODE(A1:A6)
would return 1, the first mode, but 2 and 3 would also be modes. Your
UNIQUEVALUES call would return {2;"";"";"";"";"";""}.


Good point. It would skip some numbers.


--ron
 
R

Ron Rosenfeld

I'm sure someone will come up with a native worksheet function method.
However, I find the functions in Longre's free morefunc.xll add-in (available
from http://xcell05.free.fr/ quite useful, and this is no exception.

The following **array** formula, which makes use of the morefunc UNIQUEVALUES
function, should do what you require.

To enter an **array** formula, hold down <ctrl><shift> while hitting <enter>.
XL will place braces {...} around the formula:

=INDEX(rng,MATCH(LARGE(UNIQUEVALUES(
COUNTIF(rng,rng)),H2),COUNTIF(rng,rng),0))

H2 in the above contains a number which represents the frequency:

1: most common
2: 2nd most common
etc.


--ron

Harlan pointed the error of my method in that it does not account for multiple
entries with the same frequency.

The following, somewhat cumbersome, single cell solution might work.
Unfortunately, I have not figured out how to just enter a single formula and
drag it down as required.

But, again using the same add-in as above (using the VSORT function)

(all of the formulas are **array** entered formulas)

Most frequent:
D1: =VSORT(rng,COUNTIF(rng,rng))

2nd most
D2:
=INDEX(VSORT((rng<>D1)*rng,COUNTIF(rng,rng)),
MATCH(TRUE,0<VSORT((rng<>D1)*rng,COUNTIF(rng,rng)),0))

3rd most
D3:
=INDEX(VSORT((rng<>D1)*(rng<>D2)*rng,
COUNTIF(rng,rng)),MATCH(TRUE,0<VSORT((
rng<>D1)*(rng<>D2)*rng,COUNTIF(rng,rng)),0))

4th most
D4:
=INDEX(VSORT((rng<>D1)*(rng<>D2)*(rng<>D3)*rng,
COUNTIF(rng,rng)),MATCH(TRUE,0<VSORT((
rng<>D1)*(rng<>D2)*(rng<>D3)*rng,COUNTIF(rng,rng)),0))

Note the changes in this segment which occurs twice in each formula:

(rng<>D1)*rng

(rng<>D1)*(rng<>D2)*rng

(rng<>D1)*(rng<>D2)*(rng<>D3)*rng

It seems there should be some method of automating this change, but I have not
stumbled upon it as yet.



--ron
 
H

Harlan Grove

Ron Rosenfeld wrote...
....
The following, somewhat cumbersome, single cell solution might work.
Unfortunately, I have not figured out how to just enter a single formula and
drag it down as required.
....

A single formula would be tricky if the topmost formula (the first
mode) could be in row 1. The simplest way to deal with that may be
found in Dominic's response.

As for a single cell formula that returned the n_th most frequently
occurring number in a multiple row, single column range, rng, which
contained numbers in every cell, try the array formula

=IF(n<=SUMPRODUCT(1/COUNTIF(rng,rng)),INDEX(rng,
MATCH(LARGE(FREQUENCY(rng,rng)-ROW($A$1:INDEX($A:$A,ROWS(rng)+1))/100000,n),
FREQUENCY(rng,rng)-ROW($A$1:INDEX($A:$A,ROWS(rng)+1))/100000,0)),"")
 
R

Ron Rosenfeld

Ron Rosenfeld wrote...
...
...

A single formula would be tricky if the topmost formula (the first
mode) could be in row 1. The simplest way to deal with that may be
found in Dominic's response.

As for a single cell formula that returned the n_th most frequently
occurring number in a multiple row, single column range, rng, which
contained numbers in every cell, try the array formula

=IF(n<=SUMPRODUCT(1/COUNTIF(rng,rng)),INDEX(rng,
MATCH(LARGE(FREQUENCY(rng,rng)-ROW($A$1:INDEX($A:$A,ROWS(rng)+1))/100000,n),
FREQUENCY(rng,rng)-ROW($A$1:INDEX($A:$A,ROWS(rng)+1))/100000,0)),"")

That's nice; it works; and now I have to figure out how!

Thanks.


--ron
 
R

Rothman

Option 1 causes a circular reference error (putting references to B2 in cell
B2); I think you meant B1 instead of B2 in the formula.

Then again, I might be wrong altogether.

Domenic said:
Assuming that A2:A10 contains your data, here are two options...

[Option 1]

B1:

=MODE(A2:A10)

B2, copied down:

=MODE(IF(COUNTIF(B$2:B2,A$2:A$10)=0,A$2:A$10))

....confirmed with CONTROL+SHIFT+ENTER, not just ENTER.

[Option 2]

This option will generate a Top N list (Top 1, Top 3, Top 5, etc.),
where you choose N, and which will take into consideration ties for Nth
place...

B2, copied down:

=IF(ISNA(MATCH(A2,A$1:A1,0)),COUNTIF(A2:A$10,A2),"")

C2, copied down:

=IF(N(B2),RANK(B2,B$2:B$10)+COUNTIF(B$2:B2,B2)-1,"")

D1: enter 3, indicating you want a Top 3 list

*Enter the Top N list of interest

E1:

=MAX(IF(B2:B10=INDEX(B2:B10,MATCH(D1,C2:C10,0)),C2:C10))-D1

....confirmed with CONTROL+SHIFT+ENTER, not just ENTER.

F2, copied down:

=IF(ROWS(F$2:F2)<=$D$1+$E$1,INDEX(A$2:A$10,MATCH(ROWS(F$2:F2),$C$2:$C$10,
0)),"")

Hope this helps!

Laffin said:
I have a large data set. Within the column I have determined the mode. How
do I find the second most common number, third most common, etc.?

240
240
240
240
240
240
240
240
288
288
288
300
 
D

Domenic

Thanks for catching that error! The first formula should be entered in
B2, and the second one in B3.

Thanks again! Much appreciated!

Rothman said:
Option 1 causes a circular reference error (putting references to B2 in cell
B2); I think you meant B1 instead of B2 in the formula.

Then again, I might be wrong altogether.

Domenic said:
Assuming that A2:A10 contains your data, here are two options...

[Option 1]

B1:

=MODE(A2:A10)

B2, copied down:

=MODE(IF(COUNTIF(B$2:B2,A$2:A$10)=0,A$2:A$10))

....confirmed with CONTROL+SHIFT+ENTER, not just ENTER.

[Option 2]

This option will generate a Top N list (Top 1, Top 3, Top 5, etc.),
where you choose N, and which will take into consideration ties for Nth
place...

B2, copied down:

=IF(ISNA(MATCH(A2,A$1:A1,0)),COUNTIF(A2:A$10,A2),"")

C2, copied down:

=IF(N(B2),RANK(B2,B$2:B$10)+COUNTIF(B$2:B2,B2)-1,"")

D1: enter 3, indicating you want a Top 3 list

*Enter the Top N list of interest

E1:

=MAX(IF(B2:B10=INDEX(B2:B10,MATCH(D1,C2:C10,0)),C2:C10))-D1

....confirmed with CONTROL+SHIFT+ENTER, not just ENTER.

F2, copied down:

=IF(ROWS(F$2:F2)<=$D$1+$E$1,INDEX(A$2:A$10,MATCH(ROWS(F$2:F2),$C$2:$C$10,
0)),"")

Hope this helps!

Laffin said:
I have a large data set. Within the column I have determined the mode.
How
do I find the second most common number, third most common, etc.?

240
240
240
240
240
240
240
240
288
288
288
300
 
Top