QUARTILE

M

Mycroft Holmes

Hi to all,

does anyone know which algorithm Excel 2000 uses for computing
quartiles?

For example: quartiles of 1;2;3;4 are

1
1.75
2.5
3.25
4

Even if all five numbers are definetly correct, 1.75 and 3.25 are arbitrary.
For example, our c++ program currently gives 1.5 and 3.5 and we are curious
to understand the difference.
We performed some numerical experiment; our first conjecture was that the
formula looks like "bracketing" the result between two cells and then
returning a weighted mean, based on quartile/percentile (i.e. 1/2A+1/2B for
median, 3/4+1/4 for quartile, etc.), but this fails for 6 numbers.

Thanks in advance.
 
A

Anon

Mycroft Holmes said:
Hi to all,

does anyone know which algorithm Excel 2000 uses for computing
quartiles?

For example: quartiles of 1;2;3;4 are

1
1.75
2.5
3.25
4

Even if all five numbers are definetly correct, 1.75 and 3.25 are arbitrary.
For example, our c++ program currently gives 1.5 and 3.5 and we are curious
to understand the difference.
We performed some numerical experiment; our first conjecture was that the
formula looks like "bracketing" the result between two cells and then
returning a weighted mean, based on quartile/percentile (i.e. 1/2A+1/2B for
median, 3/4+1/4 for quartile, etc.), but this fails for 6 numbers.

Thanks in advance.

I don't know what your C++ program is doing, but to me a lower quartile is a
quarter of the way from the smallest value towards the largest. With the
numbers 1, 2, 3 and 4, the range (largest - smallest) is 3. Therefore the
lower quartile is 1 + 3/4, which is 1.75.
 
M

Myrna Larson

The Microsoft Knowledge Base has information on this. JE has pointed you to the specific
article. There's a wealth of information there. Check it out: www.support.microsoft.com and
click on the link to "Search the Knowledge Base".
 
M

Mycroft Holmes

I don't know what your C++ program is doing, but to me a lower quartile is a
quarter of the way from the smallest value towards the largest. With the
numbers 1, 2, 3 and 4, the range (largest - smallest) is 3. Therefore the
lower quartile is 1 + 3/4, which is 1.75.

wrong definition. more or less, the p-percentile is any number which is
greater than p% of the sample points and less than the remaining (1-p)%
the quartile is the 25%-percentile.
 
M

Myrna Larson

How does your comment fit with MS's description of how it calculates this function? For the
example given (data 1,2,3,4), QUARTILE(A1:A4,1) does in fact give 1.75, a number which isn't in
the data set.
 
M

Myrna Larson

Since we're getting picky about definitions, I'll throw in some comments.

I guess you're in good company with your description since it's close to what Galton said (from
the OED):

"1885 F. GALTON in Jrnl. Anthrop. Inst. Feb. 276 The value which 50 per cent. exceeded, and 50
per cent. fell short of, is the Median Value, or the 50th per-centile"

But I have a problem with both of you <g>. If that definition and yours are correct, the
specified percentile value cannot be a value that actually occurs in the dataset. If it does
occur, there will not be p% of the values less than it and (1-p)% of the values greater than it.
If N is the percentile value, that definition says that 100% of the values are either less than
N or greater than N. There are none equal to N.

So what is the 50th percentila value for the 101 integers in the range 1-101? The Excel formula
says its 51. That's the median, but it doesn't meet the requirements that 50.5 values are less
than it and 50.5 are greater than it.

Unless of course you say that half of the value 51 is less than 51 and half of the value 51 is
 
H

Harlan Grove

Myrna Larson said:
But I have a problem with both of you <g>. If that
definition and yours are correct, the specified
percentile value cannot be a value that actually
occurs in the dataset. If it does occur, there will
not be p% of the values less than it and (1-p)% of
the values greater than it. If N is the percentile
value, that definition says that 100% of the values
are either less than N or greater than N. There are
none equal to N.
....

OED is not the best place to go for precise definitions of mathematical
terminology. Median of finite samples is always subject to the potential
problem of clustering at the value closest to the 50% percentile. For
continuous distributions, it's not a problem. There's a large body of work
dealing with how to handle finite samples, none of which is likely to be
found in the OED. Try the Encyclopaedia Britannica.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top