Basic Statistics

A

arceaf

Hi,

I am using statistics to develop a finacial model on excel.

I have created a table that lists returns of a security over a give
period of time. I have used formulas to calculate the mean and standar
dev of the returns.

Right now, I am using the norm.inv function =norm.inv(.05, Standard Dev
Mean) to come up with the minimum return @ a 95% confidence.

Now, I want to come up with the MAX return at 95% confidence, but don'
know which formula I should use:

Should I use:

NORM.INV(0.95,Mean,Standard dev)
or
NORM.DIST(0.05,Mean,Standard dev,TRUE)

I really appreciate the help
 
J

joeu2004

arceaf said:
I have created a table that lists returns of a security
over a given period of time. I have used formulas to
calculate the mean and standard dev of the returns.

Right now, I am using the norm.inv function
=norm.inv(.05, Standard Dev,Mean)
to come up with the minimum return @ a 95% confidence.

That is incorrect. Ignoring the typo, that does not determine a limit with
"95% confidence".


arceaf said:
Now, I want to come up with the MAX return at 95% confidence,
but don't know which formula I should use:
NORM.INV(0.95,Mean,Standard dev)
or
NORM.DIST(0.05,Mean,Standard dev,TRUE)

NORMDIST returns a probability, not a value in the distribution.

__If__ the security returns [sic] are normally distributed (TBD), 95% of the
security returns are between:

minRtn: =NORMINV(2.5%,mean,sd)
maxRtn: =NORMINV(97.5%,mean,sd)

However, that does __not__ mean we are 95% confident of those limits.

Usually, the confidence interval is around the mean.

First, we must calculate the std error of the mean:

se: =sd/SQRT(n)

That assumes sd is the __sampling__ std dev, using STDEV or STDEV.S.

Then the 95% confidence interval around the mean is:

minMean: =mean-NORMSINV(97.5%)*se
maxMean: =mean+NORMSINV(97.5%)*se

That assumes n is "large".

If n is "small", we should use the Student's t-distribution. Then the 95%
confidence interval around the mean is:

minMean: =mean-TINV(5%,n-1)*se
maxMean: =mean+TINV(5%,n-1)*se

Note: There is no agreement about what "large" and "small" are. But for
n>=475, both NORMSINV(97.5%) and TINV(5%,n-1) are 1.96 when rounded to 2
decimal places, the common estimate of the multiplier for the 95% confidence
interval. On the other hand, there is no harm in always using TINV.

We might substitute minMean and maxMean into the minRtn and maxRtn formulas
above to determine the 95% confidence interval __for_each__.

But that is __not__ to say that we are 95% confident that 95% of the
security returns are between the smallest minRtn and the largest maxRtn, for
example.

To be honest, I am not sure how to specify a single range for 95% of the
security returns with 95% confidence. That is beyond (my) "basic
statistics". :)

-----

Nevertheless, the minRtn and maxRtn formulas are predicated on the
__assumption__ that the security returns [sic] are normally distributed.

(That assumption is not required for the minMean and maxMean formulas, due
to the Central Limit Theorem.)

There are a variety of tests for normality. Not everyone agrees on which is
best, AFAIK.

I would simply create a histogram of the sample data and compare it to a
normal distribution the data to see if they are "close". Arguably, that is
subjective and error-prone.

However, "conventional wisdom" is that __log__ returns are normally
distributed, not __simple__ returns. I don't know which you mean by
"security returns".

And usually, that expectation (normal distribution) applies to an asset
__class__, not necessarily an individual security.

If the security returns are not normally distributed, I would use the
following formulas if we have the original sample data:

minRtn: =SMALL(A1:An,n*2.5%)
maxRtn: =SMALL(A1:An,n*97.5%)

where A1:An represents the range of n data starting in A1.

Does that help? Or TMI?
 
A

arceaf

'joeu2004[_2_ said:
;1613576']"arceaf said:
I have created a table that lists returns of a security
over a given period of time. I have used formulas to
calculate the mean and standard dev of the returns.

Right now, I am using the norm.inv function
=norm.inv(.05, Standard Dev,Mean)
to come up with the minimum return @ a 95% confidence.-

That is incorrect. Ignoring the typo, that does not determine a limi
with
"95% confidence".


arceaf said:
Now, I want to come up with the MAX return at 95% confidence,
but don't know which formula I should use:
NORM.INV(0.95,Mean,Standard dev)
or
NORM.DIST(0.05,Mean,Standard dev,TRUE)-

NORMDIST returns a probability, not a value in the distribution.

__If__ the security returns [sic] are normally distributed (TBD), 95% o
the
security returns are between:

minRtn: =NORMINV(2.5%,mean,sd)
maxRtn: =NORMINV(97.5%,mean,sd)

However, that does __not__ mean we are 95% confident of those limits.

Usually, the confidence interval is around the mean.

First, we must calculate the std error of the mean:

se: =sd/SQRT(n)

That assumes sd is the __sampling__ std dev, using STDEV or STDEV.S.

Then the 95% confidence interval around the mean is:

minMean: =mean-NORMSINV(97.5%)*se
maxMean: =mean+NORMSINV(97.5%)*se

That assumes n is "large".

If n is "small", we should use the Student's t-distribution. Then th
95%
confidence interval around the mean is:

minMean: =mean-TINV(5%,n-1)*se
maxMean: =mean+TINV(5%,n-1)*se

Note: There is no agreement about what "large" and "small" are. Bu
for
n>=475, both NORMSINV(97.5%) and TINV(5%,n-1) are 1.96 when rounded to

decimal places, the common estimate of the multiplier for the 95
confidence
interval. On the other hand, there is no harm in always using TINV.

We might substitute minMean and maxMean into the minRtn and maxRt
formulas
above to determine the 95% confidence interval __for_each__.

But that is __not__ to say that we are 95% confident that 95% of the
security returns are between the smallest minRtn and the largest maxRtn
for
example.

To be honest, I am not sure how to specify a single range for 95% of th

security returns with 95% confidence. That is beyond (my) "basic
statistics". :)

-----

Nevertheless, the minRtn and maxRtn formulas are predicated on the
__assumption__ that the security returns [sic] are normall
distributed.

(That assumption is not required for the minMean and maxMean formulas
due
to the Central Limit Theorem.)

There are a variety of tests for normality. Not everyone agrees o
which is
best, AFAIK.

I would simply create a histogram of the sample data and compare it to

normal distribution the data to see if they are "close". Arguably, tha
is
subjective and error-prone.

However, "conventional wisdom" is that __log__ returns are normally
distributed, not __simple__ returns. I don't know which you mean by
"security returns".

And usually, that expectation (normal distribution) applies to an asse

__class__, not necessarily an individual security.

If the security returns are not normally distributed, I would use the
following formulas if we have the original sample data:

minRtn: =SMALL(A1:An,n*2.5%)
maxRtn: =SMALL(A1:An,n*97.5%)

where A1:An represents the range of n data starting in A1.

Does that help? Or TMI?



This has been extremely helpful. I am assuming that no, the returns o
the security are not normally distributed, therefore I will use thes
Formulas:

minRtn: =SMALL(A1:An,n*2.5%)
maxRtn: =SMALL(A1:An,n*97.5%)

When I mean "Security Returns" I mean the gains or loss experience b
the security from one day to the next. For example, if the security i
at 5.25 on Day 1, and then rises to 5.50 on Day 2, then the stock
"returned" 4.76%.

What I am attempting to build is a model that will provide me with the
"best case" and "Worse Case" scenario based on the returns on the
security. I have the daily returns of the security for the past three
years.

From what I have understood, using your formula, I can now determine the
minimum return and maximum return with 95% probability.

Am I correct in saying this?

Much appreciate your help
 
J

joeu2004

arceaf said:
When I mean "Security Returns" I mean the gains or loss experience by
the security from one day to the next. For example, if the security is
at 5.25 on Day 1, and then rises to 5.50 on Day 2, then the stock
"returned" 4.76%.

That is called the "simple return".


arceaf said:
What I am attempting to build is a model that will provide me with the
"best case" and "Worse Case" scenario based on the returns on the
security. I have the daily returns of the security for the past three
years.

And those are the typical terms and calculations that online financial
calculators determine, for example the retirement income calculators at
Schwab and Fidelity.


arceaf said:
From what I have understood, using your formula, I can now
determine the minimum return and maximum return with 95%
probability.

Am I correct in saying this?

Close! And probably good enough.

I would say: there is a 95% probability that returns are between the min
and max.

"Based on historical returns. But past performance does not necessarily
predict the future." ;-)

Technically, we should say "between the min and max __centered__ around the
mean".

However, although that it is our intent, the mean of the calculated 95%
range might be different. (Usually not by much, though.)

So we should recalculate the mean using the following array-entered formula
(press ctrl+shift+Enter instead of just Enter):

=AVERAGE(IF(A1:A1095>=B1,IF(A1:A1095<=B2,A1:A1095)))

where B1 is your calculated min, and B2 is your calculated max.

-----

Second thoughts....

1. I don't know why I wrote SMALL(A1:An,n*2.5%) and SMALL(A1:An,n*97.5%).

The more "obvious" expressions are PERCENTILE(A1:An,2.5%) and
PERCENTILE(A1:An,97.5%).

There is a subtle difference that might favor the use of SMALL instead
PERCENTILE. SMALL will always returns a value in the range A1:An.
PERCENTILE might interpolate.


2. I probably should write SMALL(A1:An,ROUNDUP(n*2.5%)) and
SMALL(A1:An,ROUNDDOWN(n*97.5%)), to be conservative.
 
J

joeu2004

Errata said:
2. I probably should write SMALL(A1:An,ROUNDUP(n*2.5%))
and SMALL(A1:An,ROUNDDOWN(n*97.5%)), to be conservative.

Let's try that again:

2. I probably should write SMALL(A1:An,ROUNDUP(n*2.5%,0))
and SMALL(A1:An,ROUNDDOWN(n*97.5%,0)), to be conservative.

Flipping between Excel and VBA is very confusing.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top