Regression Formulas

J

Julie

I am trying to find the actual calculations that Excel
uses for Regression analysis.
 
A

Alex Delamain

It uses a standard linear regression analysis
for y = mx +c

Using the following data
x y
50 1.64
60 1.65
70 1.67
80 1.72
90 1.72
100 1.74
50 1.6
60 1.63
70 1.67
80 1.7
90 1.71
100 1.71

m
{SUMPRODUCT((A1:A12)*(B1:B12))}-(COUNT(B1:B12)*AVERAGE(A1:A12)*AVERAGE(B1:B12))
divided by
{SUM((A1:A12)^2)}-(COUNT(B1:B12)*I19^2)

in this case 0.00226

rearrange y = mx+c so c = y-mx

then c = average(b1:b12)-0.00226*average(a1:a12)

which is why most people just let Excel get on with it
 
J

Jerry W. Lewis

Excel fits equations by least squares. STEYX(), RSQ(), LINEST(,,,TRUE),
and the ATP Regression tool perform standard statistical analyses based
on the least squares fit.

Help for SLOPE(), INTERCEPT(), STEYX(), RSQ(), and LINEST() gives
formulas. The ATP regression tool uses LINEST(). You don't mention
your version of Excel. Excel 2003 uses mathematically equivalent but
numerically superior formulas to earlier versions (documented in Help).
The formulas that Alex gave applay to pre-2003 versions.

If you mean multiple linear regression (instead of simple linear
regression), then LINEST in versions prior to 2003 directly solves the
normal equations
b = MMULT(MINVERSE(XpX),Xpy)
where XpX = MMULT(TRANSPOSE(X),X) and Xpy = MMULT(TRANSPOSE(X),y)
for y the vector of observations, X the X-matrix, and b the vector of
estimates. Excel 2003 uses a more complicated algorithm
http://support.microsoft.com/default.aspx?scid=kb;en-us;828533
That in most cases is far better numerically than older versions, but on
occasion may zero estimates that should not be zeroed
http://groups.google.com/groups?threadm=412980D4.5040305@no_e-mail.com

Jerry
 
A

Alex Delamain

um yeah well uh right. Whatever he says!

I thought my explanation was clear as mud but I have to concede defea
at this point!

From now on I am sticking to + - x and
 
Top