Calculation of slope when the trendline has a forced intercept

M

mpanic

Version: 2004

When you set the intercept for a linear trendline, what equation does excel use to find the slope?

It doesn't use the normal statistical equations to find the slope because that gives a different value for the intercept.

I'm attempting to recreate the values for the slope in a second program to verify the accuracy, but without knowing the equation Excel used I can't tell if it is a reasonable assumption.
 
M

Mike Middleton

mpanic -

Could you post the XY data and steps to replicate the problem?

How are you determining "a different value for the intercept"?

What is the "second program"?

- Mike
 
M

mpanic

When I graph the following points under a normal linear fit Excel gives:
y = 0.2217x + 13.017

However the theory behind the data requires the data to pass through (0,0). So when I force the intercept at 0, Excel gives:
y = 0.4464x

These slopes are drastically different and therefore I need to figure out how excel calculates this slope. Once I know the mathematical equation (and confirm its statistical accuracy) I'm going to plug it into IDL to make sure it works out, then use the equation to examine other similar data sets.

Here is the smallest data set:

x y
14.471828 1.8674022
24.912722 1.8221583
36.662433 0.99187654
37.439198 10.775904
50.27423 30.192816
9.8574726 8.4008312
27.238668 14.85525
35.909738 15.032314
35.341766 10.933105
35.684381 16.322607
39.49668 8.519908
44.334297 14.73851
20.587585 13.369005
40.197601 11.596133
34.92353 2.6415801
10.361215 11.489348
19.495221 11.467255
32.292305 9.8245344
6.1976713 11.445314
18.777945 9.5765905
22.909437 8.9665155
22.190435 8.7356281
4.0026048 7.2871704
69.05893 34.650879
23.109815 19.384487
31.67575 11.510677
60.353394 43.695309
12.814069 12.570518
61.286047 4.4271402
28.447608 15.381767
56.477149 9.8492346
7.4105175 10.165609
16.167889 7.570941
22.11201 9.18962
15.871093 8.4511671
25.685305 7.2254281
31.207207 7.9234939
16.370049 3.6062489
10.236147 7.5680122
51.070948 8.0655022
87.69665 32.282848
122.22652 14.464322
4.5701745 11.791895
8.3624912 11.831852
40.943345 8.7775545
47.153669 16.767218
18.554417 26.803871
17.484731 24.875601
3.7361318 10.343881
3.0756457 8.4870005
36.947764 6.4626908
34.618637 20.80641
76.952972 21.611946
91.945002 83.877892
73.231651 9.0157328
61.886657 23.488962
14.224904 9.3669176
29.340148 6.0199032
8.0183386 6.7389908
10.661287 7.7080841
11.481818 15.605347
38.06275 10.299226
17.160036 15.948554
7.9977566 8.6500998
55.456081 40.757271
73.894692 47.819904
0.9792885 12.294238
2.026356 8.4945488
8.1572825 15.634866
7.4266597 8.6752195
10.423531 7.0669436
35.934836 17.044312
55.699679 15.087641
4.0438051 14.873123
24.778494 21.255373
49.066754 23.243662
126.82929 63.767708
42.249973 11.496975
50.304019 14.416771
192.16706 38.684147
0.15649499 12.247282
44.74404 27.93058
49.239932 14.748042
19.807283 17.340172
14.957553 15.902395
67.34096 35.329285
21.244788 10.720654
81.786983 44.043346
64.588024 26.836967
105.91237 19.617113
33.160804 90.805107
40.084872 35.149796
93.616 27.232908
86.794276 19.95326
26.643102 27.804993
43.152313 66.484367
14.623029 17.988043
20.996517 72.085922
9.7927663 7.6655998
44.602463 9.2815609
48.853613 47.321587
21.570549 35.422676
66.454903 15.334218
5.31036 13.780963
8.6740643 10.198942
19.259322 16.405973
25.626591 14.931253
36.334121 32.377167
42.932285 16.366325
61.594055 58.486282
26.3384 5.4955268
31.166072 21.033941
26.484818 13.190408
26.727565 127.17951
45.795584 40.335423
8.8996297 32.565891
34.030142 11.759562
13.422069 17.265682
12.86183 14.609124
61.309485 19.357042
56.482075 11.943833
40.074146 19.505463
58.001414 31.683081
35.914144 64.344124
56.166478 16.06694
46.935538 91.614357
44.614508 26.575903
21.378473 17.969112
35.655528 35.261982
50.145677 25.486937
48.595161 10.610749
19.331018 20.836975
61.588573 9.4353056
4.7768703 28.920082
23.764304 15.405479
36.219456 42.699306
27.412884 17.973738
6.6937244 9.8768263
 
M

mpanic

It didn't post very well so I can put the x's and y's in separate posts. Here are the X's:

14.471828
24.912722
36.662433
37.439198
50.27423
9.8574726
27.238668
35.909738
35.341766
35.684381
39.49668
44.334297
20.587585
40.197601
34.92353
10.361215
19.495221
32.292305
6.1976713
18.777945
22.909437
22.190435
4.0026048
69.05893
23.109815
31.67575
60.353394
12.814069
61.286047
28.447608
56.477149
7.4105175
16.167889
22.11201
15.871093
25.685305
31.207207
16.370049
10.236147
51.070948
87.69665
122.22652
4.5701745
8.3624912
40.943345
47.153669
18.554417
17.484731
3.7361318
3.0756457
36.947764
34.618637
76.952972
91.945002
73.231651
61.886657
14.224904
29.340148
8.0183386
10.661287
11.481818
38.06275
17.160036
7.9977566
55.456081
73.894692
0.9792885
2.026356
8.1572825
7.4266597
10.423531
35.934836
55.699679
4.0438051
24.778494
49.066754
126.82929
42.249973
50.304019
192.16706
0.15649499
44.74404
49.239932
19.807283
14.957553
67.34096
21.244788
81.786983
64.588024
105.91237
33.160804
40.084872
93.616
86.794276
26.643102
43.152313
14.623029
20.996517
9.7927663
44.602463
48.853613
21.570549
66.454903
5.31036
8.6740643
19.259322
25.626591
36.334121
42.932285
61.594055
26.3384
31.166072
26.484818
26.727565
45.795584
8.8996297
34.030142
13.422069
12.86183
61.309485
56.482075
40.074146
58.001414
35.914144
56.166478
46.935538
44.614508
21.378473
35.655528
50.145677
48.595161
19.331018
61.588573
4.7768703
23.764304
36.219456
27.412884
6.6937244
 
M

mpanic

and the y's:

1.8674022
1.8221583
0.99187654
10.775904
30.192816
8.4008312
14.85525
15.032314
10.933105
16.322607
8.519908
14.73851
13.369005
11.596133
2.6415801
11.489348
11.467255
9.8245344
11.445314
9.5765905
8.9665155
8.7356281
7.2871704
34.650879
19.384487
11.510677
43.695309
12.570518
4.4271402
15.381767
9.8492346
10.165609
7.570941
9.18962
8.4511671
7.2254281
7.9234939
3.6062489
7.5680122
8.0655022
32.282848
14.464322
11.791895
11.831852
8.7775545
16.767218
26.803871
24.875601
10.343881
8.4870005
6.4626908
20.80641
21.611946
83.877892
9.0157328
23.488962
9.3669176
6.0199032
6.7389908
7.7080841
15.605347
10.299226
15.948554
8.6500998
40.757271
47.819904
12.294238
8.4945488
15.634866
8.6752195
7.0669436
17.044312
15.087641
14.873123
21.255373
23.243662
63.767708
11.496975
14.416771
38.684147
12.247282
27.93058
14.748042
17.340172
15.902395
35.329285
10.720654
44.043346
26.836967
19.617113
90.805107
35.149796
27.232908
19.95326
27.804993
66.484367
17.988043
72.085922
7.6655998
9.2815609
47.321587
35.422676
15.334218
13.780963
10.198942
16.405973
14.931253
32.377167
16.366325
58.486282
5.4955268
21.033941
13.190408
127.17951
40.335423
32.565891
11.759562
17.265682
14.609124
19.357042
11.943833
19.505463
31.683081
64.344124
16.06694
91.614357
26.575903
17.969112
35.261982
25.486937
10.610749
20.836975
9.4353056
28.920082
15.405479
42.699306
17.973738
9.8768263
 
C

Carl Witthoft

Well, calculating the least-squares fit to y = mx as opposed to
y = b+ mx
is not that hard.
Here's an example from a random paper I grabbed via Google:

(9.4) [Fundamental assumptions] Consider the two-variable model without
an intercept, that is y = bx + e . Notice that there is only one b
parameter for this model‹the slope. There is no intercept because the
true population regression function is here assumed to pass through the
origin. It can be shown that the least-squares estimator of b is
given by ( sum(x(i)*y(i))/sum(x(i)^2) . where the sums run from i=1
to n. Assume that E(ei | xi) = 0 . This means that you can treat the
xi as fixed numbers, not random variables, and that E(ei) = 0.

see the original at
http://www.drake.edu/cbpa/econ/boal/107/ps/ps09.doc
 
J

JE McGimpsey

Version: 2004

When you set the intercept for a linear trendline, what equation does excel
use to find the slope?

Take a look at LINEST() in Help for an explanation of XL's calculations.
 
M

Mike Middleton

mpanic -
When I graph the following points under a normal linear fit Excel gives:
y = 0.2217x + 13.017 <

Yes, but the plot shows a nearly random pattern, and R-square (proportion of
variation in Y explained by X) is only 0.105. So, if you force the line
through (0,0), the explained variation will be even less.
However the theory behind the data requires the data to pass through
(0,0). So when I force the intercept at 0, Excel gives: y = 0.4464x <

Yes, I get the same result (0.446366056949337 using all 15 of Excel's
significant digits) using three methods: Add Trendline, array-entered LINEST
worksheet function, and a formula in a worksheet cell.

With your X data in A2:A139 and Y data in B2:B139, an appropriate formula in
a worksheet cell is
=SUMPRODUCT(A2:A139,B2:B139)/SUMSQ(A2:A139)
which is what I used for this special case (intercept equal to zero).
These slopes are drastically different and therefore I need to figure out
how excel calculates this slope. <

Yes, of course, the slopes are different, as expected.

It is unlikely the Microsoft programmer who wrote the trendline code will
ever appear in this newsgroup, so I don't think we'll ever know what the
trendline algorithm is.
Once I know the mathematical equation (and confirm its statistical
accuracy) I'm going to plug it into IDL to make sure it works out, then
use the equation to examine other similar data sets. <

What do you mean by "confirm its statistical accuracy"?

- Mike Middleton
http://www.DecisionToolworks.com
Decision Analysis Add-ins for Excel
 
M

mpanic

Mike,

Thanks a ton for your help! I'm aware that the fit for the line is very poor, but my goal is to compare slopes of multiple data sets and the accuracy of the line isn't as relevant. It of course still matters, but the slope is the important data that I need.

What I meant by "compare statistical accuracy" is that this is for a scientific paper (the random data points actually correspond to energy of space weather particles versus the perturbation of the Earth's magnetic field) and I just wanted to make sure Excel was actually using some sort of legitimate statistical equation and not something like putting a bunch of points at (0,0) to make it pass through there.

I shall attempt your suggestions when I get back to the lab on Monday, until then I think my question is answered.

-mpanic
 
C

Carl Witthoft

These slopes are drastically different and therefore I need to figure out
how excel calculates this slope. <

Yes, of course, the slopes are different, as expected.

It is unlikely the Microsoft programmer who wrote the trendline code will
ever appear in this newsgroup, so I don't think we'll ever know what the
trendline algorithm is.[/QUOTE]

well, yeah but.... since the answers the Trendline button produces are
the same as LINEST (for poly fits) to machine precision, I think we can
make a pretty good guess :).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top