Calc help!

ga14

Member
Nov 17, 2002
42
0
0
Hey Guys,

I need help with the following question:

"Suppose you are given a set of ten data points {(x1, y1), (x2, y2), (x3, y3), ..., (x10, y10)} and you want to find the "best fit" line to the data. In statistics, the best fit line (often called the least squares line) is defined to be the line y=mx+b that minimizes the sum of the squares of the differences between the line and the data points. More precisely, the line of best fit is the line y=mx+b that minimizes the sum from k=1 to n of (y_k-(mx_k+b))^2
where _k denotes "sub k".

In this problem, you can assume n=10. Use partial differentiation to show that the line of best fit has slope _______
and y intercept ________."

Thanks for any help at all!


 

uart

Member
May 26, 2000
174
0
0
WTF? Do your own homework...

Hehe, maybe this should be renamed to the "Highly Technical and Homework" forum. :)


GA14, the problem is that you've pretty much just posted the problem as given and basically asked someone to do the whole thing for you. I'm sure people would be more helpful if you told us how far you had gotten and then asked for some specific help on how to proceed.

I won't do your homework for you, but I'll give you some suggestions for getting started.

1. How about you start with just three data points (the min required for a non-trivial solution). This will allow you to write out all the terms explicitly without any notational complexity.

2. Write out the sum of squared errors for the above.

3. Try to work out which variables you need to take partial derivatives with respect to in order to get something that will lead to a solution for m and b.

If you can do the above you'll be more than halfway there. If not, then post back and tell us where you got "stuck".
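
For anyone following along, here is a rough sketch of what steps 1 to 3 might look like when written out for three points. The name S for the sum of squared errors is just a label chosen here, not something from the original problem.

```latex
% Sum of squared errors for three data points (S is an assumed name)
S(m, b) = (y_1 - m x_1 - b)^2 + (y_2 - m x_2 - b)^2 + (y_3 - m x_3 - b)^2

% Partial derivatives with respect to the two unknowns m and b
\frac{\partial S}{\partial m} = -2\left[ x_1 (y_1 - m x_1 - b) + x_2 (y_2 - m x_2 - b) + x_3 (y_3 - m x_3 - b) \right]

\frac{\partial S}{\partial b} = -2\left[ (y_1 - m x_1 - b) + (y_2 - m x_2 - b) + (y_3 - m x_3 - b) \right]
```

The data values x_k and y_k are fixed numbers here; only m and b vary.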

 

ga14

I just took the partials with respect to b and m for that equation. Isn't that all you have to do?
 

uart

Originally posted by: ga14
I just took the partials with respect to b and m for that equation. Isn't that all you have to do?

Yep, that's right. Now just set the partial derivatives to zero and collect the terms in m and b. Remember that all the sum terms (in xi and yi) are just constants, so lump them together and give them suitable symbols like SSX (sum of squared x), SX (sum of x) and SXY (sum of x times y), for example. Now you'll see you've just got two simultaneous equations in two unknowns - easy, solve them for m and b.
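
As a sketch of where this leads, using the symbols SSX, SX and SXY from above: the extra symbols SY (sum of y) and n (number of points) are assumptions added here to complete the picture, and the solution only makes sense when the x values are not all equal.

```latex
% Setting both partial derivatives to zero gives two linear equations in m and b
m \, SSX + b \, SX = SXY
m \, SX + n \, b = SY

% Solving the pair (valid when n \, SSX - SX^2 \neq 0)
m = \frac{n \, SXY - SX \cdot SY}{n \, SSX - SX^2}
\qquad
b = \frac{SY - m \, SX}{n}
```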

When you extend it to 10 terms it's not really any more difficult. It will still reduce to two simultaneous equations in m and b; only the "constant terms" will be more complicated, though still of the same form.
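
To check a hand-derived answer numerically, something like the following minimal Python sketch could be used. The function name fit_line and the ten sample points are made up for illustration, and SY (sum of y) is an extra symbol not named in the thread.

```python
def fit_line(points):
    """Return (m, b) for the least squares line y = m*x + b.

    Solves the two simultaneous equations obtained by setting the
    partial derivatives of the sum of squared errors to zero.
    """
    n = len(points)
    sx = sum(x for x, _ in points)        # SX: sum of x
    sy = sum(y for _, y in points)        # SY: sum of y (assumed symbol)
    ssx = sum(x * x for x, _ in points)   # SSX: sum of squared x
    sxy = sum(x * y for x, y in points)   # SXY: sum of x times y

    denom = n * ssx - sx * sx
    if denom == 0:
        # Happens when every x is the same, so no unique line exists.
        raise ValueError("all x values are equal; slope is undefined")

    m = (n * sxy - sx * sy) / denom
    b = (sy - m * sx) / n
    return m, b


# Ten hypothetical points lying exactly on y = 2x + 1.
points = [(x, 2 * x + 1) for x in range(10)]
m, b = fit_line(points)
print(m, b)
```

Because the sample points sit exactly on a line, the fit should recover that line's slope and intercept, which makes it a convenient sanity check before trying real data.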