Summary: I want to confirm the three sum-defined quantities in https://github.com/barrycarter/bcapps/blob/master/STACK/planetest.m (also below) are identically zero for all values of n>=3.
While attempting to solve https://stats.stackexchange.com/questions/196655 (fitting points to a plane), I came up with these (probably either wrong or previously derived by someone else) formulas for a,b,c such that z=a*x+b*y+c is a best fit for n points x[i], y[i], and z[i]:
a =
-((Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[x[i], {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[y[i], {i, 1, n}]^2*Sum[x[i]*z[i], {i, 1, n}] +
n*Sum[y[i]^2, {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] +
Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}] -
n*Sum[x[i]*y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 -
2*Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]))
b =
-((-(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}]) +
Sum[x[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}] +
Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] -
n*Sum[x[i]*y[i], {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] -
Sum[x[i], {i, 1, n}]^2*Sum[y[i]*z[i], {i, 1, n}] +
n*Sum[x[i]^2, {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 -
2*Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]))
c =
(Sum[x[i]*y[i], {i, 1, n}]^2*Sum[z[i], {i, 1, n}] -
Sum[x[i]^2, {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] +
Sum[x[i], {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] +
Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}] -
Sum[x[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 -
2*Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*Sum[y[i]^2, {i, 1, n}])
To confirm these values, I'd compute the sum of the differences squared. Each term would look like this:
diff[i_] = (z[i]-(a*x[i]+b*y[i]+c))^2
Treating the sum as a function of a,b,c, I would take partials with respect to these three variables and set equal to 0.
Since derivatives add, I would be adding the sum of the derivatives of each term:
derva[i_] = -2*x[i]*(-c - a*x[i] - b*y[i] + z[i])
dervb[i_] = -2*y[i]*(-c - a*x[i] - b*y[i] + z[i])
dervc[i_] = -2*(-c - a*x[i] - b*y[i] + z[i])
and setting each sum equal to 0.
Mathematica won't solve that for arbitrary n (which I sort of expected):
Solve[{
Sum[derva[i],{i,1,n}] == 0,
Sum[dervb[i],{i,1,n}] == 0,
Sum[dervc[i],{i,1,n}] == 0
}, {a,b,c}]
Out[74] = {}
and Reduce doesn't help either. Keeping the derivative outside the sum doesn't work either, albeit with a different error message (the standard Solve::nsmet: This system cannot be solved with the methods available to Solve.).
Mathematica will solve for a,b,c for specific values of n, which led me to the guess above.
To test my guess, I need to confirm that each partial derivative is zero. In other words, I need to confirm:
Sum[-2*x[i]*(-c - a*x[i] - b*y[i] + z[i]),{i,1,n}] == 0
Sum[-2*y[i]*(-c - a*x[i] - b*y[i] + z[i]),{i,1,n}] == 0
Sum[-2*(-c - a*x[i] - b*y[i] + z[i]),{i,1,n}] == 0
are identically zero for all values of x[i], y[i], z[i], and n, when I put in my guesses above.
In other words (these are the big ugly equations which I'm intentionally giving in "full" form for those who don't want to read the rest of this question)...:
atest[n_] :=
Sum[-2*x[i]*(-((Sum[x[i]*y[i], {i, 1, n}]^2*Sum[z[i], {i, 1, n}] -
Sum[x[i]^2, {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*
Sum[x[i]*z[i], {i, 1, n}] + Sum[x[i], {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*
Sum[x[i]*z[i], {i, 1, n}] + Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]*
Sum[y[i]*z[i], {i, 1, n}] - Sum[x[i], {i, 1, n}]*
Sum[x[i]*y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 - 2*Sum[x[i], {i, 1, n}]*
Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*
Sum[y[i]^2, {i, 1, n}])) +
((Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[x[i], {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[y[i], {i, 1, n}]^2*Sum[x[i]*z[i], {i, 1, n}] +
n*Sum[y[i]^2, {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] +
Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}] -
n*Sum[x[i]*y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])*x[i])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 -
2*Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*
Sum[y[i]^2, {i, 1, n}]) +
((-(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}]) +
Sum[x[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}] +
Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] -
n*Sum[x[i]*y[i], {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] -
Sum[x[i], {i, 1, n}]^2*Sum[y[i]*z[i], {i, 1, n}] +
n*Sum[x[i]^2, {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])*y[i])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 -
2*Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*
Sum[y[i]^2, {i, 1, n}]) + z[i]), {i, 1, n}];
btest[n_] :=
Sum[-2*y[i]*(-((Sum[x[i]*y[i], {i, 1, n}]^2*Sum[z[i], {i, 1, n}] -
Sum[x[i]^2, {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*
Sum[x[i]*z[i], {i, 1, n}] + Sum[x[i], {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*
Sum[x[i]*z[i], {i, 1, n}] + Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]*
Sum[y[i]*z[i], {i, 1, n}] - Sum[x[i], {i, 1, n}]*
Sum[x[i]*y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 - 2*Sum[x[i], {i, 1, n}]*
Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*
Sum[y[i]^2, {i, 1, n}])) +
((Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[x[i], {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[y[i], {i, 1, n}]^2*Sum[x[i]*z[i], {i, 1, n}] +
n*Sum[y[i]^2, {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] +
Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}] -
n*Sum[x[i]*y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])*x[i])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 -
2*Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*
Sum[y[i]^2, {i, 1, n}]) +
((-(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}]) +
Sum[x[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}] +
Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] -
n*Sum[x[i]*y[i], {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] -
Sum[x[i], {i, 1, n}]^2*Sum[y[i]*z[i], {i, 1, n}] +
n*Sum[x[i]^2, {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])*y[i])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 -
2*Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*
Sum[y[i]^2, {i, 1, n}]) + z[i]), {i, 1, n}];
ctest[n_] :=
Sum[-2*(-((Sum[x[i]*y[i], {i, 1, n}]^2*Sum[z[i], {i, 1, n}] -
Sum[x[i]^2, {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*
Sum[x[i]*z[i], {i, 1, n}] + Sum[x[i], {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*
Sum[x[i]*z[i], {i, 1, n}] + Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]*
Sum[y[i]*z[i], {i, 1, n}] - Sum[x[i], {i, 1, n}]*
Sum[x[i]*y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 - 2*Sum[x[i], {i, 1, n}]*
Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*
Sum[y[i]^2, {i, 1, n}])) +
((Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[x[i], {i, 1, n}]*Sum[y[i]^2, {i, 1, n}]*Sum[z[i], {i, 1, n}] -
Sum[y[i], {i, 1, n}]^2*Sum[x[i]*z[i], {i, 1, n}] +
n*Sum[y[i]^2, {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] +
Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}] -
n*Sum[x[i]*y[i], {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])*x[i])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 -
2*Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*
Sum[y[i]^2, {i, 1, n}]) +
((-(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}]) +
Sum[x[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}]*Sum[z[i], {i, 1, n}] +
Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] -
n*Sum[x[i]*y[i], {i, 1, n}]*Sum[x[i]*z[i], {i, 1, n}] -
Sum[x[i], {i, 1, n}]^2*Sum[y[i]*z[i], {i, 1, n}] +
n*Sum[x[i]^2, {i, 1, n}]*Sum[y[i]*z[i], {i, 1, n}])*y[i])/
(Sum[x[i]^2, {i, 1, n}]*Sum[y[i], {i, 1, n}]^2 -
2*Sum[x[i], {i, 1, n}]*Sum[y[i], {i, 1, n}]*Sum[x[i]*y[i], {i, 1, n}] +
n*Sum[x[i]*y[i], {i, 1, n}]^2 + Sum[x[i], {i, 1, n}]^2*
Sum[y[i]^2, {i, 1, n}] - n*Sum[x[i]^2, {i, 1, n}]*
Sum[y[i]^2, {i, 1, n}]) + z[i]), {i, 1, n}];
I'm claiming atest[n], btest[n], ctest[n] are identially 0 for all values of n>=3.
How far I've gotten on my own:
t = Table[Simplify[{atest[i], btest[i], ctest[i]}], {i,3,6}]
Out[1]= {{0, 0, 0}, {0, 0, 0}, {0, 0, 0}, {0, 0, 0}}
However, Simplify[atest[7]] just hangs. Even if I waited for it, I suspect Simplifying atest, btest, and ctest for larger numbers would take even longer.
I realize I'm asking Mathematica to do a lot, especially since I'm using symbols instead of actual numbers, but is there a good way to verify this identity for all n>=3 (or prove its false, of course).
For anyone interested, https://github.com/barrycarter/bcapps/blob/master/STACK/bc-solve-stats-196655.m has some notes re how I 'derived' this.
Answer
Here is my take at generating the identities you are trying to test; I will confess that I am not completely sure of what testing you need to carry out, so I hope that you will be able to compare the expressions obtained below with those you already have.
To set up the 3D linear regression problem I will use generic 3D points {x[i], y[i], z[i]}, and parameters $(a,b,c)$.
(* Generate the sum of squares to be minimized *)
Sum[Expand[(a x[i] + b y[i] + c - z[i])^2], {i, 1, n}]

(* Calculate the derivatives *)
Grad[%, {a, b, c}] // TableForm

(* Use linearity of sums *)
Distribute /@ % // TableForm

At this point, you will want to pull out the numerical and parameter constants from the summations to simplify the problem. Below is a set of very handy replacement rules that will help you do that, courtesy of Peltio who originally used them with Integrate, but they work with Sum right out of the box after swapping the right function name in:
outrules = {
Sum[f_ + g_, it : {x_Symbol, __}] :> Sum[f, it] + Sum[g, it],
Sum[c_ f_, it : {x_Symbol, __}] :> c Sum[f, it] /; FreeQ[c, x],
Sum[c_, it : {x_Symbol, __}] :> c Sum[1, it] /; FreeQ[c, x]
};
These must be applied repeatedly using ReplaceRepeated (//.) until the result doesn't change anymore:
(* Pull out any constants from summations *)
% //. outrules // TableForm

Now it's just a matter of setting the derivatives to zero to generate a system of three linear equations:
(* Set the derivatives equal to zero to generate the system of equations *)
Simplify[Thread[% == 0]] // TableForm

Now is a good time to simplify the notation a bit by introducing names for the summation results that act as constants in this system:
% //. {
Sum[x[i], {i, 1, n}] -> sumX, Sum[y[i], {i, 1, n}] -> sumY, Sum[z[i], {i, 1, n}] -> sumZ,
Sum[x[i]*y[i], {i, 1, n}] -> sumXY,
Sum[y[i]*z[i], {i, 1, n}] -> sumYZ,
Sum[x[i]*z[i], {i, 1, n}] -> sumXZ,
Sum[x[i]^2, {i, 1, n}] -> sumX2, Sum[y[i]^2, {i, 1, n}] -> sumY2
} // TableForm

Finally we can Solve this linear equation for $(a,b,c)$:
Solve[%, {a, b, c}]

Comments
Post a Comment