
Wednesday 30 April 2014

Pythagorean Statistics

Suppose that $X_1$ and $X_2$ are independent random variables such that $X_1+X_2$ is defined.  Let $\sigma_1$, $\sigma_2$, and $\sigma$ be the standard deviations of $X_1$, $X_2$, and $X_1+X_2$ respectively.  Then $\sigma^2=\sigma_1^2+\sigma_2^2$ (the numbers $\sigma_1^2$, $\sigma_2^2$, and $\sigma^2$ are called the variances of $X_1$, $X_2$, and $X_1+X_2$ respectively).  This looks very much like the Pythagorean Theorem (PT).  In fact, at least one person calls it the Pythagorean Theorem of Statistics (PToS).
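The identity is easy to check numerically.  The following is only a rough sketch, not a proof: the two distributions (a normal with $\sigma_1=3$ and a uniform on $[0,12]$, whose variance is $12$), the sample size, the seed, and the function name `simulate_variances` are all my own choices for illustration.

```python
import random
import statistics


def simulate_variances(n=50_000, seed=0):
    """Draw independent samples of X1 and X2 and return the sample
    variances of X1, X2, and X1 + X2 (in that order)."""
    rng = random.Random(seed)
    # X1: normal with standard deviation 3, so its variance is 9.
    x1 = [rng.gauss(0.0, 3.0) for _ in range(n)]
    # X2: uniform on [0, 12], so its variance is 12**2 / 12 = 12.
    x2 = [rng.uniform(0.0, 12.0) for _ in range(n)]
    # The sum is formed pairwise from independent draws.
    total = [a + b for a, b in zip(x1, x2)]
    return (
        statistics.pvariance(x1),
        statistics.pvariance(x2),
        statistics.pvariance(total),
    )


v1, v2, v = simulate_variances()
print("var(X1)           =", v1)
print("var(X2)           =", v2)
print("var(X1) + var(X2) =", v1 + v2)
print("var(X1 + X2)      =", v)
```

The last two printed values should agree up to sampling noise.  They will not match exactly, because the sample covariance of two independent samples is small but not exactly zero.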

In the linked article, the author gives a proof of the PToS, but the proof doesn't look much like a proof of the PT.  Nevertheless, the similarity is hard to ignore.  So I wonder: is the PToS just the PT dressed up in statistical clothing, or is the similarity merely a coincidence?

I suspect it's the former, but I don't quite see the connection yet.  The PT is about right triangles in a plane, and I don't see what that plane would be for the PToS, nor what the triangle is, nor why the third side of that triangle should be related to $X_1+X_2$.  The other author doesn't seem to be aware of a connection either, since none of the reasons he gives for calling it the PToS are "because it is the Pythagorean Theorem."

Update:

My initial instinct was to represent $X_1$ and $X_2$ with a pair of orthogonal axes using $(\mu_1,\mu_2)$ as the origin, where $\mu_1$ and $\mu_2$ are the means of $X_1$ and $X_2$ respectively.  If we let $\vec{x}_1=(\sigma_1,0)$ and $\vec{x}_2=(0,\sigma_2)$, then we could represent $X_1+X_2$ with the line through $(\mu_1,\mu_2)$ with the direction vector $\vec{x}_1+\vec{x}_2=(\sigma_1,\sigma_2)$.  By the PT, the length of $\vec{x}_1+\vec{x}_2$ is $\sqrt{\sigma_1^2+\sigma_2^2}$.  If we call that length $\sigma$, then $\sigma^2=\sigma_1^2+\sigma_2^2$.  So we get the Pythagorean identity.
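To make the picture concrete, here is a small sketch of the construction with made-up values $\sigma_1=3$ and $\sigma_2=4$.  The function name `sum_vector_length` is hypothetical, and the code only does the plane geometry; it says nothing about random variables.

```python
import math


def sum_vector_length(sigma1, sigma2):
    """Length of x1 + x2, where x1 = (sigma1, 0) and x2 = (0, sigma2)
    lie along orthogonal axes."""
    x1 = (sigma1, 0.0)
    x2 = (0.0, sigma2)
    # Componentwise sum of the two vectors.
    s = (x1[0] + x2[0], x1[1] + x2[1])
    # Euclidean length of the sum, i.e. the hypotenuse.
    return math.hypot(s[0], s[1])


length = sum_vector_length(3.0, 4.0)
print(length)       # 5.0
print(length ** 2)  # 25.0, which equals 3**2 + 4**2
```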

This isn't a geometric proof of the Pythagorean Theorem of Statistics, though.  At best, it is an illustration of the Pythagorean Theorem of Statistics by the Pythagorean Theorem from geometry.  It is perhaps natural to represent $X_1$ and $X_2$ by orthogonal axes.  But representing $X_1+X_2$ by the line in the direction of $\vec{x}_1+\vec{x}_2$ was a choice forced on me to make the geometry work, and it's not as natural.  The more natural thing to do would be to represent $X_1+X_2$ by a third axis orthogonal to both the $X_1$ and $X_2$ axes.  Also, I do not see how the statistical theorem would follow from the geometrical illustration.
