Deriving the Regression Coefficients from the Residual Sum of Squares
When the linear regression equation y=\hat{\alpha} + \hat{\beta} x is fitted to the data, the formulas for its coefficients \hat{\alpha} and \hat{\beta} are the following:
\hat{\beta} = r_{xy}\frac{s_{y}}{s_{x}}
\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}
r_{xy} : the correlation coefficient of x and y
s_{x} : the standard deviation of x, s_{y} : the standard deviation of y
Here we will explain how to derive these equations from the residual sum of squares.
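Before the derivation, a quick numerical sanity check may help. The sketch below (assuming NumPy; the synthetic data and the seed are arbitrary illustrative choices, not from the text) confirms that these formulas reproduce the slope and intercept of a standard least-squares line fit.

```python
import numpy as np

# Synthetic data: y = 2 + 3x + noise (arbitrary illustrative values)
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=100)

# The formulas stated above
r_xy = np.corrcoef(x, y)[0, 1]
beta_hat = r_xy * y.std() / x.std()
alpha_hat = y.mean() - beta_hat * x.mean()

# Reference: NumPy's least-squares line fit (returns [slope, intercept])
beta_ref, alpha_ref = np.polyfit(x, y, 1)
assert np.allclose([alpha_hat, beta_hat], [alpha_ref, beta_ref])
```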
Derivation
We will use the residual sum of squares below.
S(\hat{\alpha}, \hat{\beta}) = \sum^{n}_{i=1}(y_i - \hat{y}_i)^2 = \sum^{n}_{i=1}(y_i - (\hat{\alpha} + \hat{\beta} x_i))^2
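In code, S is simply a function of the two coefficients. The sketch below (a minimal illustration using NumPy and SciPy on the same kind of synthetic data) shows that a generic numerical minimizer of S lands on the coefficients we are about to derive in closed form.

```python
import numpy as np
from scipy.optimize import minimize

def rss(params, x, y):
    """Residual sum of squares S(alpha, beta)."""
    alpha, beta = params
    return np.sum((y - (alpha + beta * x)) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=100)

# A general-purpose minimizer recovers alpha ~ 2, beta ~ 3
result = minimize(rss, x0=[0.0, 0.0], args=(x, y))
print(result.x)
```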
To find the \hat{\alpha} and \hat{\beta} that minimize S(\hat{\alpha}, \hat{\beta}), we take the partial derivatives with respect to \hat{\alpha} and \hat{\beta} and set them to zero.
\frac{\partial S}{\partial \hat{\alpha}} = -2\sum^{n}_{i=1}(y_i - \hat{\alpha} - \hat{\beta} x_i) = 0 \\ \sum^{n}_{i=1}y_i = \sum^{n}_{i=1}\hat{\alpha} + \sum^{n}_{i=1}\hat{\beta}x_i = n\hat{\alpha} + \hat{\beta}\sum^{n}_{i=1}x_i
Dividing both sides by n,
\frac{\sum^{n}_{i=1}y_i}{n} = \hat{\alpha} + \hat{\beta}\frac{\sum^{n}_{i=1}x_i}{n} \\ \bar{y} = \hat{\alpha} + \hat{\beta}\bar{x} \\ \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}
we obtain \hat{\alpha}.
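Note that \frac{\partial S}{\partial \hat{\alpha}} = 0 says exactly that the residuals sum to zero at the optimum. A minimal numerical check (a sketch; np.polyfit stands in for the fitted coefficients):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=100)

# np.polyfit returns [slope, intercept] for degree 1
beta_hat, alpha_hat = np.polyfit(x, y, 1)

# First normal equation: the residuals sum to zero at the optimum
residuals = y - (alpha_hat + beta_hat * x)
print(residuals.sum())  # ~0 up to floating-point error
```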
Now, to get \hat{\beta}, we differentiate S partially with respect to \hat{\beta}.
\frac{\partial S}{\partial \hat{\beta}} = -2\sum^{n}_{i=1}(y_i - \hat{\alpha} - \hat{\beta} x_i)x_i = 0 \\ \sum^{n}_{i=1}x_i y_i = \sum^{n}_{i=1}\hat{\alpha} x_i + \sum^{n}_{i=1}\hat{\beta} x_i^2 = \hat{\alpha}\sum^{n}_{i=1}x_i + \hat{\beta}\sum^{n}_{i=1}x_i^2
Dividing both sides by n,
\frac{\sum^{n}_{i=1}x_iy_i}{n} = \hat{\alpha}\frac{\sum^{n}_{i=1}x_i}{n} + \hat{\beta}\frac{\sum^{n}_{i=1}x_i^2}{n} \\ \frac{\sum^{n}_{i=1}x_iy_i}{n} = \hat{\alpha}\bar{x} + \hat{\beta}\frac{\sum^{n}_{i=1}x_i^2}{n}
Here we can substitute \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}, which we obtained above.
\frac{\sum^{n}_{i=1}x_iy_i}{n} = \bar{x}\bar{y} - \hat{\beta}\bar{x}^2 + \hat{\beta}\frac{\sum^{n}_{i=1}x_i^2}{n} \\ = \bar{x}\bar{y} + \hat{\beta}\left(\frac{\sum^{n}_{i=1}x_i^2}{n}-\bar{x}^2\right)
\frac{\sum^{n}_{i=1}x_i^2}{n}-\bar{x}^2 is the variance of x, so we denote it by s_{xx}.
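This is the population (1/n) variance, which is also NumPy's default convention, so the identity is easy to check (a minimal sketch; the sample values are arbitrary):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])  # arbitrary sample
s_xx = np.mean(x**2) - np.mean(x)**2
assert np.isclose(s_xx, np.var(x))  # np.var uses ddof=0 (1/n) by default
```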
\frac{\sum^{n}_{i=1}x_iy_i}{n} = \bar{x}\bar{y} + \hat{\beta}s_{xx} \\ \frac{\sum^{n}_{i=1}x_iy_i}{n} - \bar{x}\bar{y} = \hat{\beta}s_{xx}
Here we can rewrite \frac{\sum^{n}_{i=1}x_iy_i}{n} - \bar{x}\bar{y} as follows.
\frac{\sum^{n}_{i=1}x_iy_i}{n} + \bar{x}\bar{y} - 2\bar{x}\bar{y} \\ = \frac{\sum^{n}_{i=1}x_iy_i}{n} + \frac{\sum^{n}_{i=1}\bar{x}\bar{y}}{n} - \frac{\sum^{n}_{i=1}x_i}{n}\bar{y} - \frac{\sum^{n}_{i=1}y_i}{n}\bar{x} \\ = \frac{\sum^{n}_{i=1}(x_iy_i - x_i\bar{y} - \bar{x}y_i + \bar{x}\bar{y})}{n} = \frac{\sum^{n}_{i=1}(x_i - \bar{x})(y_i - \bar{y})}{n}
\frac{\sum^{n}_{i=1}x_iy_i}{n} - \bar{x}\bar{y} is the covariance of x and y, so we denote it by s_{xy}.
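This matches NumPy's covariance under the same 1/n convention (a minimal check; np.cov defaults to 1/(n-1), so ddof=0 is passed explicitly):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])   # arbitrary samples
y = np.array([2.0, 3.0, 5.0, 11.0])
s_xy = np.mean(x * y) - np.mean(x) * np.mean(y)
assert np.isclose(s_xy, np.cov(x, y, ddof=0)[0, 1])
```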
s_{xy} = \hat{\beta}s_{xx} \\ \hat{\beta} = \frac{s_{xy}}{s_{xx}} = \frac{s_{xy}}{s_x^2} = \frac{s_{xy}}{s_xs_y}\cdot \frac{s_y}{s_x}
Writing the correlation coefficient as r_{xy} = \frac{s_{xy}}{s_{x}s_{y}}, we get
\hat{\beta} = r_{xy}\frac{s_y}{s_x}
Finally, we have obtained \hat{\alpha} and \hat{\beta}:
\hat{\beta} = r_{xy}\frac{s_{y}}{s_{x}}
\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}
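As a closing check, the closed-form coefficients agree with scipy.stats.linregress (a sketch on synthetic data; the true intercept 2 and slope 3 are arbitrary illustrative choices):

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=200)

s_xx = np.mean(x**2) - np.mean(x)**2             # variance of x
s_xy = np.mean(x * y) - np.mean(x) * np.mean(y)  # covariance of x and y
beta_hat = s_xy / s_xx
alpha_hat = np.mean(y) - beta_hat * np.mean(x)

ref = linregress(x, y)
assert np.allclose([alpha_hat, beta_hat], [ref.intercept, ref.slope])
```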