Deriving Regression Coefficients from the Residual Sum of Squares
When the linear regression equation \(y=\hat{\alpha} + \hat{\beta} x\) is fitted to data, the formulas for its coefficients \(\hat{\alpha}\) and \(\hat{\beta}\) are as follows.
$$ \hat{\beta} = r_{xy}\frac{s_{y}}{s_{x}} $$
$$ \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} $$
\(r_{xy} \): correlation coefficient
\(s_{x} \): standard deviation of \(x\), \(s_{y} \): standard deviation of \(y\)
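As a quick numerical sanity check, the formulas above can be verified with NumPy (a minimal sketch; the sample data is made up):

```python
import numpy as np

# made-up sample data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

r_xy = np.corrcoef(x, y)[0, 1]      # correlation coefficient
s_x = np.std(x)                     # population standard deviation (divides by n)
s_y = np.std(y)

beta = r_xy * s_y / s_x             # beta_hat = r_xy * s_y / s_x
alpha = np.mean(y) - beta * np.mean(x)  # alpha_hat = y_bar - beta_hat * x_bar

# compare against NumPy's own least-squares fit
b_ls, a_ls = np.polyfit(x, y, 1)    # returns [slope, intercept] for degree 1
print(np.isclose(beta, b_ls), np.isclose(alpha, a_ls))
```

Note that `np.std` divides by \(n\) by default, which matches the population standard deviation used in this derivation.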
Here we will explain how to derive these equations from the residual sum of squares.
Derivation
We will minimize the following residual sum of squares.
$$ S(\hat{\alpha}, \hat{\beta}) = \sum^{n}_{i=1}(y_i - \hat{y_i})^2 = \sum^{n}_{i=1}(y_i - (\hat{\alpha} + \hat{\beta} x_i ))^2 $$
To find the \(\hat{\alpha}\) and \(\hat{\beta}\) that minimize \(S(\hat{\alpha}, \hat{\beta})\), we take the partial derivatives with respect to \(\hat{\alpha}\) and \(\hat{\beta}\) and set them to zero.
$$ \frac{\partial S}{\partial \hat{\alpha}} = 2\times(-1)\times\sum^{n}_{i=1}(y_i - \hat{\alpha} - \hat{\beta} x_i ) = 0 \\ \sum^{n}_{i=1}y_i = \sum^{n}_{i=1}\hat{\alpha} + \sum^{n}_{i=1}\hat{\beta}x_i = n\hat{\alpha} + \hat{\beta}\sum^{n}_{i=1}x_i $$
Dividing both sides by \(n\),
$$ \frac{\sum^{n}_{i=1}y_i}{n} = \hat{\alpha} + \hat{\beta}\frac{\sum^{n}_{i=1}x_i}{n} \\ \bar{y} = \hat{\alpha} + \hat{\beta}\bar{x} \\ \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} $$
we can get \(\hat{\alpha}\).
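The condition \(\partial S / \partial \hat{\alpha} = 0\) says that the residuals sum to zero, and \(\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}\) means the fitted line passes through \((\bar{x}, \bar{y})\). Both can be checked numerically (a minimal sketch with made-up data):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# least-squares fit (np.polyfit returns [slope, intercept] for degree 1)
beta, alpha = np.polyfit(x, y, 1)

# dS/d(alpha_hat) = 0 is equivalent to the residuals summing to zero
residuals = y - (alpha + beta * x)
print(np.isclose(residuals.sum(), 0.0))

# alpha_hat = y_bar - beta_hat * x_bar: the line passes through (x_bar, y_bar)
print(np.isclose(alpha, np.mean(y) - beta * np.mean(x)))
```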
Next, to get \(\hat{\beta}\), we differentiate \(S\) partially with respect to \(\hat{\beta}\).
$$ \frac{\partial S}{\partial \hat{\beta}} = 2\times(-1)\times\sum^{n}_{i=1}(y_i - \hat{\alpha} - \hat{\beta} x_i ) x_i = 0 \\ \sum^{n}_{i=1}x_i y_i = \sum^{n}_{i=1}\hat{\alpha} x_i + \sum^{n}_{i=1}\hat{\beta} x_i^2 = \hat{\alpha}\sum^{n}_{i=1}x_i + \hat{\beta}\sum^{n}_{i=1}x_i^2 $$
Dividing both sides by \(n\)
$$ \frac{\sum^{n}_{i=1}x_iy_i}{n} = \hat{\alpha}\frac{\sum^{n}_{i=1}x_i}{n} + \hat{\beta}{\frac{\sum^{n}_{i=1}x_i^2}{n}} \\ \frac{\sum^{n}_{i=1}x_iy_i}{n} = \hat{\alpha}\bar{x} + \hat{\beta}{\frac{\sum^{n}_{i=1}x_i^2}{n}} $$
Here we can substitute \(\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}\), which we obtained above.
$$ \frac{\sum^{n}_{i=1}x_iy_i}{n} = \bar{x}\bar{y} - \hat{\beta}\bar{x}^2 + \hat{\beta}\frac{\sum^{n}_{i=1}x_i^2}{n} = \bar{x}\bar{y} + \hat{\beta}\left(\frac{\sum^{n}_{i=1}x_i^2}{n}-\bar{x}^2\right)$$
\( \frac{\sum^{n}_{i=1}x_i^2}{n}-\bar{x}^2\) is the variance of \(x\), so we denote it \(s_{xx}\).
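This "shortcut" form of the variance agrees with the usual definition, as a quick NumPy check confirms (a minimal sketch; the data is made up):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])

# shortcut form from the derivation: (sum of x_i^2)/n - x_bar^2
s_xx = np.mean(x**2) - np.mean(x)**2

# np.var divides by n by default (ddof=0), matching this definition
print(np.isclose(s_xx, np.var(x)))
```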
$$\frac{\sum^{n}_{i=1}x_iy_i}{n} = \bar{x}\bar{y} + \hat{\beta}s_{xx} \\ \frac{\sum^{n}_{i=1}x_iy_i }{n} - \bar{x}\bar{y} = \hat{\beta}s_{xx}$$
Here we can rewrite \(\frac{\sum^{n}_{i=1}x_iy_i}{n} - \bar{x}\bar{y}\) as follows.
$$ \frac{\sum^{n}_{i=1}x_iy_i}{n} + \bar{x}\bar{y} - 2\bar{x}\bar{y} \\ = \frac{\sum^{n}_{i=1}x_iy_i}{n} + \frac{\sum^{n}_{i=1}\bar{x}\bar{y}}{n} - \frac{\sum^{n}_{i=1}x_i}{n}\bar{y} - \frac{\sum^{n}_{i=1}y_i}{n}\bar{x} \\ = \frac{\sum^{n}_{i=1}(x_iy_i - x_i\bar{y} - \bar{x}y_i + \bar{x}\bar{y})}{n} = \frac{\sum^{n}_{i=1}(x_i - \bar{x})(y_i - \bar{y})}{n}$$
\(\frac{\sum^{n}_{i=1}x_iy_i}{n}- \bar{x}\bar{y}\) is the covariance of \(x\) and \(y\), so we denote it \(s_{xy}\).
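The shortcut form and the definitional form of the covariance can likewise be checked against NumPy (a minimal sketch with made-up data):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# shortcut form used in the derivation: (sum of x_i*y_i)/n - x_bar*y_bar
shortcut = np.mean(x * y) - np.mean(x) * np.mean(y)

# definitional form: mean of (x_i - x_bar)(y_i - y_bar)
definition = np.mean((x - np.mean(x)) * (y - np.mean(y)))

# NumPy's population covariance (bias=True divides by n instead of n-1)
s_xy = np.cov(x, y, bias=True)[0, 1]

print(np.isclose(shortcut, definition), np.isclose(shortcut, s_xy))
```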
$$s_{xy} = \hat{\beta}s_{xx} \\ \hat{\beta} = \frac{s_{xy}}{s_{xx}} = \frac{s_{xy}}{s_xs_y}\cdot \frac{s_y}{s_x} $$
Since \(s_{xx} = s_x^2\), and the correlation coefficient is \(r_{xy} = \frac{s_{xy}}{s_{x}s_{y}}\), we obtain
$$ \hat{\beta} = r_{xy}\frac{s_y}{s_x}$$
Finally, we have obtained both \(\hat{\alpha}\) and \(\hat{\beta}\).
$$ \hat{\beta} = r_{xy}\frac{s_{y}}{s_{x}} $$
$$ \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} $$