IT Learning

Learning IT through hands-on practice

Articles in "Statistics"

Deriving Regression Coefficients from the Residual Sum of Squares

2022/09/04   -Statistics

When the linear regression equation \(y=\hat{\alpha} + \hat{\beta} x\) is fitted to data, its coefficients \(\hat{\alpha}\) and \(\hat{\beta}\) are given by the following formulas.

$$ \hat{\beta} = r_{xy}\frac{s_{y}}{s_{x}} $$

$$ \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} $$

\(r_{xy}\): correlation coefficient of \(x\) and \(y\); \(s_{x}\): standard deviation of \(x\); \(s_{y}\): standard deviation of \(y\)

Here we explain how to derive these equations from the residual sum of squares.

Derivation

We use the residual sum of squares below.

$$ S(\hat{\alpha}, \hat{\beta}) = \sum^{n}_{i=1}(y_i - \hat{y}_i)^2 = \sum^{n}_{i=1}(y_i - (\hat{\alpha} + \hat{\beta} x_i ))^2 $$

To find \(\hat{\alpha}\) and \(\hat{\beta}\) to …
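As a quick numerical check of the two coefficient formulas, the following minimal sketch computes \(\hat{\beta} = r_{xy}\, s_y / s_x\) and \(\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}\) using only the Python standard library. The sample data and all variable names are made up for illustration and are not taken from the article. The sketch also evaluates the residual sum of squares so that it can be compared at nearby coefficient values.

```python
import math

# Hypothetical sample data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squared deviations and of cross-deviations.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Standard deviations (population form; the 1/n factors cancel in s_y / s_x,
# so the sample form with 1/(n-1) would give the same slope).
s_x = math.sqrt(s_xx / n)
s_y = math.sqrt(s_yy / n)

# Correlation coefficient.
r_xy = s_xy / math.sqrt(s_xx * s_yy)

# Coefficients from the formulas in the text.
beta_hat = r_xy * s_y / s_x
alpha_hat = y_bar - beta_hat * x_bar


def rss(alpha, beta):
    """Residual sum of squares S(alpha, beta) for the sample data."""
    return sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))


# The slope written directly as s_xy / s_xx, for comparison.
beta_direct = s_xy / s_xx

print("beta_hat   :", beta_hat)
print("beta_direct:", beta_direct)
print("alpha_hat  :", alpha_hat)
print("RSS at fit :", rss(alpha_hat, beta_hat))
```

The two slope values should agree up to floating-point error, and moving either coefficient slightly away from the fitted value should not decrease the residual sum of squares.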

Multiple Regression Analysis with Python statsmodels

2022/09/03   -Python, Statistics

Overview

This article explains how to perform multiple regression analysis with statsmodels in Python. The focus is on understanding how to use the library rather than on the analysis itself.

Environments

Python 3.8.6
statsmodels 0.13.2

Preparation of dataset

This time we will use the Red Wine Quality dataset, which is published on Kaggle under the Open Database License. Download it and save the CSV file in any directory.

Red Wine Quality | Kaggle

Installing statsmodels

Install the statsmodels library with the pip command.

pip install statsmodels

Loading dataset

Next, we will load the dataset saved in that directory with pandas …