Econometric Theory/Multiple Regression Analysis

From Wikibooks, open books for an open world

Our first regressions (MLE and OLS) were bivariate: each fitted a simple line in two variables. In most economic data, however, many independent variables can affect a dependent variable, so we expand our explanatory functions to allow multiple independent variables.

Instead of our functions looking like Y_i = \alpha + \beta X_i + \epsilon_i, they now look like Y_i = \beta_0 + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \cdots + \beta_n X_{n,i} + \epsilon_i, where n is the number of independent variables.
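As a rough illustration of the multiple regression function above, the following Python sketch estimates \beta_0, \beta_1 and \beta_2 by OLS on simulated data. The data, the "true" coefficient values, and the use of NumPy's least-squares routine are all assumptions made for the example; they are not taken from this book.

```python
import numpy as np

# Hypothetical simulated data: N observations, two independent variables.
rng = np.random.default_rng(0)
N = 200
X1 = rng.normal(size=N)
X2 = rng.normal(size=N)
eps = rng.normal(scale=0.5, size=N)

# Illustrative "true" model: beta_0 = 1, beta_1 = 2, beta_2 = -3.
Y = 1.0 + 2.0 * X1 - 3.0 * X2 + eps

# Design matrix: a column of ones (for beta_0) followed by the regressors.
X = np.column_stack([np.ones(N), X1, X2])

# OLS estimates: the beta that minimizes the sum of squared residuals.
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
residuals = Y - X @ beta_hat

print("estimated coefficients:", beta_hat)
```

Adding another independent variable only means appending another column to the design matrix X.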

By adding more variables and data to our model, we can hopefully get a better fit and a better understanding of the dependent variable. However, with the added variables come added problems that can make the model misleading.

Goodness of Fit

When we move to the multiple regression case, our goodness of fit looks much like it did in the bivariate case: TSS = ESS + RSS. We can still use our Coefficient of Determination, R² = ESS/TSS = 1 - (RSS/TSS), but there is a problem associated with it. R² will never decrease when a variable is added, whether or not that variable helps us explain our dependent variable. OLS can always assign the new variable a coefficient of zero and reproduce the old fit, so RSS can only fall or stay the same while TSS is unchanged; ESS is therefore greater than or equal to what we had before. As a result our R² never falls, and usually rises, even if the addition of our new variable hurts our model. There is a tool to fix this problem: R² is replaced with Adjusted R², which adjusts it for the degrees of freedom used up by the added variables. Adjusted R² is signified by adding a bar above the 'R.'
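The claim that R² cannot fall when a variable is added can be checked numerically. The sketch below is a minimal example on simulated data: the helper `r_squared` and the variable names are hypothetical, and the extra regressor is pure noise by construction.

```python
import numpy as np

def r_squared(X, Y):
    """R^2 = 1 - RSS/TSS for an OLS fit of Y on X (X must contain a constant column)."""
    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    rss = np.sum((Y - X @ beta_hat) ** 2)   # residual sum of squares
    tss = np.sum((Y - Y.mean()) ** 2)       # total sum of squares
    return 1.0 - rss / tss

# Hypothetical data: Y depends on X1 only.
rng = np.random.default_rng(1)
N = 50
X1 = rng.normal(size=N)
Y = 1.0 + 2.0 * X1 + rng.normal(size=N)
noise = rng.normal(size=N)  # generated independently of Y

X_small = np.column_stack([np.ones(N), X1])   # constant + X1
X_big = np.column_stack([X_small, noise])     # constant + X1 + irrelevant variable

r2_small = r_squared(X_small, Y)
r2_big = r_squared(X_big, Y)

print("R^2 without the noise variable:", r2_small)
print("R^2 with the noise variable:   ", r2_big)
```

Even though the added variable has no relationship with Y, the second R² is at least as large as the first.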

 \bar{R}^2 = 1 - \frac{\widehat{var}(\epsilon)}{\widehat{var}(Y)}
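To make the Adjusted R² formula concrete, the sketch below assumes the usual degrees-of-freedom-corrected variance estimates, RSS/(N - n - 1) for the error term and TSS/(N - 1) for Y, where N is the number of observations and n the number of independent variables. That choice of estimators is an assumption of this example, as are the simulated data and the helper name `r2_and_adjusted`.

```python
import numpy as np

def r2_and_adjusted(X, Y):
    """Return (R^2, adjusted R^2) for an OLS fit of Y on X.

    X must contain a constant column, so it has p = n + 1 columns
    for n independent variables.
    """
    N, p = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    rss = np.sum((Y - X @ beta_hat) ** 2)    # residual sum of squares
    tss = np.sum((Y - Y.mean()) ** 2)        # total sum of squares
    r2 = 1.0 - rss / tss
    var_eps_hat = rss / (N - p)              # assumed estimator of var(epsilon)
    var_y_hat = tss / (N - 1)                # assumed estimator of var(Y)
    return r2, 1.0 - var_eps_hat / var_y_hat

# Hypothetical data: Y depends on X1 only; five more regressors are pure noise.
rng = np.random.default_rng(2)
N = 40
X1 = rng.normal(size=N)
Y = 1.0 + 2.0 * X1 + rng.normal(size=N)

X_small = np.column_stack([np.ones(N), X1])
X_big = np.column_stack([X_small, rng.normal(size=(N, 5))])

r2_small, adj_small = r2_and_adjusted(X_small, Y)
r2_big, adj_big = r2_and_adjusted(X_big, Y)

print("1 variable:  R^2 =", r2_small, " adjusted R^2 =", adj_small)
print("6 variables: R^2 =", r2_big, " adjusted R^2 =", adj_big)
```

Under these estimators the adjusted value equals 1 - (1 - R²)(N - 1)/(N - n - 1), so it is never above R², and the gap widens as more variables are added for a fixed number of observations.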