R Programming/Linear Models
From Wikibooks, the open-content textbooks collection
Contents |
[edit] Basics
First generate a fake dataset:
N <- 10 u <- rnorm(N) x1 <- rnorm(N) x2 <- rnorm(N) y <- 1+ x1 + x2 + u
Estimate the model and store the result in the object called "fit":
fit<-lm(y~x1+x2)
Summarize the result using the summary() function:
> summary(fit)
Call:
lm(formula = y ~ x1 + x2)
Residuals:
Min 1Q Median 3Q Max
-1.5692 -0.2459 0.1852 0.3939 0.7577
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.2027 0.2998 4.012 0.005110 **
x1 0.8427 0.2785 3.026 0.019215 *
x2 1.5147 0.2008 7.543 0.000132 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.7377 on 7 degrees of freedom
Multiple R-Squared: 0.9103, Adjusted R-squared: 0.8847
F-statistic: 35.52 on 2 and 7 DF, p-value: 0.0002161
Get the estimated coefficients:
> fit$coeff
(Intercept) x1 x2
1.2026522 0.8427403 1.5146775
> output <- summary(fit)
> coef(output)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.1945847 0.2298888 5.196359 0.001258035
x1 0.6458170 0.3423214 1.886581 0.101182585
x2 0.6175165 0.2083628 2.963660 0.020995713
The vector of fitted values:
> fit$fitted
1 2 3 4 5 6
2.7683005 5.6184918 1.5745784 3.4995633 1.4830613 3.5007273
7 8 9 10
0.8894371 -2.1445262 0.9895569 2.8405984
The vector of residuals:
> fit$resid
1 2 3 4 5
0.42612425 0.40416901 0.19942084 -0.09466376 0.17099960
6 7 8 9 10
-0.29631716 0.75771491 -0.36131985 0.36311148 -1.56923931
The number of degrees of freedom:
> fit$df [1] 7
Information criteria:
> # Akaike Information Criteria > AIC(fit) [1] 26.72857 > # Bayesian Information Criteria > AIC(fit,k=log(N)) [1] 27.93891
The stats4 package includes AIC and BIC function:
> library(stats4) > ?BIC > lm1 <- lm(Fertility ~ . , data = swiss) > AIC(lm1) [1] 326.0716 > BIC(lm1) [1] 339.0226
[edit] Heteroskedasticity
[edit] Robustness
See Cook's distance
>library(car)
> cookd(fit)
1 2 3 4 5
0.0006205008 0.0643213760 0.2574810866 1.2128206779 0.2295047699
6 7 8 9 10
0.3130578329 0.0003365221 0.0671830241 0.0048474954 0.0714255871
Influence plot:
>influence.plot(fit)
Leverage plots:
leverage.plot(fit,term.name=x1) leverage.plot(fit,term.name=x2)
Bonferroni's outlier test:
> outlier.test(fit) max|rstudent| = 2.907674, degrees of freedom = 6, unadjusted p = 0.02706231, Bonferroni p = 0.2706231 Observation: 3
[edit] Instrumental Variables
- ivreg() in the AER package[1]
[edit] System of linear equations
[edit] References
- ↑ Christian Kleiber and Achim Zeileis (2008). Applied Econometrics with R. New York: Springer-Verlag. ISBN 978-0-387-77316-2. URL http://CRAN.R-project.org/package=AER