R Programming/Estimation utilities

This page covers methods that are available for most estimation commands. They are useful for all kinds of regression models.

Formulas

Most estimation commands use a formula interface. The outcome is left of the `~` and the covariates are on the right.

```y ~ x1 + x2
```
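As a minimal sketch, assuming simulated data (the data frame `d` and its columns are made up for illustration), the formula is usually passed to the estimation command together with a `data` argument:

```r
set.seed(1)                          # make the simulated data reproducible
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(100)

fit <- lm(y ~ x1 + x2, data = d)     # outcome on the left, covariates on the right
coef(fit)                            # intercept plus one slope per covariate
formula(fit)                         # the formula is stored with the fitted model
```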

It is easy to include a categorical (multinomial) variable as a predictor in a model. If the variable is not already a factor, one just needs to wrap it in the `as.factor()` function. R then creates a set of dummy variables, one for each level except the reference level.

```y ~ as.factor(x)
```

For instance, we can use the Star data in the Ecdat package:

```library("Ecdat")
data(Star)
summary(lm(tmathssk ~ as.factor(classk), data = Star))
```
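To see the dummy variables that are created, one can inspect the design matrix with `model.matrix()`. The sketch below uses a small made-up data frame (not the Star data) with a three-level character variable:

```r
# Hypothetical three-level variable stored as character, not yet a factor
d <- data.frame(x = c("a", "b", "c", "a", "b", "c"),
                y = c(1, 3, 5, 2, 4, 6))

# model.matrix() returns the design matrix that lm() would build
X <- model.matrix(y ~ as.factor(x), data = d)
colnames(X)   # "(Intercept)" "as.factor(x)b" "as.factor(x)c"
```

The first level (`"a"` here) is the reference category, so it gets no column of its own.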

`I()` takes its argument "as is", so that arithmetic operators inside it are not interpreted as formula operators. For instance, if you want to include in your equation a transformed variable such as a squared term or the sum of two variables, you may use `I()`.

```lm(y ~ x1 + I(x1^2) + x2)
lm(y ~ I(x1 + x2))
lm(I(y-100) ~ I(x1-100) + I(x2 - 100))
```
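The effect of `I()` is easiest to see by comparing the coefficient names of models with and without it. This is a small sketch on simulated data (variable names are illustrative):

```r
set.seed(1)
x1 <- rnorm(50)
x2 <- rnorm(50)
y  <- 1 + x1 + x2 + rnorm(50)

n_plain <- names(coef(lm(y ~ x1 + x2)))      # two separate slopes: "x1" and "x2"
n_sum   <- names(coef(lm(y ~ I(x1 + x2))))   # a single slope on the sum x1 + x2
n_pow   <- names(coef(lm(y ~ x1 + x1^2)))    # without I(), ^ is a formula operator
                                             # and x1^2 collapses to x1
n_plain
n_sum
n_pow
```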

It is easy to include interactions between variables by using `:` or `*`. `:` adds only the interaction term, whereas `*` adds the interaction term and the individual terms.

```lm(y~x1:x2) # interaction term only
lm(y~x1*x2) # interaction and individual terms
```
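A quick way to check which terms a formula expands to is to look at its term labels or at the columns of the design matrix. A sketch with made-up data:

```r
set.seed(1)
d <- data.frame(x1 = rnorm(40), x2 = rnorm(40), y = rnorm(40))

only_inter <- colnames(model.matrix(y ~ x1:x2, data = d))
full_inter <- colnames(model.matrix(y ~ x1 * x2, data = d))
only_inter   # "(Intercept)" "x1:x2"
full_inter   # "(Intercept)" "x1" "x2" "x1:x2"

# x1 * x2 is shorthand for x1 + x2 + x1:x2
labels_star <- attr(terms(y ~ x1 * x2), "term.labels")
labels_long <- attr(terms(y ~ x1 + x2 + x1:x2), "term.labels")
```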

It is also possible to generate polynomials using the `poly()` function with the option `raw = TRUE`. By default, `poly()` returns orthogonal polynomials instead of raw powers.

```lm(y ~ poly(x, degree = 3, raw = TRUE))
```
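With `raw = TRUE`, the fit should match the one obtained by writing the powers out with `I()`. The following sketch on simulated data checks this, and also checks that the orthogonal default gives the same fitted values even though its coefficients differ:

```r
set.seed(1)
x <- runif(100)
y <- 1 + x - 2 * x^2 + 0.5 * x^3 + rnorm(100, sd = 0.1)

fit_raw  <- lm(y ~ poly(x, degree = 3, raw = TRUE))  # raw powers x, x^2, x^3
fit_I    <- lm(y ~ x + I(x^2) + I(x^3))              # same powers written by hand
fit_orth <- lm(y ~ poly(x, degree = 3))              # orthogonal polynomials (default)

same_coef   <- all.equal(unname(coef(fit_raw)), unname(coef(fit_I)))
same_fitted <- all.equal(fitted(fit_raw), fitted(fit_orth))
same_coef
same_fitted
```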

There is also an advanced formula interface which is useful for instrumental variables models and mixed models. For instance, `ivreg()` (AER) uses this advanced formula interface. The instrumental variables are entered after the `|`. See the Instrumental Variables section if you want to learn more.

```library("AER")
ivreg(y ~ x | z)
```
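To give an idea of what `y ~ x | z` estimates, here is a sketch that computes the two-stage least squares point estimate by hand in base R on simulated data. This is not how `ivreg()` is implemented, and the standard errors reported by the second-stage `lm()` are not valid; it only illustrates the point estimate. All variable names are made up for the example.

```r
set.seed(1)
n <- 1000
z <- rnorm(n)                    # instrument
e <- rnorm(n)                    # unobserved confounder
x <- z + e + rnorm(n)            # endogenous regressor, correlated with e
y <- 1 + 2 * x + e + rnorm(n)    # true slope on x is 2

ols   <- unname(coef(lm(y ~ x))["x"])          # biased: x is correlated with e
x_hat <- fitted(lm(x ~ z))                     # first stage: project x on z
tsls  <- unname(coef(lm(y ~ x_hat))["x_hat"])  # second stage point estimate

c(ols = ols, tsls = tsls)
```

The OLS slope should be pulled away from 2 by the confounder, while the two-stage estimate should be close to it.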

Output

In addition to the `summary()` and `print()` functions, which display the output for most estimation commands, some authors have developed simplified output functions. One of them is the `display()` function in the arm package. Another one is `coefplot()`, also in the arm package, which plots the coefficients with their confidence intervals. Following the standards defined by Nathaniel Beck[1], Jeff Gill developed `graph.summary()`[2]. This command omits auxiliary statistics that are of little use.

R code and output:
```source("http://artsci.wustl.edu/~jgill/Models/graph.summary.R")
N <- 1000
u <- rnorm(N)
x1 <- 1 + rnorm(N)
x2 <- 1 + rnorm(N) + x1
y <- 1 + x1 + x2 + u
graph.summary(lm(y ~ x1 + x2))
```
```Family: gaussian

            Coef     Std.Err.   0.95 Lower 0.95 Upper CIs:ZE+RO
(Intercept) 0.980    0.056      0.871      1.089      |o|
x1          1.040    0.043      0.955      1.125      |o|
x2          0.984    0.031      0.923      1.045      |o|

N: 1000    Estimate of Sigma: 0.998
```
```library("arm")
display(lm(y ~ x1 + x2))
```
```lm(formula = y ~ x1 + x2)
            coef.est coef.se
(Intercept) 0.89     0.05
x1          1.05     0.04
x2          1.02     0.03
---
n = 1000, k = 3
residual sd = 0.96, R-Squared = 0.86
```
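If neither of these functions is available, a similar compact table (estimates, standard errors and confidence bounds) can be assembled in base R. This is only a sketch of the idea on the same kind of simulated data, not a reimplementation of `graph.summary()` or `display()`:

```r
set.seed(1)
N  <- 1000
x1 <- 1 + rnorm(N)
x2 <- 1 + rnorm(N) + x1
y  <- 1 + x1 + x2 + rnorm(N)
fit <- lm(y ~ x1 + x2)

# Estimates and standard errors from summary(), bounds from confint()
tab <- cbind(coef(summary(fit))[, c("Estimate", "Std. Error")],
             confint(fit, level = 0.95))
round(tab, 3)
```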

Delta Method

If you want to know the standard error of a transformation of one of your parameters, you need to use the delta method. Several packages implement it:

• `deltamethod()` in the msm package[3].
• `delta.method()` in the alr3 package.
• `deltaMethod()` in the car package.
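To show what these functions compute, here is a hand-written sketch of the first-order delta method in base R for the ratio of two slopes. The data are simulated and the gradient is derived by hand for this particular transformation. With a transformation g(b), the approximate variance is the gradient of g times the covariance matrix of b times the gradient again.

```r
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 + 4 * x2 + rnorm(n)
fit <- lm(y ~ x1 + x2)

b <- coef(fit)    # estimated parameters (Intercept, x1, x2)
V <- vcov(fit)    # their estimated covariance matrix

# Quantity of interest: ratio of slopes g(b) = b_x1 / b_x2 (true value 0.5)
g <- unname(b["x1"] / b["x2"])

# Gradient of g with respect to (Intercept, x1, x2)
grad <- c(0, 1 / b["x2"], -b["x1"] / b["x2"]^2)

# Delta method standard error: sqrt(grad' V grad)
se_g <- sqrt(drop(t(grad) %*% V %*% grad))
c(estimate = g, se = se_g)
```

Assuming the msm package is installed, `deltamethod(~ x2 / x3, b, V)` should return the same standard error; in that formula `x1`, `x2`, `x3` refer to the positions in the parameter vector, not to the covariates.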

Zelig: the pseudo-bootstrap method

Zelig[4] is a postestimation package which simulates from the distribution of the estimated parameters and computes quantities of interest such as marginal effects or predicted probabilities. This is especially useful for non-linear models. Zelig comes with a set of vignettes which explain how to deal with each kind of model. There are three commands.

• `zelig()` estimates the model and draws from the distribution of estimated parameters.
• `setx()` fixes the values of explanatory variables.
• `sim()` computes the quantities of interest.
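The idea behind these three steps can be sketched without Zelig itself. The example below is not Zelig's implementation: it fits a logit model with `glm()`, draws parameters from a multivariate normal with `mvrnorm()` from the MASS package, and computes a predicted probability for each draw. The data and the chosen covariate value are made up.

```r
library(MASS)  # for mvrnorm(); MASS ships with standard R installations

set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + x))

# Step 1 (role of zelig()): estimate the model and draw parameters
# from their estimated sampling distribution
fit   <- glm(y ~ x, family = binomial)
draws <- mvrnorm(1000, mu = coef(fit), Sigma = vcov(fit))

# Step 2 (role of setx()): fix the explanatory variables
x_fixed <- c(1, 0.5)   # intercept term and x = 0.5

# Step 3 (role of sim()): compute the quantity of interest for each draw
p <- plogis(draws %*% x_fixed)       # simulated predicted probabilities
quantile(p, c(0.025, 0.5, 0.975))    # point estimate and 95% interval
```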

References

1. Nathaniel Beck (2010). "Making regression and related output more helpful to users". The Political Methodologist. http://politics.as.nyu.edu/docs/IO/2576/beck_tpm_edited.pdf
2. Jeff Gill. `graph.summary()`. http://artsci.wustl.edu/~jgill/Models/graph.summary.s
3. See the example on the UCLA Statistics webpage: http://www.ats.ucla.edu/stat/r/faq/deltamethod.htm
4. Kosuke Imai, Gary King and Olivia Lau (2009). Zelig: Everyone's Statistical Software. R package version 3.4-5. http://CRAN.R-project.org/package=Zelig