# R Programming/Estimation utilities

This page covers utilities that are available for most estimation commands. They are useful for all kinds of regression models.

## Formulas

Most estimation commands use a formula interface. The outcome is on the left of the `~` and the covariates are on the right.

```
y ~ x1 + x2
```

It is easy to include a categorical variable as a predictor in a model. If the variable is not already a factor, just wrap it in `as.factor()`. R then creates a set of dummy variables, one for each level except the first, which serves as the reference category under R's default treatment contrasts.

```
y ~ as.factor(x)
```

For instance, we can use the `Star` data in the **Ecdat** package:

```
library("Ecdat")
data(Star)
summary(lm(tmathssk ~ as.factor(classk), data = Star))
```
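To see the dummy variables that `as.factor()` generates, one can inspect the design matrix with `model.matrix()`. The sketch below uses a small made-up data frame (the names `d`, `x` and `y` are illustrative and are not part of the Star data), so it runs without **Ecdat**.

```r
# Hypothetical toy data: x is a numeric code for three categories
d <- data.frame(
  y = c(2.1, 3.9, 6.2, 1.8, 4.1, 5.9),
  x = c(1, 2, 3, 1, 2, 3)
)

# Design matrix implied by the formula: an intercept plus one dummy
# per level of x except the first (reference) level
X <- model.matrix(y ~ as.factor(x), data = d)
print(colnames(X))

# Treating x as numeric instead gives a single slope column
X_num <- model.matrix(y ~ x, data = d)
print(colnames(X_num))
```

The coefficient on each dummy is then read as a difference from the reference level.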

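`I()` takes its argument "as is", so arithmetic operators inside it keep their usual meaning instead of being read as formula operators. For instance, if you want to include a transformed variable in your equation, such as a squared term or the sum of two variables, wrap it in `I()`.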

```
lm(y ~ x1 + I(x1^2) + x2)
lm(y ~ I(x1 + x2))
lm(I(y-100) ~ I(x1-100) + I(x2 - 100))
```
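The reason `I()` is needed can be checked directly: inside a formula, `^` is a formula operator, so a bare `x1^2` does not produce a squared regressor. The following sketch uses simulated data (the variable names and coefficient values are arbitrary choices for illustration) and compares the coefficient names of the two fits.

```r
set.seed(1)
x1 <- rnorm(100)
y <- 1 + 2 * x1 + 3 * x1^2 + rnorm(100)

# Without I(), ^ is a formula operator, so x1^2 collapses to x1
fit_plain <- lm(y ~ x1 + x1^2)
print(names(coef(fit_plain)))

# With I(), the square is computed arithmetically and gets its own coefficient
fit_sq <- lm(y ~ x1 + I(x1^2))
print(names(coef(fit_sq)))
```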

It is easy to include interactions between variables by using `:` or `*`. `:` adds only the interaction term, whereas `*` adds the interaction term and the individual terms.

```
lm(y~x1:x2) # interaction term only
lm(y~x1*x2) # interaction and individual terms
```
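The difference between the two operators can be inspected without fitting a model, by asking `terms()` how it expands each formula. This is only a small check; the variables `y`, `x1` and `x2` do not need to exist for it to run.

```r
# Expand each formula into its model terms without fitting anything
labels_colon <- attr(terms(y ~ x1:x2), "term.labels")
labels_star  <- attr(terms(y ~ x1 * x2), "term.labels")

print(labels_colon)  # interaction only
print(labels_star)   # individual terms plus the interaction
```

In other words, `y ~ x1 * x2` is shorthand for `y ~ x1 + x2 + x1:x2`.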

It is also possible to generate polynomials using the `poly()` function with the option `raw = TRUE`.

```
lm(y ~ poly(x, degree = 3, raw = TRUE))
```
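With `raw = TRUE`, `poly()` builds the plain powers of `x`, so the fit should match one written with `I()` terms. By default (`raw = FALSE`), `poly()` uses orthogonal polynomials instead, which give the same fitted values but different coefficients. A minimal check on simulated data (names and coefficient values are illustrative):

```r
set.seed(2)
x <- runif(200, -2, 2)
y <- 1 + x - 0.5 * x^2 + 0.25 * x^3 + rnorm(200)

fit_poly <- lm(y ~ poly(x, degree = 3, raw = TRUE))
fit_I    <- lm(y ~ x + I(x^2) + I(x^3))

# Same design matrix columns, so the coefficients should agree
print(all.equal(unname(coef(fit_poly)), unname(coef(fit_I))))
```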

There is also an advanced formula interface which is useful for instrumental variables models and mixed models. For instance, `ivreg()` (**AER**) uses this advanced formula interface. The instrumental variables are entered after the `|`. See the Instrumental Variables section if you want to learn more.

```
library("AER")
ivreg(y ~ x | z)
```
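To illustrate what the part after `|` does, here is a hand-rolled two-stage least squares sketch in base R on simulated data. It is not how `ivreg()` is implemented, and all variable names and the data-generating process are assumptions made for the example. The point estimate corresponds to what `ivreg(y ~ x | z)` estimates, but the standard errors reported by the second-stage `lm()` are not valid, which is one reason to use a dedicated command.

```r
set.seed(3)
n <- 2000
z <- rnorm(n)                  # instrument
e <- rnorm(n)                  # unobserved confounder
x <- 0.8 * z + e + rnorm(n)    # endogenous regressor (correlated with e)
y <- 1 + 2 * x + e + rnorm(n)  # true slope on x is 2

# OLS is biased because x is correlated with the error term
b_ols <- unname(coef(lm(y ~ x))["x"])

# Two-stage least squares by hand
x_hat <- fitted(lm(x ~ z))                       # first stage
b_iv  <- unname(coef(lm(y ~ x_hat))["x_hat"])    # second stage

print(c(ols = b_ols, iv = b_iv))
```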

## Output

In addition to the `summary()` and `print()` functions, which display the output of most estimation commands, some authors have developed simplified output functions. One of them is the `display()` function in the **arm** package. Another is `coefplot()`, also in the **arm** package, which plots the coefficients with their confidence intervals. Following the standards defined by Nathaniel Beck^{[1]}, Jeff Gill developed `graph.summary()`^{[2]}, which leaves out auxiliary statistics that are rarely useful.

R code:

```
source("http://artsci.wustl.edu/~jgill/Models/graph.summary.R")
N <- 1000
u <- rnorm(N)
x1 <- 1 + rnorm(N)
x2 <- 1 + rnorm(N) + x1
y <- 1 + x1 + x2 + u
graph.summary(lm(y ~ x1 + x2))
```

Output:

```
Family: gaussian
Link function: identity
            Coef  Std.Err. 0.95 Lower 0.95 Upper CIs:ZE+RO
(Intercept) 0.980 0.056    0.871      1.089      |o|
x1          1.040 0.043    0.955      1.125      |o|
x2          0.984 0.031    0.923      1.045      |o|
N: 1000 Estimate of Sigma: 0.998
```

R code:

```
library("arm")
display(lm(y ~ x1 + x2))
```

Output:

```
lm(formula = y ~ x1 + x2)
            coef.est coef.se
(Intercept) 0.89     0.05
x1          1.05     0.04
x2          1.02     0.03
---
n = 1000, k = 3
residual sd = 0.96, R-Squared = 0.86
```
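If neither package is available, a similarly compact table can be assembled from base R extractor functions. The sketch below is not a reimplementation of `graph.summary()` or `display()`; it only combines `coef()`, `vcov()` and `confint()` on the same kind of simulated data, with a seed added so that the result is reproducible.

```r
set.seed(4)
N <- 1000
u <- rnorm(N)
x1 <- 1 + rnorm(N)
x2 <- 1 + rnorm(N) + x1
y <- 1 + x1 + x2 + u
fit <- lm(y ~ x1 + x2)

# Estimates, standard errors and 95% confidence bounds in one table
tab <- cbind(
  Coef    = coef(fit),
  Std.Err = sqrt(diag(vcov(fit))),
  confint(fit, level = 0.95)
)
print(round(tab, 3))
```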

## Delta Method

If you want to know the standard error of a transformation of one of your parameters, you need to use the **delta method**. Several packages implement it:

- `deltamethod()` in the **msm** package^{[3]}.
- `delta.method()` in the **alr3** package.
- `deltaMethod()` in the **car** package.
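The idea behind these functions can be sketched by hand in base R. For a smooth transformation g of the coefficient vector b with estimated covariance matrix V, the first-order delta method approximates the variance of g(b) by grad' V grad, where grad is the gradient of g evaluated at the estimates. The example below applies this to the ratio of two slopes on simulated data; the data, names and choice of transformation are assumptions for illustration.

```r
set.seed(5)
n <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 + 4 * x2 + rnorm(n)
fit <- lm(y ~ x1 + x2)

b <- unname(coef(fit))  # (intercept, b1, b2)
V <- vcov(fit)

# Transformation of interest: g(b) = b1 / b2
g_hat <- b[2] / b[3]

# Gradient of g with respect to (intercept, b1, b2)
grad <- c(0, 1 / b[3], -b[2] / b[3]^2)

# First-order delta-method variance: grad' V grad
se_g <- sqrt(drop(t(grad) %*% V %*% grad))

print(c(estimate = g_hat, std.error = se_g))
```

With **car** installed, a call such as `deltaMethod(fit, "x1/x2")` should report a comparable estimate and standard error, since it automates the same gradient calculation.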

## Zelig: the pseudo-bootstrap method

**Zelig**^{[4]} is a postestimation package which simulates from the distribution of the estimated parameters and computes quantities of interest such as marginal effects or predicted probabilities. This is especially useful for non-linear models. **Zelig** comes with a set of *vignettes* which explain how to deal with each kind of model.
There are three commands:

- `zelig()` estimates the model and draws from the distribution of the estimated parameters.
- `setx()` fixes the values of the explanatory variables.
- `sim()` computes the quantities of interest.
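The three steps can be mimicked in base R to show the logic of the method. The sketch below is not Zelig's code and does not call the package; it assumes the estimated parameters are approximately multivariate normal, draws from that distribution, fixes the covariate at a chosen value and computes a predicted probability for each draw. The logit model, the data and all object names are made up for the example.

```r
set.seed(42)
n <- 1000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1 * x))

# Step 1 (role of zelig()): estimate the model, then draw parameters
# from a normal approximation to their sampling distribution
fit <- glm(y ~ x, family = binomial)
beta_hat <- coef(fit)
V <- vcov(fit)
n_sims <- 5000
z <- matrix(rnorm(n_sims * length(beta_hat)), nrow = n_sims)
draws <- sweep(z %*% chol(V), 2, beta_hat, "+")  # rows ~ N(beta_hat, V)

# Step 2 (role of setx()): fix the explanatory variable, here x = 0
x_fixed <- c(1, 0)  # intercept and x

# Step 3 (role of sim()): quantity of interest for each draw
p_sim <- drop(plogis(draws %*% x_fixed))

print(mean(p_sim))
print(quantile(p_sim, c(0.025, 0.975)))
```

The spread of `p_sim` reflects estimation uncertainty in the predicted probability, which is harder to obtain analytically for non-linear models.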

## References

1. Nathaniel Beck, "Making regression and related output more helpful to users", The Political Methodologist, 2010. http://politics.as.nyu.edu/docs/IO/2576/beck_tpm_edited.pdf
2. Jeff Gill, `graph.summary()`. http://artsci.wustl.edu/~jgill/Models/graph.summary.s
3. See the example on the UCLA Statistics webpage: http://www.ats.ucla.edu/stat/r/faq/deltamethod.htm
4. Kosuke Imai, Gary King and Olivia Lau (2009). Zelig: Everyone's Statistical Software. R package version 3.4-5. http://CRAN.R-project.org/package=Zelig