# R Programming/Estimation utilities

This page covers tools that are available for most estimation commands and are therefore useful for all kinds of regression models.

## Formulas

Most estimation commands use a formula interface. The outcome is on the left of the `~` and the covariates are on the right.

`y ~ x1 + x2`

It is easy to include a categorical variable as a predictor in a model. If the variable is not already a factor, just wrap it in the `as.factor()` function. By default, R then creates a set of dummy variables, one for each level except the reference level.

`y ~ as.factor(x)`

For instance, we can use the `Star` data in the **Ecdat** package:

```
library("Ecdat")
data(Star)
summary(lm(tmathssk ~ as.factor(classk), data = Star))
```
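To see the dummy variables that `as.factor()` produces, it can help to inspect the design matrix directly. The following is a minimal sketch on simulated data; the data frame `d` and its columns are hypothetical and are not part of the `Star` data.

```r
# Hypothetical toy data: a three-level categorical variable stored as integers
set.seed(1)
d <- data.frame(x = rep(1:3, each = 20))
d$y <- 2 + c(0, 1, 3)[d$x] + rnorm(nrow(d))

# The design matrix shows the dummies R builds: one column per level
# except the first, which serves as the reference category
X <- model.matrix(y ~ as.factor(x), data = d)
colnames(X)

fit_factor  <- lm(y ~ as.factor(x), data = d)  # x treated as categorical
fit_numeric <- lm(y ~ x, data = d)             # x treated as a number

length(coef(fit_factor))   # intercept plus two dummies
length(coef(fit_numeric))  # intercept plus one slope
```

Without `as.factor()`, an integer-coded variable enters the model as a single numeric regressor, which imposes a linear effect across the codes.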

`I()` takes its argument "as is". Inside a formula, operators such as `+` and `^` have a special meaning, so if you want to include a modified variable in your equation, such as a squared term or the sum of two variables, wrap it in `I()`.

```
lm(y ~ x1 + I(x1^2) + x2)
lm(y ~ I(x1 + x2))
lm(I(y-100) ~ I(x1-100) + I(x2 - 100))
```
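The effect of `I()` is easiest to see by comparing the coefficient names with and without it. This is a small sketch on simulated data; the data frame `d` and the object names are made up for illustration.

```r
# Hypothetical simulated data
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- 1 + 2 * d$x1 + 3 * d$x2 + rnorm(100)

# Without I(), "+" is a formula operator: two separate regressors
names_plus <- names(coef(lm(y ~ x1 + x2, data = d)))

# With I(), "+" is ordinary arithmetic: one regressor equal to x1 + x2
names_sum <- names(coef(lm(y ~ I(x1 + x2), data = d)))

# Without I(), "^" is also a formula operator, so x1^2 collapses to x1
# and no squared term is added
names_pow <- names(coef(lm(y ~ x1 + x1^2, data = d)))

names_plus
names_sum
names_pow
```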

It is easy to include interactions between variables by using `:` or `*`. `:` adds only the interaction terms, whereas `*` adds the interaction terms and the individual terms.

```
lm(y~x1:x2) # interaction term only
lm(y~x1*x2) # interaction and individual terms
```
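The difference between the two operators can be checked by looking at the columns of the design matrix. A minimal sketch, assuming simulated data in a hypothetical data frame `d`:

```r
# Hypothetical simulated data
set.seed(1)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d$y <- 1 + d$x1 + d$x2 + 0.5 * d$x1 * d$x2 + rnorm(50)

# ":" keeps only the product term
cols_colon <- colnames(model.matrix(y ~ x1:x2, data = d))

# "*" expands to the individual terms plus the product term
cols_star <- colnames(model.matrix(y ~ x1 * x2, data = d))

# The long form that "*" is shorthand for
cols_long <- colnames(model.matrix(y ~ x1 + x2 + x1:x2, data = d))

cols_colon
cols_star
identical(cols_star, cols_long)
```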

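It is also possible to generate polynomials using the `poly()` function with the option `raw = TRUE`. Without this option, `poly()` returns orthogonal polynomials instead of raw powers of the variable.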

`lm(y ~ poly(x, degree = 3, raw = TRUE))`
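As a quick check of what `raw = TRUE` does, the sketch below compares it with explicit `I()` terms and with the default orthogonal polynomials. The data are simulated and the object names are only illustrative.

```r
# Hypothetical simulated data with a cubic relationship
set.seed(1)
d <- data.frame(x = runif(100, -2, 2))
d$y <- 1 + d$x - 0.5 * d$x^2 + 0.2 * d$x^3 + rnorm(100, sd = 0.1)

fit_raw  <- lm(y ~ poly(x, degree = 3, raw = TRUE), data = d)
fit_I    <- lm(y ~ x + I(x^2) + I(x^3), data = d)
fit_orth <- lm(y ~ poly(x, degree = 3), data = d)  # orthogonal polynomials

# Raw polynomials give the same coefficients as explicit I() terms
same_coef <- all.equal(unname(coef(fit_raw)), unname(coef(fit_I)))

# Orthogonal polynomials change the coefficients but not the fitted values
same_fit <- all.equal(fitted(fit_raw), fitted(fit_orth))

same_coef
same_fit
```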

There is also an advanced formula interface which is useful for instrumental variables models and mixed models. For instance, `ivreg()` (**AER**) uses this advanced formula interface. The instrumental variables are entered after the `|`. See the Instrumental Variables section if you want to learn more.

```
library("AER")
ivreg(y ~ x | z)
```
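To make the roles of `x` and `z` in `y ~ x | z` concrete, here is a hedged sketch that computes the two-stage least squares point estimate by hand in base R on simulated data. The data-generating process is an assumption made up for this illustration, and the standard errors reported by the second-stage `lm()` would not be valid, which is one reason to use `ivreg()` in practice.

```r
# Hypothetical simulated data: e is an unobserved confounder,
# z is an instrument that affects y only through x
set.seed(1)
n <- 5000
z <- rnorm(n)
e <- rnorm(n)
x <- z + e + rnorm(n)
y <- 1 + 2 * x + 3 * e + rnorm(n)   # the true effect of x is 2

# OLS is biased because x is correlated with the omitted e
b_ols <- unname(coef(lm(y ~ x))["x"])

# Two-stage least squares by hand:
# stage 1 projects x on the instrument, stage 2 regresses y on the projection
x_hat <- fitted(lm(x ~ z))
b_iv  <- unname(coef(lm(y ~ x_hat))["x_hat"])

c(ols = b_ols, iv = b_iv)

# If AER is installed, ivreg() should give the same point estimate
if (requireNamespace("AER", quietly = TRUE)) {
  print(coef(AER::ivreg(y ~ x | z)))
}
```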

## Output

In addition to the `summary()` and `print()` functions, which display the output for most estimation commands, some authors have developed simplified output functions. One of them is the `display()` function in the **arm** package. Another is the `coefplot()` function, also in the **arm** package, which plots the coefficients with their confidence intervals. Following the standards proposed by Nathaniel Beck^{[1]}, Jeff Gill developed `graph.summary()`^{[2]}. This command leaves out auxiliary statistics that are of little use to the reader.

R code:

```
source("http://artsci.wustl.edu/~jgill/Models/graph.summary.R")
N <- 1000
u <- rnorm(N)
x1 <- 1 + rnorm(N)
x2 <- 1 + rnorm(N) + x1
y <- 1 + x1 + x2 + u
graph.summary(lm(y ~ x1 + x2))
```

Output:

```
Family: gaussian
Link function: identity
            Coef   Std.Err.  0.95 Lower  0.95 Upper  CIs:ZE+RO
(Intercept) 0.980  0.056     0.871       1.089       |o|
x1          1.040  0.043     0.955       1.125       |o|
x2          0.984  0.031     0.923       1.045       |o|
N: 1000 Estimate of Sigma: 0.998
```

R code:

```
library("arm")
display(lm(y ~ x1 + x2))
```

Output:

```
lm(formula = y ~ x1 + x2)
            coef.est coef.se
(Intercept) 0.89     0.05
x1          1.05     0.04
x2          1.02     0.03
---
n = 1000, k = 3
residual sd = 0.96, R-Squared = 0.86
```
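If neither of these functions is available, a compact table in the same spirit (coefficients, standard errors and confidence bounds only) can be assembled from base R. This is only a sketch of the idea, not a reimplementation of `graph.summary()` or `display()`; the object names `fit` and `tab` are arbitrary.

```r
# Same simulated setup as in the examples above, with a fixed seed
set.seed(1)
N  <- 1000
x1 <- 1 + rnorm(N)
x2 <- 1 + rnorm(N) + x1
y  <- 1 + x1 + x2 + rnorm(N)
fit <- lm(y ~ x1 + x2)

# One row per coefficient: estimate, standard error, 95% confidence bounds
tab <- cbind(
  Coef    = coef(fit),
  Std.Err = sqrt(diag(vcov(fit))),
  confint(fit, level = 0.95)
)
round(tab, 3)
```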

## Weights

## Tests

## Confidence intervals

## Delta Method

If you want to know the standard error of a transformation of one of your parameters, you need to use the **delta method**. Several packages implement it:

- `deltamethod()` in the **msm** package^{[3]}.
- `delta.method()` in the **alr3** package.
- `deltaMethod()` in the **car** package.
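As an illustration of what these functions compute, the sketch below applies the delta method by hand to the ratio of two regression coefficients, using only base R. The simulated data and the choice of the ratio as the transformation are assumptions made for this example. The optional comparison at the end runs only if **msm** is installed; in `deltamethod()`, `x2` and `x3` refer to the second and third elements of the coefficient vector, not to the regressors.

```r
# Hypothetical simulated data
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 + 4 * x2 + rnorm(n)
fit <- lm(y ~ x1 + x2)

b <- unname(coef(fit))   # (Intercept), x1, x2
V <- vcov(fit)

# Transformation of interest: g(b) = b_x1 / b_x2
g_hat <- b[2] / b[3]

# Gradient of g with respect to (Intercept, x1, x2)
grad <- c(0, 1 / b[3], -b[2] / b[3]^2)

# Delta method: Var(g) is approximately grad' V grad
se_delta <- sqrt(drop(t(grad) %*% V %*% grad))

c(estimate = g_hat, se = se_delta)

# Optional cross-check with msm, if it is installed
if (requireNamespace("msm", quietly = TRUE)) {
  print(msm::deltamethod(~ x2 / x3, b, V))
}
```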

## Zelig: the pseudo-bootstrap method

**Zelig**^{[4]} is a postestimation package which simulates from the distribution of the estimated parameters and computes quantities of interest such as marginal effects or predicted probabilities. This is especially useful for non-linear models. **Zelig** comes with a set of *vignettes* which explain how to deal with each kind of model.

There are three commands:

- `zelig()` estimates the model and draws from the distribution of estimated parameters.
- `setx()` fixes the values of explanatory variables.
- `sim()` computes the quantities of interest.
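The three steps can be mimicked in base R to show the idea behind the simulation approach. The sketch below is not Zelig's implementation: it assumes a logit model on simulated data, approximates the distribution of the estimates by a multivariate normal centred on the coefficients, and uses made-up object names throughout.

```r
# Hypothetical simulated data for a logit model
set.seed(1)
n <- 1000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + x))

# Step 1 (the role of zelig()): estimate the model and draw parameters
# from a normal approximation to their sampling distribution
fit <- glm(y ~ x, family = binomial)
b <- coef(fit)
V <- vcov(fit)
n_sims <- 2000
draws <- matrix(rnorm(n_sims * length(b)), nrow = n_sims) %*% chol(V)
draws <- sweep(draws, 2, b, "+")   # shift each column by its estimate

# Step 2 (the role of setx()): fix the explanatory variable at x = 1
x_fixed <- c(1, 1)                 # intercept and x

# Step 3 (the role of sim()): compute the quantity of interest per draw,
# here the predicted probability at the chosen x
p_sims <- plogis(drop(draws %*% x_fixed))

mean(p_sims)
quantile(p_sims, c(0.025, 0.975))
```

Because the quantity of interest is computed draw by draw, the same simulated parameters can be reused for other quantities, such as the difference in predicted probabilities between two values of `x`.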

## References

1. Nathaniel Beck, "Making regression and related output more helpful to users", The Political Methodologist, 2010. http://politics.as.nyu.edu/docs/IO/2576/beck_tpm_edited.pdf
2. Jeff Gill, `graph.summary()`. http://artsci.wustl.edu/~jgill/Models/graph.summary.s
3. See the example on the UCLA Statistics webpage: http://www.ats.ucla.edu/stat/r/faq/deltamethod.htm
4. Kosuke Imai, Gary King and Olivia Lau (2009). Zelig: Everyone's Statistical Software. R package version 3.4-5. http://CRAN.R-project.org/package=Zelig