R Programming/Multinomial Models

From Wikibooks, open books for an open world

Multinomial Logit

  • mlogit package.
  • multinom() in the nnet package.
  • vglm() with the multinomial family in the VGAM package.
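
As a minimal sketch of the multinom() route, the example below simulates three unordered alternatives and fits a multinomial logit with nnet. The seed, the sample size, the coefficients and the object names (x_mnl, d_mnl, fit_mnl) are assumptions chosen for illustration only.

```r
library(nnet)  # recommended package, shipped with most R installations

set.seed(1)                       # hypothetical seed, for reproducibility only
n_mnl <- 2000
x_mnl <- rnorm(n_mnl)

# Utilities of three unordered alternatives; "a" is the reference category
# with utility fixed at 0 (illustrative coefficients, not from any real data).
eta <- cbind(a = 0, b = 0.5 + 1.0 * x_mnl, c = -0.5 - 1.0 * x_mnl)
p   <- exp(eta) / rowSums(exp(eta))   # multinomial logit probabilities

# Draw one alternative per observation from its own probability vector.
choice <- apply(p, 1, function(pr) sample(colnames(p), 1, prob = pr))
d_mnl  <- data.frame(choice = factor(choice, levels = c("a", "b", "c")),
                     x = x_mnl)

fit_mnl <- multinom(choice ~ x, data = d_mnl, trace = FALSE)
coef(fit_mnl)   # one row per non-reference alternative (b and c)
```

The coefficients are expressed relative to the first factor level, so the rows for b and c should land near the simulated values.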


Conditional Logit

  • clogit() in the survival package
  • mclogit package.
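
A hedged sketch of the clogit() approach: the data are in long format, one row per alternative, and strata() identifies the choice set. The variable names (set, price, chosen), the seed and the coefficient of -1 are assumptions made up for this illustration.

```r
library(survival)

set.seed(2)                        # hypothetical seed
n_sets <- 500                      # number of choice sets
n_alt  <- 3                        # alternatives per choice set

long <- data.frame(
  set   = rep(seq_len(n_sets), each = n_alt),
  price = rnorm(n_sets * n_alt)    # alternative-specific attribute
)

beta <- -1                         # assumed true coefficient
# Adding Gumbel errors, -log(-log(U)), to the utility yields logit choices.
long$util   <- beta * long$price - log(-log(runif(nrow(long))))
# Flag the utility-maximising alternative within each choice set.
long$chosen <- as.integer(ave(long$util, long$set,
                              FUN = function(u) u == max(u)))

fit_cl <- clogit(chosen ~ price + strata(set), data = long)
coef(fit_cl)   # estimate of the price coefficient
```

Only attributes that vary across alternatives within a choice set can be estimated this way, since anything constant within a stratum cancels out of the conditional likelihood.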


Multinomial Probit

  • mprobit package [1]
  • MNP package to fit a multinomial probit.


Multinomial ordered logit model

We consider a multinomial ordered logit model with unknown thresholds. First, we simulate fake data. We draw the residuals u from a logistic distribution. Then we draw an explanatory variable x and define the latent variable ys as a linear function of x plus the residual. Note that we set the constant to 0: the constant and the thresholds cannot be identified simultaneously in this model, so one of these parameters has to be fixed. Finally, we choose the thresholds (-1, 0, 1) and build the observed variable y with the cut() function, so y is an ordered multinomial variable with 4 categories.

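N <- 10000
u <- rlogis(N)                 # logistic residuals
x <- rnorm(N)                  # explanatory variable
ys <- x + u                    # latent variable, constant fixed at 0
mu <- c(-Inf, -1, 0, 1, Inf)   # thresholds, padded with -Inf and Inf
y <- cut(ys, mu)               # observed category, returned as a factor
plot(y, ys)                    # box plot of the latent variable by category
df <- data.frame(y, x)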


Maximum likelihood estimation

This model can be estimated by maximum likelihood using the polr() function in the MASS package. Because the constant and the thresholds cannot be identified separately, polr() fits the model without an intercept, which amounts to fixing the constant at 0.

library(MASS)
fit <- polr(y ~ x, method = "logistic", data = df, Hess = TRUE)  # Hess = TRUE keeps the Hessian that summary() needs
summary(fit)
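
Since the data are simulated, the estimates can be checked against the values used to generate them. The sketch below repeats the simulation in compact form so that it runs on its own; the seed and the object names (n_obs, x_sim, y_sim, fit_ol) are assumptions introduced here.

```r
library(MASS)

set.seed(123)                      # hypothetical seed so the run is reproducible
n_obs  <- 10000
x_sim  <- rnorm(n_obs)
# Latent variable x + logistic error, cut at the thresholds -1, 0, 1.
y_sim  <- cut(x_sim + rlogis(n_obs), c(-Inf, -1, 0, 1, Inf))

fit_ol <- polr(y_sim ~ x_sim, method = "logistic")

coef(fit_ol)    # slope estimate; the simulated value is 1
fit_ol$zeta     # threshold estimates; the simulated values are -1, 0, 1
```

polr() models the cumulative probability of category k as a function of zeta_k minus the linear predictor, which is why the zeta component lines up directly with the thresholds passed to cut().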


Bayesian estimation

  • bayespolr() in the arm package performs a Bayesian estimation of the multinomial ordered logit:

library("arm")
fit <- bayespolr(y ~ x, method = "logistic", data = df)
summary(fit)

Multinomial ordered probit model

We generate fake data by drawing the error term from a standard normal distribution and cutting the latent variable into 4 categories.

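N <- 1000
u <- rnorm(N)                  # standard normal residuals
x <- rnorm(N)                  # explanatory variable
ys <- x + u                    # latent variable, constant fixed at 0
mu <- c(-Inf, -1, 0, 1, Inf)   # thresholds, padded with -Inf and Inf
y <- cut(ys, mu)               # observed category, returned as a factor
plot(y, ys)                    # box plot of the latent variable by category
df <- data.frame(x, y)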


Maximum likelihood estimation

The model can be fitted by maximum likelihood using the polr() function in the MASS package with method = "probit".

library(MASS)
fit <- polr(y ~ x, method = "probit", data = df, Hess = TRUE)  # Hess = TRUE keeps the Hessian that summary() needs
summary(fit)
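
Beyond the coefficient table, a fitted ordered probit can return the predicted probability of each category for chosen values of x. The following is a minimal self-contained sketch; the seed and the names d_op, fit_op, new_x and probs are assumptions made for this example.

```r
library(MASS)

set.seed(42)                       # hypothetical seed
n_obs  <- 1000
d_op   <- data.frame(x = rnorm(n_obs))
# Latent variable x + normal error, cut at the thresholds -1, 0, 1.
d_op$y <- cut(d_op$x + rnorm(n_obs), c(-Inf, -1, 0, 1, Inf))

fit_op <- polr(y ~ x, method = "probit", data = d_op)

# Predicted category probabilities at three illustrative values of x.
new_x <- data.frame(x = c(-1, 0, 1))
probs <- predict(fit_op, newdata = new_x, type = "probs")
round(probs, 3)
rowSums(probs)   # each row sums to 1
```

Each row of probs corresponds to one row of new_x and each column to one category of y, so the probability mass should shift toward the higher categories as x increases.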


Bayesian estimation

  • bayespolr() in the arm package performs a Bayesian estimation of the multinomial ordered probit when called with method = "probit".


Rank Ordered Logit Model

This model was introduced in econometrics by Beggs, Cardell and Hausman in 1981[2][3]. One application is the paper by Combes et al., which explains the ranking of candidates for professorships[3]. It is also known as the Plackett–Luce model in the biomedical literature and as the exploded logit model in marketing[3].
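
The name "exploded logit" reflects how the probability of a complete ranking is built up: the top-ranked item is chosen from the full set with a multinomial logit probability, then removed, and the process repeats on the remaining items. The function below is a small sketch of that calculation in base R; rank_prob, v and ranking are hypothetical names, and it is not taken from any package.

```r
# Probability of a complete ranking under the rank-ordered (exploded) logit.
#   v       : systematic utilities of the items (hypothetical input)
#   ranking : item indices ordered from best to worst (hypothetical input)
rank_prob <- function(v, ranking) {
  v_ord <- v[ranking]              # utilities in ranked order
  n     <- length(v_ord)
  p     <- 1
  # The last item is chosen with probability 1, so the loop stops at n - 1.
  for (j in seq_len(n - 1)) {
    # Logit probability of picking item j among the items not yet ranked.
    p <- p * exp(v_ord[j]) / sum(exp(v_ord[j:n]))
  }
  p
}

v_items <- c(1, 0, -1)             # illustrative utilities for three items
rank_prob(v_items, c(1, 2, 3))     # ranking that follows the utilities
rank_prob(v_items, c(3, 2, 1))     # reversed ranking, less likely
```

Summing rank_prob() over all possible rankings of the items gives 1, which is a quick sanity check on the construction.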

Conditionally Ordered Hierarchical Probit

  • The Conditionally Ordered Hierarchical Probit can be estimated using the anchors package developed by Gary King and his coauthors[4].

References

  1. Harry Joe, Laing Wei Chou and Hongbin Zhang (2006). mprobit: Multivariate probit model for binary/ordinal response. R package version 0.9-2.
  2. Beggs, S., Cardell, S., Hausman, J., 1981. Assessing the potential demand for electric cars. Journal of Econometrics 17 (1), 1–19 (September).
  3. Pierre-Philippe Combes, Laurent Linnemer, Michael Visser (2008). Publish or peer-rich? The role of skills and networks in hiring economics professors. Labour Economics 15 (3), 423–441 (June). ISSN 0927-5371, doi 10.1016/j.labeco.2007.04.003. (http://www.sciencedirect.com/science/article/pii/S0927537107000413)
  4. Jonathan Wand, Gary King, Olivia Lau (2009). anchors: Software for Anchoring Vignette Data. Journal of Statistical Software, Forthcoming. URL http://www.jstatsoft.org/.