R Programming/Multinomial Models

From Wikibooks, open books for an open world

Multinomial Logit

  • mlogit package
  • mnlogit package
  • bayesm package
  • multinom() in the nnet package
  • multinomial() family, which is used by vglm() in the VGAM package

Conditional Logit

  • clogit() in the survival package
  • mclogit package


Multinomial Probit

  • mprobit package [1]
  • MNP package to fit a multinomial probit.


Multinomial ordered logit model

We consider a multinomial ordered logit model with unknown thresholds. First, we simulate fake data: we draw the residuals u from a logistic distribution, draw an explanatory variable x, and define the latent variable ys as a linear function of x plus u. We set the constant to 0 because the constant and the thresholds cannot be identified simultaneously in this model: adding the same amount to the constant and to every threshold leaves the distribution of y unchanged, so one of these parameters must be fixed. Then we define the thresholds (-1, 0, 1) and build the observed variable y with the cut() function, so y is an ordered multinomial variable with four categories.

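N <- 10000
u <- rlogis(N)                  # logistic residuals
x <- rnorm(N)                   # explanatory variable
ys <- x + u                     # latent variable (constant fixed at 0, slope 1)
mu <- c(-Inf, -1, 0, 1, Inf)    # thresholds, padded with infinite end points
y <- cut(ys, mu)                # observed category (a factor with 4 levels)
plot(y, ys)                     # box plots of the latent variable by category
df <- data.frame(y, x)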
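The simulation above implies a closed form for the category probabilities: P(y = k | x) = F(mu_k - x) - F(mu_{k-1} - x), where F is the logistic distribution function. A minimal sketch of that formula follows; category_probs() is a hypothetical helper written for illustration, not a function from any package.

```r
# Thresholds used in the simulation, including the infinite end points
mu <- c(-Inf, -1, 0, 1, Inf)

# Hypothetical helper: category probabilities of the ordered logit at a given x,
# computed as differences of the logistic CDF evaluated at (threshold - x)
category_probs <- function(x, thresholds = mu) {
  diff(plogis(thresholds - x))
}

p0 <- category_probs(0)   # probabilities of the four categories at x = 0
p1 <- category_probs(1)   # a larger x shifts mass towards the upper categories
rbind(p0, p1)
```

Each row sums to one, and comparing the two rows shows how a larger x moves probability from the lowest category to the highest one.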


Maximum likelihood estimation

This model can be estimated by maximum likelihood using the polr() function in the MASS package. Because the constant and the thresholds cannot be identified separately, polr() does not estimate a constant (it is fixed at 0) and reports the estimated thresholds in the "Intercepts" part of the summary.

library(MASS)
fit <- polr(y ~ x, method = "logistic", data = df)
summary(fit)   # coefficient on x and the three estimated thresholds
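As a rough check on the fit, the estimates can be compared with the values used in the simulation. The sketch below re-creates the data so that it runs on its own. The seed is an arbitrary choice made here for reproducibility, and Hess = TRUE is added only so that the Hessian is stored with the fit.

```r
library(MASS)

set.seed(1)                       # arbitrary seed (an assumption, not in the original)
N <- 10000
x <- rnorm(N)
ys <- x + rlogis(N)               # latent variable with slope 1 and no constant
y <- cut(ys, c(-Inf, -1, 0, 1, Inf))
df <- data.frame(y, x)

fit <- polr(y ~ x, method = "logistic", data = df, Hess = TRUE)

coef(fit)    # estimated slope on x; the simulated value is 1
fit$zeta     # estimated thresholds; the simulated values are -1, 0 and 1

# Predicted probabilities of the four categories at x = 0
probs <- predict(fit, newdata = data.frame(x = 0), type = "probs")
probs
```

With this sample size the estimates should land close to the simulated values, though not exactly on them.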


Bayesian estimation

  • bayespolr() in the arm package performs Bayesian estimation of the multinomial ordered logit model.

library("arm")
fit <- bayespolr(y ~ x, method = "logistic", data = df)
summary(fit)

Multinomial ordered probit model

We generate fake data by drawing the error term from a normal distribution and cutting the latent variable into 4 categories.

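N <- 1000
u <- rnorm(N)                   # standard normal residuals
x <- rnorm(N)                   # explanatory variable
ys <- x + u                     # latent variable (constant fixed at 0, slope 1)
mu <- c(-Inf, -1, 0, 1, Inf)    # thresholds, padded with infinite end points
y <- cut(ys, mu)                # observed category (a factor with 4 levels)
plot(y, ys)                     # box plots of the latent variable by category
df <- data.frame(x, y)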
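One way to sanity-check the simulated data is to compare the observed category shares with their theoretical values. Since x and u are independent standard normal draws, the latent variable ys has variance 2, so the marginal share of each category is a difference of normal distribution functions with standard deviation sqrt(2). The following is a sketch under those assumptions, with an arbitrary seed.

```r
set.seed(42)                      # arbitrary seed (an assumption, not in the original)
N <- 1000
ys <- rnorm(N) + rnorm(N)         # x + u, so ys is normal with variance 2
mu <- c(-Inf, -1, 0, 1, Inf)
y <- cut(ys, mu)

observed <- as.numeric(prop.table(table(y)))   # empirical share of each category
expected <- diff(pnorm(mu, sd = sqrt(2)))      # theoretical marginal shares

round(rbind(observed, expected), 3)
```

The two rows should agree up to sampling noise, which shrinks as N grows.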


Maximum likelihood estimation

The model can be fitted by maximum likelihood using the polr() function in the MASS package with method = "probit".

library(MASS)
fit <- polr(y ~ x, method = "probit", data = df)
summary(fit)   # coefficient on x and the three estimated thresholds
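For inference it can be convenient to pull the coefficient table out of the summary as a matrix. The sketch below re-simulates the probit data with an arbitrary seed and builds an approximate 95% interval for the slope from the reported standard error. The normal approximation used for the interval is an assumption of this example.

```r
library(MASS)

set.seed(123)                     # arbitrary seed (an assumption, not in the original)
N <- 1000
x <- rnorm(N)
y <- cut(x + rnorm(N), c(-Inf, -1, 0, 1, Inf))
df <- data.frame(x, y)

# Hess = TRUE stores the Hessian so that summary() does not need to refit
fit <- polr(y ~ x, method = "probit", data = df, Hess = TRUE)

# Matrix with one row for the slope and one per threshold
ctable <- coef(summary(fit))
ctable

# Approximate 95% confidence interval for the slope on x
est <- ctable["x", "Value"]
se  <- ctable["x", "Std. Error"]
ci  <- c(lower = est - 1.96 * se, upper = est + 1.96 * se)
ci
```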


Bayesian estimation

  • bayespolr() in the arm package performs Bayesian estimation of the multinomial ordered probit model when called with method = "probit".


Rank Ordered Logit Model

This model was introduced in econometrics by Beggs, Cardell and Hausman in 1981.[2][3] One application is the paper by Combes et al., which explains the ranking of candidates for professorships.[3] It is also known as the Plackett–Luce model in the biomedical literature and as the exploded logit model in marketing.[3]
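The name "exploded logit" reflects how the likelihood is built: a complete ranking is treated as a sequence of multinomial logit choices, in which the top item is chosen from all alternatives, the second from those remaining, and so on. A minimal sketch of the resulting ranking probability follows; ranking_prob() is a hypothetical helper written for illustration and the utilities are made-up numbers.

```r
# Hypothetical helper: probability of a complete ranking under the
# rank-ordered (exploded) logit. `v` holds the utilities of the items,
# listed in ranking order (best first).
ranking_prob <- function(v) {
  ev <- exp(v)
  # Denominator at each stage: sum of exp(utility) over items not yet ranked
  remaining <- rev(cumsum(rev(ev)))
  prod(ev / remaining)
}

# With equal utilities every ordering of three items is equally likely
p_equal <- ranking_prob(c(0, 0, 0))

# Made-up utilities: ranking the items in decreasing order of utility
# is more likely than ranking them in increasing order
p_best_first  <- ranking_prob(c(1, 0.5, -0.2))
p_worst_first <- ranking_prob(c(-0.2, 0.5, 1))
c(p_equal, p_best_first, p_worst_first)
```

The last stage always contributes a factor of one, since only a single item remains to be ranked.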

Conditionally Ordered Hierarchical Probit

  • The Conditionally Ordered Hierarchical Probit can be estimated using the anchors package developed by Gary King and his coauthors.[4]

References

  1. Harry Joe, Laing Wei Chou and Hongbin Zhang (2006). mprobit: Multivariate probit model for binary/ordinal response. R package version 0.9-2.
  2. Beggs, S; Cardell, S; Hausman, J (1981). "Assessing the potential demand for electric cars". Journal of Econometrics. 17: 1–19. doi:10.1016/0304-4076(81)90056-7.
  3. Combes, Pierre-Philippe; Linnemer, Laurent; Visser, Michael (2008). "Publish or peer-rich? The role of skills and networks in hiring economics professors". Labour Economics. 15 (3): 423–41. doi:10.1016/j.labeco.2007.04.003.
  4. Jonathan Wand, Gary King, Olivia Lau (2009). anchors: Software for Anchoring Vignette Data. Journal of Statistical Software, Forthcoming. URL http://www.jstatsoft.org/.