R Programming/Multinomial Models
- 1 Multinomial Logit
- 2 Conditional Logit
- 3 Multinomial Probit
- 4 Multinomial ordered logit model
- 5 Multinomial ordered probit model
- 6 Rank Ordered Logit Model
- 7 Conditionally Ordered Hierarchical Probit
- 8 References
Multinomial Logit
- mlogit package
- mnlogit package
- bayesm package
- multinom() in the nnet package
- multinomial(), the family function used with vglm() in the VGAM package
Conditional Logit
- clogit() in the survival package
- mclogit package
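To illustrate the multinomial logit entries in the list above, here is a minimal sketch using multinom() from nnet. The simulated data, the seed, the coefficient values and the names eta, p and fit_mnl are assumptions made for this example only; alternative "a" serves as the baseline category.

```r
library(nnet)

set.seed(1)
N <- 2000
x <- rnorm(N)

# Linear indices for three alternatives; "a" is the baseline (index 0)
eta <- cbind(a = 0, b = 0.5 + x, c = -0.5 - x)

# Multinomial logit choice probabilities (each row sums to 1)
p <- exp(eta) / rowSums(exp(eta))

# Draw one alternative per observation according to its probabilities
y <- factor(apply(p, 1, function(pr) sample(colnames(p), 1, prob = pr)))

fit_mnl <- multinom(y ~ x, data = data.frame(y, x), trace = FALSE)

# One row per non-baseline alternative, one column per coefficient
coef(fit_mnl)
```

The estimated slopes for "b" and "c" should be close to the simulated values 1 and -1, up to sampling error.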
Multinomial ordered logit model
We consider a multinomial ordered logit model with unknown thresholds.
First, we simulate fake data. We draw the residuals from a logistic distribution. Then we draw an explanatory variable x and define the latent variable ys as a linear function of x. Note that we set the constant to 0 because the constant and the thresholds cannot be identified simultaneously in this model, so one of these parameters has to be fixed. Then we define the thresholds (-1, 0, 1) and construct the observed variable y with the cut() function, so y is an ordered multinomial variable.
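N <- 10000
u <- rlogis(N)                # logistic residuals
x <- rnorm(N)                 # explanatory variable
ys <- x + u                   # latent variable, constant set to 0
mu <- c(-Inf, -1, 0, 1, Inf)  # thresholds
y <- cut(ys, mu)              # observed category
plot(y, ys)
df <- data.frame(y, x)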
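As a sanity check on the simulation, the category shares implied by the model can be compared with the simulated ones. The sketch below assumes the same design as above (slope 1, thresholds -1, 0, 1); the seed and the names cum, p_theory and p_sample are illustrative. It uses the fact that P(y = k | x) is the difference of two logistic distribution function values, F(mu_k - x) - F(mu_{k-1} - x).

```r
set.seed(1)
N <- 10000
x <- rnorm(N)
ys <- x + rlogis(N)
mu <- c(-Inf, -1, 0, 1, Inf)
y <- cut(ys, mu)

# Column j holds P(ys <= mu[j] | x); the first column is 0, the last is 1
cum <- sapply(mu, function(m) plogis(m - x))

# Differences of adjacent columns give the category probabilities,
# averaged here over the simulated x values
p_theory <- colMeans(cum[, -1] - cum[, -5])

# Observed share of each category in the simulated sample
p_sample <- as.numeric(prop.table(table(y)))

round(rbind(p_theory, p_sample), 3)
```

The two rows should agree up to simulation noise.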
Maximum likelihood estimation
This model can be estimated by maximum likelihood using the polr() function in the MASS package. Because the constant and the thresholds cannot be identified separately, polr() fits the model without a constant, which amounts to fixing it at 0.
library(MASS)
fit <- polr(y ~ x, method = "logistic", data = df)
summary(fit)
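The following sketch shows one way to inspect the fitted object and check that the simulated parameters are recovered. It regenerates the data so that it runs on its own; the seed and the names fit_logit and probs are assumptions for the example. Hess = TRUE asks polr() to keep the Hessian so that summary() does not need to refit the model.

```r
library(MASS)

set.seed(42)
N <- 10000
x <- rnorm(N)
y <- cut(x + rlogis(N), c(-Inf, -1, 0, 1, Inf))
df <- data.frame(y, x)

fit_logit <- polr(y ~ x, method = "logistic", data = df, Hess = TRUE)

coef(fit_logit)   # slope on x (simulated value: 1)
fit_logit$zeta    # estimated thresholds (simulated values: -1, 0, 1)

# Predicted probabilities of the four categories at x = 0
probs <- predict(fit_logit, newdata = data.frame(x = 0), type = "probs")
probs
```

The thresholds are stored in the zeta component, separately from the regression coefficients, which reflects the fact that the model has no constant.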
Bayesian estimation
- bayespolr() in the arm package performs Bayesian estimation of the multinomial ordered logit model.
library("arm")
fit <- bayespolr(y ~ x, method = "logistic", data = df)
summary(fit)
Multinomial ordered probit model
We generate fake data by drawing the error term from a normal distribution and cutting the latent variable into 4 categories.
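N <- 1000
u <- rnorm(N)                 # normal residuals
x <- rnorm(N)                 # explanatory variable
ys <- x + u                   # latent variable, constant set to 0
mu <- c(-Inf, -1, 0, 1, Inf)  # thresholds
y <- cut(ys, mu)              # observed category
plot(y, ys)
df <- data.frame(x, y)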
Maximum likelihood estimation
The model can be fitted by maximum likelihood. This can be done using the polr() function in the MASS package with the method = "probit" argument.
library(MASS)
fit <- polr(y ~ x, method = "probit", data = df)
summary(fit)
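To make the link between the estimates and the category probabilities explicit, the sketch below rebuilds the predicted probabilities by hand from the estimated thresholds and slope, using the normal distribution function, and compares them with predict(). The seed, the evaluation point x0 and the names fit_probit, cuts, p_manual and p_predict are assumptions for the example.

```r
library(MASS)

set.seed(7)
N <- 1000
x <- rnorm(N)
y <- cut(x + rnorm(N), c(-Inf, -1, 0, 1, Inf))
df <- data.frame(x, y)

fit_probit <- polr(y ~ x, method = "probit", data = df)

x0 <- 0.5                             # point at which to evaluate
cuts <- c(-Inf, fit_probit$zeta, Inf) # estimated thresholds with end points

# P(y = k | x0) = pnorm(zeta_k - b * x0) - pnorm(zeta_{k-1} - b * x0)
p_manual <- diff(pnorm(cuts - coef(fit_probit) * x0))

p_predict <- predict(fit_probit, newdata = data.frame(x = x0), type = "probs")

rbind(manual = unname(p_manual), predict = unname(p_predict))
```

Both rows should coincide, since predict() applies the same formula internally.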
Bayesian estimation
- bayespolr() in the arm package performs Bayesian estimation of the multinomial ordered probit model when called with method = "probit".
Rank Ordered Logit Model
This model was introduced in econometrics by Beggs, Cardell and Hausman (1981). One application is the paper by Combes et al. (2008), which explains the ranking of candidates for professorships. It is also known as the Plackett–Luce model in the biomedical literature and as the exploded logit model in marketing.
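The name "exploded logit" comes from the way the probability of a full ranking is written as a product of successive logit choices: the top-ranked alternative is chosen among all alternatives, the second among the remaining ones, and so on. The following base R sketch simulates rankings and maximizes this likelihood directly. It is not taken from the papers cited above: the design (one attribute per alternative, a single coefficient beta, Gumbel errors), the seed and all names are assumptions for the illustration.

```r
set.seed(123)
n <- 500      # number of individuals
J <- 4        # number of alternatives ranked by each individual
beta <- 1     # simulated coefficient

# One attribute per alternative and individual
x <- matrix(rnorm(n * J), n, J)

# Latent utilities with standard Gumbel errors, -log(-log(U))
u <- beta * x - log(-log(matrix(runif(n * J), n, J)))

# rk[i, ] lists the alternatives of individual i from best to worst
rk <- t(apply(u, 1, order, decreasing = TRUE))

# Exploded logit log-likelihood: a sequence of logit choices,
# each made among the alternatives not yet ranked
loglik <- function(b) {
  ll <- 0
  for (i in seq_len(n)) {
    v <- b * x[i, rk[i, ]]   # indices in ranked order
    for (j in seq_len(J - 1)) {
      ll <- ll + v[j] - log(sum(exp(v[j:J])))
    }
  }
  ll
}

# One-dimensional maximization over the coefficient
est <- optimize(loglik, interval = c(-5, 5), maximum = TRUE)
est$maximum
```

The estimate should be close to the simulated value of 1. With several coefficients, optim() could replace optimize() in the same way.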
Conditionally Ordered Hierarchical Probit
- The Conditionally Ordered Hierarchical Probit can be estimated using the anchors package developed by Gary King and his coauthors.
References
- Harry Joe, Laing Wei Chou and Hongbin Zhang (2006). mprobit: Multivariate probit model for binary/ordinal response. R package version 0.9-2.
- Beggs, S; Cardell, S; Hausman, J (1981). "Assessing the potential demand for electric cars". Journal of Econometrics 17: 1–19. doi:10.1016/0304-4076(81)90056-7.
- Combes, Pierre-Philippe; Linnemer, Laurent; Visser, Michael (2008). "Publish or peer-rich? The role of skills and networks in hiring economics professors". Labour Economics 15 (3): 423–41. doi:10.1016/j.labeco.2007.04.003.
- Jonathan Wand, Gary King, Olivia Lau (2009). anchors: Software for Anchoring Vignette Data. Journal of Statistical Software, Forthcoming. URL http://www.jstatsoft.org/.