# R Programming/Maximum Likelihood

## Introduction

Maximum likelihood estimation is an optimization problem: you write down the log likelihood function and maximize it with a numerical optimization technique. Some optimization methods also require the score (the first derivative of the log likelihood) and/or the Hessian (the second derivative of the log likelihood).
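
As a minimal sketch of these three ingredients, the following example fits the rate of an exponential distribution with `optim()`, supplying the score analytically and asking for the Hessian at the optimum. The exponential model, the simulated data and the names `nll` and `nscore` are assumptions chosen for illustration. Because `optim()` minimizes by default, both functions are written for the negative log likelihood.

```r
set.seed(1)
x <- rexp(500, rate = 2)   # simulated data, true rate = 2 (illustrative)

# Negative log likelihood of an exponential sample:
# -(n * log(rate) - rate * sum(x))
nll <- function(rate, x) -(length(x) * log(rate) - rate * sum(x))

# Negative score: first derivative of nll with respect to rate
nscore <- function(rate, x) -(length(x) / rate - sum(x))

# L-BFGS-B lets us keep the rate strictly positive through 'lower'
fit <- optim(par = 1, fn = nll, gr = nscore, x = x,
             method = "L-BFGS-B", lower = 1e-8, hessian = TRUE)

fit$par        # numerical maximum likelihood estimate
1 / mean(x)    # closed-form estimate, for comparison
fit$hessian    # numerical Hessian of nll; analytically n / rate^2
```

The closed-form estimate is only available because the exponential model is simple. It is used here purely as a check on the numerical result.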

## One dimension

If there is only one parameter, we can optimize the log likelihood using `optimize()`.

### Example with a type 1 Pareto distribution

We provide an example with a type 1 Pareto distribution. Note that in this example we treat the minimum as known and do not estimate it. Therefore this is a one-dimensional problem.

We use the `rpareto1()` function from the actuar package to generate a random vector from a type 1 Pareto distribution with shape 1 and minimum value 500. We write the log likelihood with `dpareto1()` (also from actuar) and the option `log = TRUE`, then call `optimize()` with `maximum = TRUE`. The `interval` argument gives the lower and upper bounds of the search for the parameter.

```
> library(actuar)
> y <- rpareto1(1000, shape = 1, min = 500)
> # log likelihood in the shape parameter mu; the minimum is fixed at its known value
> ll <- function(mu, x) {
+    sum(dpareto1(x, shape = mu[1], min = 500, log = TRUE))
+   }
> optimize(f = ll, x = y, interval = c(0, 10), maximum = TRUE)
```
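
The result of `optimize()` can be checked against the closed-form estimator of the shape, which for a known minimum is n / sum(log(x / min)). The sketch below uses only base R: the sampler (inverse CDF method) and the log density are written by hand as stand-ins for the actuar functions, and the names `xmin` and `closed.form` are illustrative.

```r
set.seed(42)
n     <- 1000
xmin  <- 500    # known minimum
shape <- 1      # true shape used for the simulation

# Inverse CDF draw from a type 1 Pareto distribution
y <- xmin * runif(n)^(-1 / shape)

# Log likelihood in the shape parameter; the log density is
# log(shape) + shape * log(xmin) - (shape + 1) * log(x)
ll <- function(shape, x, xmin) {
  sum(log(shape) + shape * log(xmin) - (shape + 1) * log(x))
}

opt <- optimize(f = ll, x = y, xmin = xmin,
                interval = c(0, 10), maximum = TRUE)

closed.form <- n / sum(log(y / xmin))

opt$maximum    # numerical estimate of the shape
closed.form    # analytical estimate, for comparison
```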

## Multiple dimensions

• `fitdistr()` (MASS package) fits univariate distributions by maximum likelihood. It is a wrapper for `optim()`.
• If you need to program your own maximum likelihood estimator (MLE), use a built-in optimizer such as `nlm()` or `optim()`. R also provides the following tools for maximum likelihood:
• `mle()` in the stats4 package
• The maxLik package
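
As a hedged illustration of `mle()` from stats4, the sketch below fits the mean and standard deviation of a normal sample. The normal model, the simulated data and the log-scale parametrization `lsigma` (used so that the standard deviation stays positive without bounds) are assumptions made for this example. `mle()` expects the negative log likelihood, with one named argument per parameter.

```r
library(stats4)   # ships with R

set.seed(123)
x <- rnorm(1000, mean = 5, sd = 2)   # illustrative data

# Negative log likelihood; the standard deviation is exp(lsigma)
nll <- function(mu = 0, lsigma = 0) {
  -sum(dnorm(x, mean = mu, sd = exp(lsigma), log = TRUE))
}

fit <- mle(nll, start = list(mu = median(x), lsigma = log(mad(x))))

est <- coef(fit)
c(mu = unname(est["mu"]), sigma = unname(exp(est["lsigma"])))
vcov(fit)   # estimated covariance matrix of (mu, lsigma)
```

Note that the covariance matrix refers to `lsigma`, not to the standard deviation itself.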

### Example with a logistic distribution

For instance, we draw from a logistic distribution and estimate its parameters using `fitdistr()`.

```
> library(MASS)
> # draw from a Gumbel distribution using the inverse CDF simulation method
> e.1 <- -log(-log(runif(10000, 0, 1)))
> e.2 <- -log(-log(runif(10000, 0, 1)))
> u <- e.2 - e.1  # u follows a logistic distribution (difference between two Gumbels)
> fitdistr(u, densfun = dlogis, start = list(location = 0, scale = 1))
```
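
For comparison, the same fit can be written directly with `optim()`. This is a sketch of the general idea, not a reproduction of what `fitdistr()` does internally. The log-scale parametrization of the scale parameter and the name `nll` are choices made for this example.

```r
set.seed(2024)
# Difference of two Gumbel draws, which follows a logistic distribution
e.1 <- -log(-log(runif(10000)))
e.2 <- -log(-log(runif(10000)))
u <- e.2 - e.1

# Negative log likelihood; par[2] is log(scale) so the scale stays positive
nll <- function(par, x) {
  -sum(dlogis(x, location = par[1], scale = exp(par[2]), log = TRUE))
}

fit <- optim(par = c(0, 0), fn = nll, x = u)   # Nelder-Mead by default

c(location = fit$par[1], scale = exp(fit$par[2]))
```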

### Example with a Cauchy distribution

As another example, we can write a simple maximum likelihood estimator for the location of a Cauchy distribution using the `nlm()` optimizer. We first draw a vector x from a Cauchy distribution, then define the negative log likelihood, and finally minimize it with `nlm()`. Note that `nlm()` is a minimizer and not a maximizer, which is why the function returns the negative log likelihood.

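```
> n <- 100
> x <- rcauchy(n)
> # negative log likelihood, because nlm() minimizes
> mlog.1 <- function(mu, x) {
+   - sum(dcauchy(x, location = mu, log = TRUE))
+   }
> mu.start <- median(x)  # the sample median is a reasonable starting value
> out <- nlm(mlog.1, mu.start, x = x)
```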
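
The object returned by `nlm()` is a list. The sketch below reruns the same kind of fit on a fresh simulated sample (the seed and sample size are arbitrary) with `hessian = TRUE`, and reads off the components that are usually of interest. The standard error is a rough approximation based on the numerical Hessian of the negative log likelihood.

```r
set.seed(7)
x <- rcauchy(500)

# Negative log likelihood of the Cauchy location (scale fixed at 1)
mlog.1 <- function(mu, x) -sum(dcauchy(x, location = mu, log = TRUE))

out <- nlm(mlog.1, median(x), x = x, hessian = TRUE)

out$estimate   # maximum likelihood estimate of the location
out$minimum    # negative log likelihood at the estimate
out$code       # termination code; see ?nlm for the meaning of each value

# Approximate standard error from the inverse of the Hessian
se <- sqrt(1 / out$hessian[1, 1])
se
```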

### Example with a beta distribution

Here is another example, using the Beta distribution and the `optim()` function. Since `optim()` minimizes by default, the function below returns the negative log likelihood.

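```
> y <- rbeta(1000, 2, 2)
> # negative log likelihood, because optim() minimizes by default
> mloglik <- function(mu, x) {
+    sum(-dbeta(x, mu[1], mu[2], log = TRUE))
+    }
>
> # both shape parameters must be strictly positive, hence the small lower bound
> out <- optim(par = c(1, 1), fn = mloglik, x = y, method = "L-BFGS-B", lower = c(1e-6, 1e-6))
```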
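
With `hessian = TRUE`, `optim()` also returns a numerical Hessian of the objective at the solution. Since the objective is the negative log likelihood, inverting this matrix gives an approximate covariance matrix of the estimates. The following is a minimal sketch on simulated data. The names `nll` and `se` and the small positive lower bound are choices made for this example.

```r
set.seed(99)
y <- rbeta(1000, 2, 2)

# Negative log likelihood of the two shape parameters
nll <- function(par, x) -sum(dbeta(x, par[1], par[2], log = TRUE))

out <- optim(par = c(1, 1), fn = nll, x = y, method = "L-BFGS-B",
             lower = c(1e-6, 1e-6), hessian = TRUE)

out$convergence   # 0 means that optim() reports successful convergence
out$par           # estimates of the two shape parameters

# Approximate standard errors from the inverse Hessian
se <- sqrt(diag(solve(out$hessian)))
se
```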

## Tests

### Likelihood Ratio Test

• lrtest() in the lmtest package[1].
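
The statistic behind a likelihood ratio test can also be computed by hand. The sketch below does not use `lrtest()`. It is a base R illustration on an assumed example: testing whether the location of a Cauchy sample equals 0. The statistic is twice the difference between the unrestricted and the restricted log likelihood, compared with a chi-squared distribution with one degree of freedom (one restriction).

```r
set.seed(11)
x <- rcauchy(200, location = 1)   # simulated data; the null hypothesis is false

# Negative log likelihood of the Cauchy location (scale fixed at 1)
nll <- function(mu, x) -sum(dcauchy(x, location = mu, log = TRUE))

# Unrestricted model: location estimated by maximum likelihood
fit <- nlm(nll, median(x), x = x)
ll.unrestricted <- -fit$minimum

# Restricted model: location fixed at 0, nothing to estimate
ll.restricted <- -nll(0, x)

lr <- 2 * (ll.unrestricted - ll.restricted)
p.value <- pchisq(lr, df = 1, lower.tail = FALSE)

c(statistic = lr, p.value = p.value)
```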

## Some specific cases

• `gum.fit()` (ismev package) provides MLE for a Gumbel distribution
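
If the ismev package is not available, a Gumbel fit can be sketched with `optim()` alone. The code below is a hand-written illustration, not the implementation used by `gum.fit()`. The simulated data, the log-scale parametrization and the name `nll` are assumptions made for this example.

```r
set.seed(5)
# Gumbel(location = 10, scale = 2) draws via the inverse CDF method
x <- 10 - 2 * log(-log(runif(2000)))

# Negative log likelihood: sum of log(sigma) + z + exp(-z),
# with z = (x - mu) / sigma and sigma = exp(par[2])
nll <- function(par, x) {
  sigma <- exp(par[2])
  z <- (x - par[1]) / sigma
  sum(log(sigma) + z + exp(-z))
}

fit <- optim(par = c(mean(x), log(sd(x))), fn = nll, x = x)

c(location = fit$par[1], scale = exp(fit$par[2]))
```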

## References

1. Achim Zeileis, Torsten Hothorn (2002). Diagnostic Checking in Regression Relationships. R News 2(3), 7-10. URL http://CRAN.R-project.org/doc/Rnews/