# Simulated Maximum Likelihood with R

This article is originally published at https://www.brodrigues.co/

This document details section *12.4.5. Unobserved Heterogeneity
Example* from Cameron and Trivedi's book, *Microeconometrics: Methods and
Applications*. The original source code that produces the results in table 12.2 is
available from the authors' site and is
written for Stata. This is an attempt to translate the code to R. I'd like to
thank Reddit user anonemouse2010 for his
advice, which helped me write the function.

Consult the original source code if you want to read the authors' comments. If you want the R source code without all the commentary, grab it here. This is not guaranteed to work, nor to be correct. It could set your pet on fire and/or eat your firstborn. Use at your own risk. I may, or may not, expand this example. Corrections and constructive criticism are welcome.

The model is \( y=\theta+u+\varepsilon \), where \( \theta \) is a scalar parameter equal to 1, \( u \) follows an extreme value type 1 (Gumbel) distribution, and \( \varepsilon \sim \mathcal{N}(0,1) \). For more details, consult the book.
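To make the model concrete, here is a minimal sketch of how data of this form could be simulated in R. It is not the authors' Stata code: the function name `simulate_y`, the sample size and the seed are my own illustrative choices. The Gumbel draws use the inverse of the distribution function \( \exp\{-\exp(-u)\} \).

```r
# Sketch of the data-generating process y = theta + u + e.
# simulate_y, n and the seed are illustrative assumptions, not the authors' settings.
simulate_y <- function(n, theta = 1) {
  u <- -log(-log(runif(n)))  # Gumbel (extreme value type 1) draw via inverse CDF
  e <- rnorm(n)              # standard normal draw
  data.frame(u = u, e = e, y = theta + u + e)
}

set.seed(123)
sim <- simulate_y(n = 1000)

# The Gumbel mean is the Euler-Mascheroni constant (about 0.5772),
# so the sample mean of y should be roughly theta + 0.5772.
mean(sim$y)
```

Because \( u \) does not have mean zero, the sample mean of \( y \) is not a good estimate of \( \theta \), which is one reason to work with the likelihood instead.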

### Import the data

You can consult the original source code to see how the authors simulated the data. To get the same results, and to verify that I didn't make mistakes, I prefer importing their data directly from their website.

```
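# Columns of the authors' file: u (Gumbel draw), e (normal draw), y (outcome)
data <- read.table("http://cameron.econ.ucdavis.edu/mmabook/mma12p2mslmsm.asc")
u <- data[, 1]
e <- data[, 2]
y <- data[, 3]
numobs <- length(u)
simreps <- 10000  # number of simulation draws per observation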
```

### Simulation

The code below programs the following simulated log-likelihood, which can be found on page 397, using the function `sapply`:

$$\log{\hat{L}_N(\theta)} = \dfrac{1}{N} \sum_{i=1}^N\log{\Big( \dfrac{1}{S}\sum_{s=1}^S \dfrac{1}{\sqrt{2\pi}} \exp \{ -(y_i-\theta-u_i^s)^2/2 \}\Big)}$$

The function returns the negative of this quantity because `optim` minimizes by default.

```
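denssim <- function(theta) {
  # For each observation, average the normal density over simreps Gumbel draws.
  # -log(-log(runif(simreps))) is a Gumbel draw u, so the squared term is y - theta - u.
  loglik <- mean(sapply(y, function(y) log(mean((1/sqrt(2 * pi)) * exp(-(y - theta + log(-log(runif(simreps))))^2/2)))))
  # optim minimizes, so return the negative log-likelihood
  return(-loglik)
}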
```
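For readers wondering where this expression comes from, here is a short sketch of the reasoning in my own notation (\( \phi \) for the standard normal density and \( g \) for the Gumbel density are symbols I introduce here, not the book's). Conditional on \( u \), \( y \) is normal with mean \( \theta+u \) and unit variance, so the density of \( y_i \) is an integral over \( u \), which the simulator replaces with an average over \( S \) draws:

$$f(y_i \mid \theta) = \int \phi(y_i-\theta-u)\, g(u)\, du \approx \dfrac{1}{S}\sum_{s=1}^S \phi(y_i-\theta-u_i^s), \qquad \phi(z) = \dfrac{1}{\sqrt{2\pi}} \exp \{ -z^2/2 \}$$

The draws are obtained by inverting the Gumbel distribution function \( \exp\{-\exp(-u)\} \): if \( U \) is uniform on \( (0,1) \), then \( u^s = -\log(-\log U) \). This is why `log(-log(runif(simreps)))` enters the code with a plus sign: it equals \( -u_i^s \).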

This likelihood is then maximized:

```
system.time(res <- optim(0.1, denssim, method = "BFGS", control = list(maxit = simreps)))
```

```
## user system elapsed
## 21.98 0.08 22.09
```

Convergence is achieved pretty rapidly; the estimate, stored in `res$par`, is

```
## [1] 1.101
```

which is close to the true parameter value of 1 (which was used to generate the data).
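One thing worth noting about `denssim`: it calls `runif` on every evaluation, so the objective that `optim` sees changes slightly from call to call. A common practice in simulated maximum likelihood is to draw the random numbers once and reuse them for every value of \( \theta \). The following is a minimal sketch of that variant, not the authors' code; `make_fixed_negloglik`, `n_draws`, the seed and the toy data are all my own illustrative choices.

```r
# Sketch: simulated negative log-likelihood with the draws held fixed.
# make_fixed_negloglik, n_draws and the seed are illustrative assumptions.
make_fixed_negloglik <- function(y, n_draws) {
  # One row of Gumbel draws per observation, drawn once and reused on every call.
  u_draws <- matrix(-log(-log(runif(length(y) * n_draws))), nrow = length(y))
  function(theta) {
    # Element [i, s] is dnorm(y_i - theta - u_i^s); average over s for each i.
    dens <- rowMeans(dnorm(y - theta - u_draws))
    -mean(log(dens))
  }
}

set.seed(42)
y_toy <- 1 - log(-log(runif(200))) + rnorm(200)  # toy data with theta = 1
negll <- make_fixed_negloglik(y_toy, n_draws = 500)

fit <- optim(0.1, negll, method = "BFGS")
fit$par
```

With fixed draws, two calls at the same \( \theta \) return exactly the same value, which makes the numerical gradient used by BFGS better behaved. The price is memory: the matrix of draws has one row per observation and one column per draw.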

Let's try again with another parameter value, for example \( \theta=2.5 \). We have to generate y again:

```
y2 <- 2.5 + u + e
```

and slightly modify the likelihood:

```
denssim2 <- function(theta) {
  # Same simulated log-likelihood as denssim, evaluated on y2 instead of y
  loglik <- mean(sapply(y2, function(y2) log(mean((1/sqrt(2 * pi)) * exp(-(y2 - theta + log(-log(runif(simreps))))^2/2)))))
  return(-loglik)
}
```

which can then be maximized:

```
system.time(res2 <- optim(0.1, denssim2, method = "BFGS", control = list(maxit = simreps)))
```

```
## user system elapsed
## 12.56 0.00 12.57
```

The value that maximizes the likelihood is:

```
## [1] 2.713
```

which is close to the true parameter value of 2.5 (which was used to generate the data).
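The functions `denssim` and `denssim2` differ only in the data they read. As a possible tidy-up, here is a minimal sketch of a single function that takes the data and the number of draws as arguments. The names `negloglik_sim` and `y_toy` are mine, the toy data are simulated rather than taken from the book, and, like the original functions, this version redraws the random numbers on each call.

```r
# Sketch: one simulated negative log-likelihood for any data vector.
# negloglik_sim and y_toy are illustrative names, not part of the original code.
negloglik_sim <- function(theta, y, simreps) {
  per_obs <- vapply(y, function(yi) {
    u_s <- -log(-log(runif(simreps)))  # fresh Gumbel draws for this observation
    log(mean(dnorm(yi - theta - u_s)))
  }, numeric(1))
  -mean(per_obs)
}

set.seed(1)
y_toy <- 2.5 - log(-log(runif(300))) + rnorm(300)  # toy data with theta = 2.5

# The objective should be smaller near the true value than far from it.
nll_true <- negloglik_sim(2.5, y = y_toy, simreps = 2000)
nll_far <- negloglik_sim(0.1, y = y_toy, simreps = 2000)
c(nll_true, nll_far)
```

Extra arguments can be passed to the objective through the `...` argument of `optim`, for example `optim(0.1, negloglik_sim, y = y2, simreps = simreps, method = "BFGS")`.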
