In the core of kriging, Generalized-Least Squares (GLS) and geostatistics lies the **multivariate normal (MVN) distribution** – a generalization of normal distribution to two or more dimensions, with the option of having non-independent variances (i.e. autocorrelation). In this post I will show:

- (i) how to use exponential decay and the multivariate normal distribution to simulate spatially autocorrelated random surfaces (using the
`mvtnorm`

package) - (ii) how to estimate (in
**JAGS**) the parameters of the decay and the distribution, given that we have a raster-like surface structure.

Both procedures are computationally challenging as the total data size increases roughly above 2000 pixels (in the case of random data generation) or 200 pixels (in the case of the JAGS estimation).

These are the packages that I need:

```
library(mvtnorm) # to draw multivariate normal outcomes
library(raster) # to plot stuff
library(rasterVis) # to plot fancy stuff
library(ggplot2) # more fancy plots
library(ggmcmc) # fancy MCMC diagnostics
library(R2jags) # JAGS-R interface
```

Here are some auxiliary functions:

```
# function that makes distance matrix for a side*side 2D array
dist.matrix <- function(side)
{
row.coords <- rep(1:side, times=side)
col.coords <- rep(1:side, each=side)
row.col <- data.frame(row.coords, col.coords)
D <- dist(row.col, method="euclidean", diag=TRUE, upper=TRUE)
D <- as.matrix(D)
return(D)
}
# function that simulates the autocorrelated 2D array with a given side,
# and with exponential decay given by lambda
# (the mean mu is constant over the array, it equals to global.mu)
cor.surface <- function(side, global.mu, lambda)
{
D <- dist.matrix(side)
# scaling the distance matrix by the exponential decay
SIGMA <- exp(-lambda*D)
mu <- rep(global.mu, times=side*side)
# sampling from the multivariate normal distribution
M <- matrix(nrow=side, ncol=side)
M[] <- rmvnorm(1, mu, SIGMA)
return(M)
}
# function that converts a matrix to raster and scales its sides to max.ext
my.rast <- function(mat, max.ext)
{
rast <- raster(mat)
rast@extent@xmax <- max.ext
rast@extent@ymax <- max.ext
return(rast)
}
```

## The Model

I defined the model like this: The vector of data is drawn from a multivariate normal distribution

where is a vector of pixel means and is the symmetrical covariance matrix. The (Euclidean) distances between pixels (sites, locations) are stored in a symmetrical distance matrix . The value of covariance between two locations is obtained by scaling the distance matrix by a negative exponential function:

where and . I chose the exponential decay as it well fits many empirical data, and it is simple, with just one parameter .

## Simulating random normal surfaces with autocorrelated errors

First, I explored how tweaking of affects the distance decay of covariance and the resulting spatial patterns. I examined , and :

```
Distance <- rep(seq(0,20, by=0.1), times=3)
Lambda <- rep(c(0.01, 0.1, 1), each=201)
Covariance <- exp(-Lambda*Distance)
xy <- data.frame(Distance, Covariance, Lambda=as.factor(Lambda))
ggplot(data=xy, aes(Distance, Covariance)) +
geom_line(aes(colour=Lambda))
```

Second, I simulated the surface for each of the values. I also add one pattern with a completely uncorrelated white noise (all covariances are 0). I set the mean in each grid cell to 0.

```
side=50 # side of the raster
global.mu=0 # mean of the process
M.01 <- cor.surface(side=side, lambda=0.01, global.mu=global.mu)
M.1 <- cor.surface(side=side, lambda=0.1, global.mu=global.mu)
M1 <- cor.surface(side=side, lambda=1, global.mu=global.mu)
M.white <-matrix(rnorm(side*side), nrow=side, ncol=side)
M.list <- list(my.rast(M.01, max.ext=side),
my.rast(M.1, max.ext=side),
my.rast(M1, max.ext=side),
my.rast(M.white, max.ext=side))
MM <- stack(M.list)
names(MM) <- c("Lambda_0.01", "Lambda_0.1", "Lambda_1", "White_noise")
levelplot(MM) # fancy plot from the rasterVis package
```

## Fitting the model and estimating in JAGS

This is the little raster that I am going to use as the data:

```
# parameters (the truth) that I will want to recover by JAGS
side = 10
global.mu = 0
lambda = 0.3 # let's try something new
# simulating the main raster that I will analyze as data
M <- cor.surface(side = side, lambda = lambda, global.mu = global.mu)
levelplot(my.rast(M, max.ext = side), margin = FALSE)
```

```
# preparing the data for JAGS
y <- as.vector(as.matrix(M))
my.data <- list(N = side * side, D = dist.matrix(side), y = y)
```

And here is the JAGS code. Note that in **OpenBUGS** you would use the `spatial.exp`

distribution from GeoBUGS module, and your life would be much easier. Not available in JAGS, so here I have to do it manually:

```
cat("
model
{
# priors
lambda ~ dgamma(1, 0.1)
global.mu ~ dnorm(0, 0.01)
global.tau ~ dgamma(0.001, 0.001)
for(i in 1:N)
{
# vector of mvnorm means mu
mu[i] ~ dnorm(global.mu, global.tau)
}
# derived quantities
for(i in 1:N)
{
for(j in 1:N)
{
# turning the distance matrix to covariance matrix
D.covar[i,j] <- exp(-lambda*D[i,j])
}
}
# turning covariances into precisions (that's how I understand it)
D.tau[1:N,1:N] <- inverse(D.covar[1:N,1:N])
# likelihood
y[1:N] ~ dmnorm(mu[], D.tau[,])
}
", file="mvnormal.txt")
```

And let's fit the model:

`fit <- jags(data=my.data, parameters.to.save=c("lambda", "global.mu"), model.file="mvnormal.txt", n.iter=10000, n.chains=3, n.burnin=5000, n.thin=5, DIC=FALSE)`

`## module glm loaded`

`## Compiling model graph ## Resolving undeclared variables ## Allocating nodes ## Graph Size: 10216 ## ## Initializing model`

This is how the posteriors of and `global.mu`

look like:

```
ggs_traceplot(ggs(as.mcmc(fit)))
```

```
ggs_density(ggs(as.mcmc(fit)))
```

The results are not very satisfactory. It looks like converges nicely around the true value of 0.3. The `global.mu`

, which should converge aroun 0, is a totally wobbly. The whole thing seems to have the right direction, but for some reason it cannot get there fully.

*I made this post during a flight from Brussels to Philadelphia on the 7th February 2014. There may be errors.*

About the values of global.mu - maybe run your chains a bit longer?

And discard the burn-in!

Hi Florian,

yes, that would be the way to go. The reason why I did not run it longer or in a better way was that I was impatient for Knitr and R to wrap it up.

Anyhow, the obvious and trivial reason for global.mu not to converge is that the posterior is just really wide. I think that this is a very interesting result -- it shows that the magnitude of spatial autocorrelation in the stochastic part (the errors; described by lambda here) of the model can reliably and instantly be retrieved from the data. But if this autocorrelation of the errors is strong enough, it can totally overdrive and inhibit our ability to retrieve the deterministic part (i.e. the expected value -- global.mu). I wonder if Dormann, Kuhn and all the classic autocorrelation guys had ever noted this...

Hi Petr,

given the correlation, the data has probably an effective size of a few data points. A wide posterior for mu is therefore expected. Jags shouldn't be affected by that though.

I'm quite sure the reason your model converges very slowly (an potentially biased) is the additional stochastic layer mu[i] ~ dnorm(global.mu, global.tau) that doesn't seem necessary, unless you want to test for that specifically. At least it is not present in the model you use to create the data. This layer slows down mixing, probably because of correlations, or because Jags has to use a different algorithm for this problem. If you remove this layer, mixing is fine and you can get the posterior with a few 1000 steps that looks all right.

I have posted a minimal example of your code modified in this way here https://gist.github.com/florianhartig/9078258

The original code would run fine with 100.000 steps or so as well, although there seemed to be a bit of bias on mu (not 100% sure though). It's quite tricky to set the prior correct for such a hierarchical problem, so a bit of bias at small sample sizes wouldn't come as a surprise.

Aaah! Yeah! Thanks a lot, that runs much better

All clear now.

I new to JAGS, and have been scratching my head as to how to add a linear regression component to the above model (such that y is a function of global.mu + beta1*continuous.variable). Any quick suggestions in how to do this, or suggestions of sources to take a look at incorporate spatial autocorrelation of residuals into linear models in jags? Many thanks

Hi Luke,

I guess that it should be pretty easy. Instead of using one global.mu, make mu[i] a function of the environment, exactly as you suggest. So that you will have:

model

{

# priors

lambda ~ dgamma(1, 0.1)

beta0 ~ dnorm(0.001)

beta1 ~ dnorm(0.001)

for(i in 1:N)

{

# vector of mvnorm means mu

# as a function of environment

mu[i] <- beta0 + beta1*environment[i] } # derived quantities for(i in 1:N) { for(j in 1:N) { # turning the distance matrix to covariance matrix D.covar[i,j] <- exp(-lambda*D[i,j]) } } D.tau[1:N,1:N] <- inverse(D.covar[1:N,1:N]) # likelihood y[1:N] ~ dmnorm(mu[], D.tau[,]) } I would be curious to hear from you if it worked. Petr

Hi Petr,

This is great, very useful to see an alternative to BUGS for geostatistical models. I have a couple of comments/questions. Firstly, you define your covariance matrix in your simulated data and your model based on the negative exponential exp(-lambda*d) which is what is used in BUGS, but I have a feeling this will give you correlation not covariance. The exponential covariance function is exp(-d/lamda) (see https://en.wikipedia.org/wiki/Covariance_function and the 'Exponential' function in the fields R package).

I ran a slightly modified version of your code with the new covariance function and this seemed to work nicely. I've also elaborated on it in response to Luke's comment to include an example of a logistic regression with a spatially structured random effect with a mean of 0. See here for the code

https://gist.github.com/HughSt/83e12dd43ae7157ba874

I'd be interested in people's thoughts. Also - I'm trying to figure out how to make predictions at new locations. I can do this in R post hoc, by using the samples of alpha and beta to generate underlying means and kriging (or conditionally simulating) the random effect using samples of lambda, but am wondering whether it is possible to do this in JAGS. Any ideas?

Thanks again for the post! Cheers, Hugh