I recently made an attempt to run MCMC sampling in OpenBUGS using a large dataset and a spatially explicit occupancy model. Here I report some potentially interesting speed and memory issues that I noticed.
Model and Data
I won't go into the technical details of my model, as it is not the main focus of this post. In brief, I am modelling the geographic distribution of a certain species as a function of environmental conditions (plus a Conditional Autoregressive component). The data points (grid cells) are spread continuously over the whole United States like this:
So there are around 20,000 grid cells within the US, and I am modelling the probability of occurrence of the species within each of them.
Hardware & software setting
The machine I am using is a Linux cluster with 3-GHz individual cores and 192GB of RAM. I am using OpenBUGS to do the MCMC sampling, and I call it from within R using the BRugs package. To speed things up a bit, I parallelized three MCMC chains using the snow and snowfall packages, with an approach similar to the one I described in an older post.
Speed and memory issues
I will focus only on single-chain MCMC sampling here. Running a single-chain burn-in (no parameters monitored) in this setting is smooth and, even with this large dataset, relatively fast (~1 hour to run 150,000 burn-in iterations). An interesting issue emerged when I decided to monitor the posterior distributions of the predicted values in each of the 20,000 grid cells (in order to obtain a map of prediction intervals), a task that is memory-hungry by definition.
I monitored the elapsed time and memory usage of each OpenBUGS step. Here are the results for two very short MCMC chains (a: 100 iterations, b: 1,000 iterations) during which I monitored the predicted values in all of the 20,000 grid cells:
First, you can see that the machine spends substantial time on steps like setting the monitored parameters or writing the summary files. It also looks like it takes some time to start and stop the MCMC sampling during the monitoring phase - so much, in fact, that there is no substantial difference between a) and b). Second and more importantly, it looks like OpenBUGS allocates 64MB of static memory to the whole procedure, and this allocation does not change as long as its contents stay below 64MB.
This is what happens when I turn the volume of iterations up to 10,000 (c) and even to 650,000 iterations (d; note that I used 10 thinning steps):
Obviously, everything takes longer. But look at the memory use - it looks like once the amount of data to store exceeds 64MB, OpenBUGS starts to allocate memory dynamically, and each additional set of iterations is slower than the previous one. Moreover, in the case of d) the whole thing crashes after the memory usage reaches 1.7GB. My machine has 192GB of RAM, so a shortage of physical memory can't be the reason. I suspect that there is something clumsy in the way OpenBUGS allocates large chunks of memory. A heap overflow?
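To put these numbers in perspective, here is a rough back-of-the-envelope sketch (in Python, purely for the arithmetic) of how much raw storage the monitored samples might need in each of the four runs. It rests on assumptions that I have not verified against OpenBUGS itself: that each stored value costs 8 bytes, that there is no per-node overhead, and that "10 thinning steps" means every 10th iteration is kept. The names `monitor_bytes`, `stored_samples` and `BYTES_PER_VALUE` are made up for this illustration. The real per-value cost may well differ, so treat the output as an order-of-magnitude guide only.

```python
# Back-of-the-envelope estimate of the raw storage needed for monitored samples.
# ASSUMPTIONS (not taken from OpenBUGS documentation): every stored value
# costs 8 bytes and there is no per-node bookkeeping overhead.

BYTES_PER_VALUE = 8   # assumed: one double-precision number per stored sample
N_NODES = 20_000      # monitored grid cells, as described in the post


def stored_samples(iterations, thin=1):
    """Number of samples kept per node after thinning."""
    return iterations // thin


def monitor_bytes(iterations, thin=1, n_nodes=N_NODES,
                  bytes_per_value=BYTES_PER_VALUE):
    """Raw storage for all monitored nodes, in bytes."""
    return stored_samples(iterations, thin) * n_nodes * bytes_per_value


# (iterations, thinning interval) for the four runs; thin=10 for d) is my
# reading of "10 thinning steps" and is an assumption.
runs = {
    "a": (100, 1),
    "b": (1_000, 1),
    "c": (10_000, 1),
    "d": (650_000, 10),
}

for label, (iterations, thin) in runs.items():
    size_mb = monitor_bytes(iterations, thin) / 1e6
    print(f"{label}) {iterations:>7} iterations, thin={thin:>2}: ~{size_mb:,.0f} MB")
```

Under these assumptions, run d) alone would ask for storage on the order of 10GB, far beyond the 1.7GB at which the crash occurred but still well within the 192GB of RAM. That would be compatible with some per-process limit being hit rather than the machine running out of memory, although this is only a guess.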
A colleague of mine remarked that there is a rumor in the BUGS community that some MCMC samplers have a speed threshold: above a certain data size, the whole thing suddenly slows down. Is this what happened to me? Do the OpenBUGS developers know about it? Or could the problem be somewhere else? Is there a way to avoid it? Any ideas or comments are welcome!