Seminar Details


Nov 17

3:30 pm

Parallel Programming in R - An Easy Way of Accelerating Statistical Simulations

Hana Sevcikova


University of Washington - Department of Statistics

The time-consuming methods popular in modern statistics require efficient use of available computing resources. The potential benefits of parallel computing are well known, but statisticians rarely realize them in practice. I will demonstrate a straightforward, general parallel computing framework for statistical simulations. I will begin with an overview of hardware and software requirements. Starting with a simple parallel program structure based on the standard master-slave model, I will develop a solution that overcomes several obstacles in parallel programming. These include non-reproducibility of results due to variations in the distribution of random numbers among processes, the creation of an excessive number of slaves, the proliferation of slaves with very short lifetimes, and the loss of slaves to hardware failures. The final program is independent of the particular application and can therefore be used for a wide variety of other studies. It is also portable between different hardware architectures. I will present an application of the framework to random fields analysis, including bootstrapping. The demonstrations use the programming language R with the add-on package RPVM, an interface to the message-passing library PVM. Nevertheless, the proposed framework is independent of the programming language.
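
As a rough illustration of the reproducibility obstacle named in the abstract, the sketch below shows one common way to keep simulation results independent of how tasks are spread across workers: each task seeds its own random number generator from a base seed and its task index. This is not the speaker's R/RPVM implementation. It is a Python stand-in using only the standard library, and the names `simulate`, `run`, and `base_seed` are hypothetical. Seeding by task index does not by itself guarantee statistically independent streams, so a real study would likely use a generator designed for parallel streams.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate(task_id, base_seed=2024, n=1000):
    """One simulation replicate: the mean of n standard normal draws."""
    # Assumption for illustration: the generator is seeded per task, not per
    # worker, so the random stream does not depend on which worker runs the
    # task or in what order tasks are scheduled.
    rng = random.Random(f"{base_seed}:{task_id}")
    return sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n


def run(n_tasks, n_workers):
    """Hand n_tasks replicates to a fixed pool of n_workers workers."""
    # The pool stands in for the master: it creates a bounded number of
    # workers once and reuses them, rather than starting one per task.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns results in task order, whatever the completion order.
        return list(pool.map(simulate, range(n_tasks)))


if __name__ == "__main__":
    # The same results are expected for any number of workers.
    print(run(8, 2) == run(8, 4))
```

Threads are used here only to keep the sketch portable. For CPU-bound simulations in Python, a process pool such as `concurrent.futures.ProcessPoolExecutor` could be swapped in with the same `map` call.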