Modeling and Simulation Software Enables Integrated Parallel Processing for Monte Carlo Analysis
Dramatic reduction in computation time can be achieved with clusters
Conrad J. Housand, Christine K. Kovach & Peter Schaefer, Ph.D.
Modeling and simulation (M&S) is a well-established method for predicting the behavior of dynamical systems. Today’s simulation software comes packaged as easy-to-use M&S environments. These tools need to support both model developers and model users: they must allow the creation of complex models from scratch, but also support pre-built models for simulation experiments and analysis. In this article, we focus on continuous simulation based on ordinary differential equations as mathematical models.
Figure 1: Master/slave structure for parallel Monte-Carlo analysis
For most physical systems, the governing equations for the dynamic behavior are well understood. Well-known examples include Newton’s laws for mechanical systems and Kirchhoff’s rules, together with the basic equations for resistors, capacitors, and inductors, for electrical and electronic systems. Less well known are the dynamical equations for chemical and biological reactions describing the distribution, composition, degradation, and metabolism of substances. For such applications in particular, buyers should look for M&S environments that incorporate all required features into one suite of closely coupled programs. Other things to consider are:
• How is the new package supported?
• Will it provide enough modeling flexibility and execution speed?
• How are models verified/validated?
A thorough comparison of M&S software based on a feature list is crucial. A useful list of features should include:
• modeling and analysis support
• availability of problem-specific libraries
• accuracy of algorithms
• report generation capabilities
• hardware requirements
To protect their investment for the future, buyers should demand the highest degree of flexibility and openness in the software package itself.
Figure 2: Master process
Before moving on, we want to discuss briefly the evolution from early general-purpose software to today’s specialized tools. When M&S tools appeared some 25 years ago, they were typically general-purpose software packages. Often little more than software libraries with some modeling support, these packages were applicable across the board, providing little specific support for any application area. Today, specialized tools exist for most applications. However, the perceived advantage of specialized tools might come at the price of unexpected restrictions that general-purpose systems seldom impose on the model developer. In fact, most general-purpose systems today come with powerful, application-specific libraries and methods that negate any perceived advantage of specialized M&S tools.
Monte-Carlo analysis
Pharmacokinetics is an application area in drug development and risk assessment that relies on statistical methods to analyze models and to determine model parameters. One of the invaluable tools used in pharmacokinetics is Monte-Carlo analysis (MCA), a method that uses statistical sampling to obtain approximate solutions to mathematical equations. MCA addresses the following problem: given probability distributions for the inputs to a model, what are the corresponding distributions of the outputs of interest? Implementing MCA to solve this problem sounds simple: over many iterations, sample from the input distributions, run the model using the sampled input values, and collect statistics on the corresponding outputs. Unfortunately, the power and simplicity of MCA come at the price of computational effort, as many MCA experiments require thousands of simulation runs to obtain the desired level of confidence in the observed statistics. The good news is that MCA is a perfect example of an inherently data-parallel algorithm: each MCA iteration is identical to every other iteration except for its input data. For most MCA studies, these iterations can all be performed independently, which means that parallel processing on networks of cooperating computers, or “clusters,” can dramatically reduce the computation time required.
Parallel MCA Toolkit
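Stripped of parallelism, the sampling loop just described can be sketched in a few lines of Python. Here, run_model is a hypothetical stand-in for an actual simulation run, and the input distribution parameters are purely illustrative:

```python
import random
import statistics

def run_model(blood_air_pc):
    # Placeholder for one simulation run; a real run would integrate the
    # model's ODEs and return the output of interest. The relationship
    # used here is hypothetical.
    return 1.0 / blood_air_pc

def monte_carlo(iterations, mean, stdev, seed=0):
    rng = random.Random(seed)
    outputs = []
    for _ in range(iterations):
        sample = rng.gauss(mean, stdev)    # sample from the input distribution
        outputs.append(run_model(sample))  # run the model with the sampled value
    # collect statistics on the corresponding outputs
    return statistics.mean(outputs), statistics.stdev(outputs)

mean_out, stdev_out = monte_carlo(500, mean=10.0, stdev=1.0)
```

Because each pass through the loop depends only on its own sampled value, the iterations can be handed to separate processors without changing the collected statistics.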
In a truly integrated M&S tool, a user can expect not only an integrated MCA component, but also efficient execution. Using the off-the-shelf M&S tool acslXtreme, we developed a set of easy-to-use function blocks based on the Message Passing Interface parallel computing library [1], which are used to perform parallel Monte Carlo Analysis. In particular, MPI services are encapsulated as reusable simulation blocks, which can be easily incorporated into graphical models developed in acslXtreme.
Figure 3: Addition to simulation model for slave process
Although parallel programming libraries provide the framework for parallel computing, successful use of such libraries involves deep understanding of the details of their programming interface and of concurrent programming in general. We chose MPI for the Parallel MCA Toolkit implementation because of its maturity and rich programming interface (API) that allows for easy integration into other applications. MPI libraries provide a message-based protocol for communication between processes executing on heterogeneous networks of computers. Versions of MPI have been used successfully for “cluster supercomputers” of as many as thousands of networked PCs to solve highly parallelizable problems.
Figure 4: Observed output distribution from variation of the blood:air partition coefficient
The Parallel MCA Toolkit has been implemented for acslXtreme, an integrated M&S environment that supports both graphical and textual modeling. Block libraries encapsulating low-level MPI routines and higher-level MCA-specific tasks were implemented using the following features:
• support of reusable graphical simulation blocks encapsulating CSL, C/C++ and Fortran code
• models are compiled into an executable Dynamic Link Library, which supports a documented API allowing the MPI libraries to call the compiled simulation
• models may call external subroutines, in our case the MPI functions.
The toolkit assumes a master/slave relationship between the master process, used to sample parameter values from the specified distributions and to control iterations, and the slaves, which are identical parallel worker processes performing simulation runs using the parameter values generated by the master.
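This master/slave pattern can be illustrated outside MPI with Python's multiprocessing module: the master samples parameter values and distributes them over a task queue, while identical workers run the model and return results. The model function and distribution parameters below are placeholders, not the toolkit's actual blocks:

```python
import multiprocessing as mp
import random

def run_model(pc):
    # Stand-in for one simulation run with the sampled partition coefficient.
    return 2.0 * pc

def slave(tasks, results):
    # Identical worker process: receive a parameter value, run the model,
    # and send the output back to the master.
    for pc in iter(tasks.get, None):  # None is the shutdown signal
        results.put((pc, run_model(pc)))

def master(iterations=100, n_slaves=4, seed=1):
    tasks, results = mp.Queue(), mp.Queue()
    workers = [mp.Process(target=slave, args=(tasks, results))
               for _ in range(n_slaves)]
    for w in workers:
        w.start()
    rng = random.Random(seed)
    for _ in range(iterations):
        tasks.put(rng.gauss(10.0, 1.0))  # sample and send an input value
    for _ in workers:
        tasks.put(None)                  # tell each slave to stop
    collected = [results.get() for _ in range(iterations)]
    for w in workers:
        w.join()
    return collected

if __name__ == "__main__":
    data = master()
```

The toolkit uses MPI messages rather than shared queues, but the division of labor is the same: sampling and iteration control live in the master, and the simulation runs live in interchangeable slaves.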
Application to a life science model
Compartmental analysis is the most common method to develop physiologically-based pharmacokinetics (PBPK) models. Resulting models consist of physiological compartments connected via the vascular system. Time-dependent concentrations of substances in each compartment are described by a system of differential equations using physiological parameters and a partition coefficient. The partition coefficient is a constant encapsulating a variety of unknown details related to physiological properties of the compartments. Typically, model parameters are calibrated by statistically fitting the model to observed data.
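As a minimal illustration of such a system, the following sketch integrates a single, hypothetical perfusion-limited compartment with forward Euler. The equation and parameter names (blood flow Q, volume V, partition coefficient P) are generic textbook choices, not the chloroform model itself:

```python
def simulate_compartment(c_art, Q, V, P, t_end, dt=0.01):
    """Forward-Euler integration of one perfusion-limited compartment:
       V * dC/dt = Q * (c_art - C/P),
    where c_art is the arterial concentration entering the compartment
    and P is the tissue:blood partition coefficient."""
    c = 0.0
    t = 0.0
    while t < t_end:
        c += dt * (Q / V) * (c_art - c / P)
        t += dt
    return c
```

At steady state the compartment concentration approaches c_art * P, which is why the partition coefficient so strongly shapes the model's predicted concentrations.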
Figure 5: Speedup graphs for parallel Monte-Carlo analysis
The existing acslXtreme chloroform exposure model includes liver, skin, fat, rapidly- and slowly-perfused tissue, and lung compartments connected via the vascular system. The model is used for risk assessment to determine the distribution and metabolism of chloroform after exposure through drinking water, inhalation and dermal contact. Partition coefficients and physiological parameters were taken from Yang, et al [2]. The Parallel MCA Toolkit was used to augment the acslXtreme model. For this study, the Blood:Air partition coefficient was varied by the master process and the resulting chloroform concentration in exhaled air was calculated by the slave processes.
To create the master process, we first constructed a component which defines the variation of the model inputs. The “Blood:Air PC” block generates random values for the parameter as defined by the user. The “MC Master” block sends generated values for the partition coefficient to the slave process, controls the execution loop for the MCA, and records parameter values and returned data to a results file.
To create the slave processes, the “MC Slave” block that communicates with the master process was added to the model. The received value for the Blood:Air partition coefficient sets the associated value in the lung compartment. Similarly, the observed output (chloroform concentration in exhaled air) is routed into the slave block to be sent back to the master process. While the input to the slave process is a single value, the output is a time history, in this case an array of about 100 values.
Although the purpose of this example is to demonstrate the quantitative effect of parallel MCA execution, the “speed-up” or performance gain from using multiple processors, we first need to verify that parallelization of the model did not affect the validity of the simulation. We followed Yang, et al [2] and matched the normal distribution of the Blood:Air partition coefficient to the calculated mean and standard deviation documented in that paper. To achieve a sufficient match between the desired distribution and the input generated by our master process, we needed a minimum of 500 simulation runs. The resulting concentration of chloroform in exhaled air is shown in Figure 4: it depicts the time history of the average concentration for a set of 500 MC iterations, along with curves representing three standard deviations above and below the mean.
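The statistics behind such a plot, a pointwise mean plus curves three standard deviations above and below it, can be computed from the collected time histories along these lines (a sketch over plain Python lists; a production version would use arrays):

```python
import math

def envelope(histories):
    """Pointwise mean and +/-3-standard-deviation curves over a set of
    equal-length output time histories (one list of samples per MC run)."""
    n = len(histories)
    length = len(histories[0])
    mean = [sum(h[i] for h in histories) / n for i in range(length)]
    sd = [math.sqrt(sum((h[i] - mean[i]) ** 2 for h in histories) / (n - 1))
          for i in range(length)]
    upper = [m + 3 * s for m, s in zip(mean, sd)]
    lower = [m - 3 * s for m, s in zip(mean, sd)]
    return mean, lower, upper
```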
The number of processors and the simulation run length were treated as independent variables, while the computational speedup (the ratio of single-processor to multiple-processor computation time) was captured as the dependent variable. To build our cluster, we used a Pentium 4 (2.8 GHz, 1 GB RAM, Windows XP) computer to run the master process and four Pentium II and Celeron machines (300 MHz to 375 MHz, 64 MB to 128 MB RAM, Windows 2000) as slaves, all connected by an Ethernet network.
To calculate speedup data, we performed a series of runs, systematically varying the number of slave processors and the length of each simulation run by changing the simulated time. To calculate the speedup factor accurately, we applied the concept of an “effective processor count” to account for the different computational power of the slave machines.
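The exact definition of the effective processor count is not spelled out above; one plausible reading, weighting each slave by its clock speed relative to a reference machine, can be sketched together with the speedup ratio as follows (the weighting scheme is our assumption, not necessarily the formula used in the study):

```python
def speedup(t_single, t_parallel):
    # Speedup factor: single-processor time over multi-processor time.
    return t_single / t_parallel

def effective_processors(slave_clock_speeds_mhz, reference_mhz):
    # Assumed weighting: each slave counts as a fraction of a processor
    # proportional to its clock speed relative to the reference machine.
    return sum(mhz / reference_mhz for mhz in slave_clock_speeds_mhz)
```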
Results and conclusion
The graph in Figure 5 presents speedup curves for the experimental runs, showing eight speedup factor curves representing simulation run lengths ranging from 1 to 200 hours of simulated time. The x-axis is the “effective” number of processors. Longer simulation runs of more than 100 simulated hours show the expected speedup characteristic: speedup is proportional to approximately 0.7 times the number of processors. For short runs (10 simulated hours), communication overhead dominates the computation time. For runs of up to 50 simulated hours, speedup appears to reach the point of diminishing returns even with a relatively small number of processors.
Results confirm the expected potential for substantially faster execution of MCA on computer clusters for longer simulation runs. Success of this technique with MCA suggests applicability to other analytical methods such as minimization/maximization, sensitivity analysis and parameter estimation.
While implementation of the initial version of the toolkit was straightforward, further development will be required to finalize it. For example, the current version requires some additional programming to coordinate the communication between the master and slave processes and to configure the standard MPI processes.
References
[1] Website of the MPI Forum: www.mpi-forum.org
[2] Y. C. Yang, M. Ouyang, X. Xu, S. W. Wang, P. G. Georgopoulos, “Incorporation of Inter- and Intra-Individual Variability in a Chloroform PBPK Model with Inhalation and Dermal Absorption,” poster presented at the Society for Risk Analysis 2003 Annual Meeting, 2003.
Conrad J. Housand is Senior Software Engineer, Christine K. Kovach is Senior Applications Engineer, and Peter Schaefer, Ph.D., is Director of Product Engineering at AEgis Technologies Group. They may be contacted at [email protected].