Parallel SCIMMS
(Simulated Chromatography of Interactive MacroMolecular Systems)
We are parallelizing SCIMMS, a simulation of small-zone size-exclusion
chromatography. This program started as a sequential Fortran 77
program, and is now being parallelized using Fortran M.
This work is a joint effort of members of Argonne's Mathematics and
Computer Science Division (Steven Tuecke and Hania Aazami), and the
Biological and Medical Research Division (Fred Stevens).
The Biology
The simulation will be used to extract the affinity and kinetics of
interactions between proteins, including discrete interactions (such
as antibody with antigen) and polymerizations (such as that of the
antibody light chains, or Bence Jones proteins, that we have been
studying).
The significance of these parameters is twofold. First, the affinity
of antibody for antigen, and the kinetics of their interaction,
determine the biotechnological utility of the antibody. Likewise, the
affinity of light chain polymerization appears to determine the
pathological propensity of the light chains.
Second, these numbers are relevant to fundamental protein
structure-function study. The affinity of interaction is directly
related to the free energy released when the molecules come together.
In principle, this number can be computed by molecular dynamics
simulation (given sufficient computing resources). Therefore, the
quantitative data generated by rigorous simulation of the
chromatography phenomena should contribute to optimization of the
molecular dynamics algorithms that will eventually be used to predict
protein structure based on amino acid sequence.
The Code
The code is an iterative algorithm over a column, which is a
1-dimensional array of cells. Each cell holds the concentrations
of a small set of proteins.
Initially, all concentrations except those in the left few cells are
zero. During the course of the run, the concentrations will migrate
from the left-most cells, through the column, to the right-most cells.
Their arrival at the right-most cells will be monitored and output.
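The column layout described above can be pictured with a minimal
Python sketch. This is not the SCIMMS Fortran source; the sizes, the
starting concentration, and the name make_column are all assumptions
chosen only for illustration.

```python
# Hypothetical sizes; the source text gives no actual values.
N_CELLS = 20     # cells in the column
N_SPECIES = 2    # protein species tracked per cell
N_LOADED = 3     # left-most cells that start with sample


def make_column(n_cells=N_CELLS, n_species=N_SPECIES,
                n_loaded=N_LOADED, c0=1.0):
    """Build a column whose only non-zero cells are the left-most ones.

    The column is a list of cells; each cell is a list holding one
    concentration per protein species.
    """
    column = [[0.0] * n_species for _ in range(n_cells)]
    for i in range(n_loaded):
        column[i] = [c0] * n_species
    return column


column = make_column()
```

Here column[0] through column[2] hold the loaded sample and every
other cell starts at zero, matching the initial state described above.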
Each iteration at each cell consists of the following stages:
- iter_kin()
- migrate()
- monitor()
- absorbance()
- report()
- diffusion_1()
iter_kin() simulates the interactions of the proteins within the cell,
adjusting the concentrations of each. In this stage there are no
interactions between concentrations in different cells.
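As an illustration of a purely cell-local kinetics stage, the sketch
below advances a single reversible binding reaction A + B <-> AB with
explicit Euler steps. The reaction scheme, the rate constants k_on and
k_off, the time step, and the function signature are assumptions made
for this example; the actual iter_kin() routine may use a different
scheme.

```python
def iter_kin(a, b, ab, k_on=1.0, k_off=0.5, dt=0.01, n_sub=100):
    """Advance A + B <-> AB inside one cell (hypothetical scheme).

    a, b, ab are the concentrations of free A, free B, and complex AB.
    Only values from this one cell are read or written.
    """
    for _ in range(n_sub):
        # Net rate of complex formation at the current concentrations.
        rate = k_on * a * b - k_off * ab
        a -= rate * dt
        b -= rate * dt
        ab += rate * dt
    return a, b, ab


a, b, ab = iter_kin(1.0, 1.0, 0.0)
```

Because every update uses only the cell's own concentrations, this
stage needs no data from neighboring cells.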
migrate() splits the concentrations of one cell between that cell and
its nearest neighbors. In general, more of the concentration migrates
to the right than to the left.
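A minimal sketch of such a split, for one protein species, might look
like the following. The fractions f_left and f_right and the handling
of the two column ends are assumptions for illustration, not values
taken from SCIMMS.

```python
def migrate(cells, f_left=0.1, f_right=0.3):
    """One migration step for a single species (hypothetical fractions).

    Each cell keeps part of its concentration and passes the rest to
    its nearest neighbors, more to the right than to the left.
    Returns (new_cells, eluted).
    """
    n = len(cells)
    new = [0.0] * n
    eluted = 0.0
    f_stay = 1.0 - f_left - f_right
    for i, c in enumerate(cells):
        new[i] += f_stay * c
        if i > 0:
            new[i - 1] += f_left * c
        else:
            new[i] += f_left * c   # assumed: nothing leaves via the inlet
        if i < n - 1:
            new[i + 1] += f_right * c
        else:
            eluted += f_right * c  # assumed: outlet material is collected
    return new, eluted


cells, eluted = migrate([1.0, 0.0, 0.0, 0.0])
```

With these assumed boundary rules the step conserves mass: the sum of
the new cell values plus the eluted amount equals the sum before the
step.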
monitor() watches the right-most cells to see what concentrations
show up.
absorbance() uses the data collected by monitor() to produce some
summary values.
report() reports the values (i.e., prints them to a file) produced by
absorbance().
diffusion_1() further modifies the concentrations at each cell based
on the concentrations at that cell and its adjacent cells.
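One common way to express a nearest-neighbor update of this kind is an
explicit three-point stencil. The sketch below is an assumption about
the general shape of such a step, not the actual diffusion_1()
routine; the coefficient d and the mirrored boundaries are chosen only
for illustration.

```python
def diffusion_step(cells, d=0.1):
    """Explicit three-point diffusion update (hypothetical coefficient).

    Each new value depends on the old value at that cell and at its two
    adjacent cells. The column ends mirror their own value.
    """
    n = len(cells)
    new = [0.0] * n
    for i in range(n):
        left = cells[i - 1] if i > 0 else cells[i]
        right = cells[i + 1] if i < n - 1 else cells[i]
        new[i] = cells[i] + d * (left - 2.0 * cells[i] + right)
    return new


smoothed = diffusion_step([0.0, 1.0, 0.0])
```

All reads come from the old array and all writes go to a new one, so
the result does not depend on the order in which cells are visited.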
The Parallelization
Parallelizing this program entails spreading the cells across multiple
processors, and replacing concentration reads/writes from non-local
cells with communication. In addition, the monitor(), absorbance(),
and report() routines are extracted into a global monitoring process
with which the column processes communicate.
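The idea of replacing non-local reads and writes with communication
can be sketched in a single Python process. Here each block of cells
stands in for one processor, and the values crossing a block boundary
are handed over explicitly, standing in for messages. The function
names, the migration fractions, and the boundary rules are
assumptions; the real code uses Fortran M processes, not Python
lists.

```python
def split(cells, n_procs):
    """Contiguous block decomposition of the column (non-empty blocks)."""
    base, extra = divmod(len(cells), n_procs)
    blocks, start = [], 0
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)
        blocks.append(list(cells[start:start + size]))
        start += size
    return blocks


def local_step(block, f_left, f_right):
    """Migrate inside one block; return what must leave at each edge."""
    n = len(block)
    new = [0.0] * n
    out_left = out_right = 0.0
    f_stay = 1.0 - f_left - f_right
    for i, c in enumerate(block):
        new[i] += f_stay * c
        if i > 0:
            new[i - 1] += f_left * c
        else:
            out_left += f_left * c     # destined for the left neighbor
        if i < n - 1:
            new[i + 1] += f_right * c
        else:
            out_right += f_right * c   # destined for the right neighbor
    return new, out_left, out_right


def parallel_step(blocks, f_left=0.1, f_right=0.3):
    """Local work on every block, then a boundary exchange."""
    results = [local_step(b, f_left, f_right) for b in blocks]
    new_blocks = [r[0] for r in results]
    eluted = 0.0
    last = len(blocks) - 1
    for p, (_, out_left, out_right) in enumerate(results):
        if p > 0:
            new_blocks[p - 1][-1] += out_left   # "message" to the left
        else:
            new_blocks[0][0] += out_left        # assumed closed inlet
        if p < last:
            new_blocks[p + 1][0] += out_right   # "message" to the right
        else:
            eluted += out_right                 # goes to the monitor
    return new_blocks, eluted


column = [1.0, 1.0] + [0.0] * 6
seq_blocks, seq_eluted = parallel_step([column])          # one block
par_blocks, par_eluted = parallel_step(split(column, 3))  # three blocks
flat = [c for block in par_blocks for c in block]
```

Running the step on one block and on three blocks gives the same
column, which is the property a correct decomposition has to preserve.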
In the future we plan to look at load balancing issues in the code.
In particular, cells with zero concentrations require far fewer
computational resources per iteration than those with non-zero
concentrations. Therefore, as concentrations migrate across the
column, load imbalances may occur.
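A rough way to see the imbalance is to assign each cell a cost and sum
the costs per processor. The cost weights below are made-up
placeholders, not measurements, and block_loads and imbalance are
hypothetical helper names.

```python
def block_loads(cells, n_procs, cost_zero=1.0, cost_active=10.0):
    """Estimated per-processor cost under a contiguous block split.

    cost_zero and cost_active are assumed relative costs of a cell with
    zero and non-zero concentration, respectively.
    """
    base, extra = divmod(len(cells), n_procs)
    loads, start = [], 0
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)
        block = cells[start:start + size]
        loads.append(sum(cost_active if c > 0.0 else cost_zero
                         for c in block))
        start += size
    return loads


def imbalance(loads):
    """Ratio of the busiest processor's load to the mean load."""
    return max(loads) / (sum(loads) / len(loads))


column = [1.0] * 3 + [0.0] * 9   # sample still near the inlet
loads = block_loads(column, 3)
ratio = imbalance(loads)
```

In this example the first processor holds all of the non-zero cells, so
it carries most of the estimated work while the other two are nearly
idle.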