Parallel SCIMMS

(Simulated Chromatography of Interactive MacroMolecular Systems)

We are parallelizing SCIMMS, a simulation of small-zone size-exclusion chromatography. The program started as a sequential Fortran 77 code and is now being parallelized using Fortran M. This work is a joint effort of members of Argonne's Mathematics and Computer Science Division (Steven Tuecke and Hania Aazami) and the Biological and Medical Research Division (Fred Stevens).

The Biology

The simulation will be used to extract the affinity and kinetics of interactions between proteins -- including discrete interactions (such as antibody with antigen) and polymerizations (such as those of the antibody light chains, or Bence Jones proteins, that we have been studying). The significance of these parameters is two-fold. First, the affinity of an antibody for its antigen, and the kinetics of their interaction, determine the biotechnological utility of the antibody. Likewise, the affinity of light-chain polymerization appears to determine the pathological propensity of the light chains. Second, these numbers are relevant to the fundamental study of protein structure and function. The affinity of an interaction is directly related to the free energy released when the molecules come together. In principle, this number can be computed by molecular dynamics simulation (given sufficient computing resources). Therefore, the quantitative data generated by rigorous simulation of the chromatography phenomena should contribute to optimization of the molecular dynamics algorithms that will eventually be used to predict protein structure from amino acid sequence.

The Code

This is an iterative algorithm over a column, which is a 1-dimensional array of cells. Each cell holds the concentrations of a small set of proteins. Initially, all concentrations except those in the left few cells are zero. During the course of the run, the concentrations migrate from the left-most cells, through the column, to the right-most cells. Their arrival at the right-most cells is monitored and output.

Each iteration at each cell consists of the following stages:

  iter_kin() simulates the interactions of the proteins within the cell, adjusting the concentration of each. In this stage there are no interactions between concentrations in different cells.

  migrate() splits the concentrations of one cell between that cell and its nearest neighbors. In general, more of the concentration migrates to the right than to the left.

  monitor() watches the right-most cells to see what concentrations show up.

  absorbance() uses the data collected by monitor() to produce some summary values.

  report() reports the values produced by absorbance() (i.e., prints them to a file).

  diffusion_1() further modifies the concentrations at each cell based on the concentrations at that cell and its adjacent cells.
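The stage order described above can be illustrated with a minimal Python sketch. This is not the SCIMMS code: it tracks a single species per cell, and the split fractions (left, right), the diffusion coefficient d, the reflecting left boundary, and the identity iter_kin placeholder are all assumptions chosen for illustration. Only the order of the stages and the rightward bias of migrate() come from the description.

```python
def migrate(cells, left=0.1, right=0.6):
    """Split each cell's concentration between itself and its neighbors.

    The fractions are hypothetical; the only property taken from the text
    is that more material moves right than left.
    """
    n = len(cells)
    stay = 1.0 - left - right
    new = [0.0] * n
    eluted = 0.0
    for i, c in enumerate(cells):
        new[i] += stay * c
        if i > 0:
            new[i - 1] += left * c
        else:
            new[i] += left * c       # assumption: reflecting left boundary
        if i < n - 1:
            new[i + 1] += right * c
        else:
            eluted += right * c      # material leaving past the last cell
    return new, eluted


def diffusion_1(cells, d=0.05):
    """Exchange material between adjacent cells (conservative form)."""
    new = list(cells)
    for i in range(len(cells) - 1):
        flux = d * (cells[i] - cells[i + 1])   # net flow from cell i to i+1
        new[i] -= flux
        new[i + 1] += flux
    return new


def run(cells, iterations, iter_kin=lambda c: c):
    """Apply the stages in the order given in the text.

    iter_kin defaults to a no-op placeholder because the actual kinetics
    are not described here. The trace is a stand-in for monitor(),
    absorbance() and report(): it records the right-most cell each step.
    """
    trace = []
    eluted_total = 0.0
    for _ in range(iterations):
        cells = [iter_kin(c) for c in cells]   # per-cell, no neighbors
        cells, eluted = migrate(cells)
        eluted_total += eluted
        trace.append(cells[-1])
        cells = diffusion_1(cells)
    return cells, trace, eluted_total


if __name__ == "__main__":
    column = [1.0, 1.0] + [0.0] * 18   # only the left few cells are loaded
    final, trace, eluted = run(column, 10)
    print(sum(final) + eluted)         # total material is conserved
```

Because both migrate() and diffusion_1() only move material between neighbors, the total in the column plus whatever has left the right end stays constant, which is a convenient sanity check for a sketch like this.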

The Parallelization

Parallelizing this program entails spreading the cells across multiple processors and replacing reads and writes of concentrations in non-local cells with communication. In addition, the monitor(), absorbance(), and report() routines are extracted into a global monitoring process with which the column processes communicate.

In the future we plan to look at load balancing issues in the code. In particular, cells with zero concentrations require far fewer computational resources per iteration than those with non-zero concentrations. Therefore, as concentrations migrate across the column, load imbalances may occur.
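The decomposition described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Fortran M implementation: the "processes" are plain lists updated one after another, the boundary values passed between neighboring blocks stand in for messages, and split_blocks, local_migrate, parallel_step, nonzero_load and the split fractions are hypothetical names and values. Only the migrate() stage is shown, with a single species per cell.

```python
def split_blocks(cells, nprocs):
    """Contiguous block decomposition; assumes nprocs <= len(cells)."""
    base, extra = divmod(len(cells), nprocs)
    blocks, start = [], 0
    for p in range(nprocs):
        size = base + (1 if p < extra else 0)
        blocks.append(list(cells[start:start + size]))
        start += size
    return blocks


def local_migrate(block, left, right):
    """Update one block using only local data.

    Returns the new block plus the amounts that cross its left and right
    edges, which is what would have to be communicated to the neighbors.
    """
    stay = 1.0 - left - right
    new = [stay * c for c in block]
    for i in range(1, len(block)):
        new[i - 1] += left * block[i]
        new[i] += right * block[i - 1]
    return new, left * block[0], right * block[-1]


def parallel_step(blocks, left=0.1, right=0.6):
    """One migrate() step over all blocks, then the boundary exchange."""
    # In a real run each call below would execute on its own processor.
    results = [local_migrate(b, left, right) for b in blocks]
    new_blocks = [r[0] for r in results]
    eluted = 0.0
    for p, (_, to_left, to_right) in enumerate(results):
        if p > 0:
            new_blocks[p - 1][-1] += to_left    # "send" to left neighbor
        else:
            new_blocks[0][0] += to_left         # assumption: reflecting end
        if p < len(blocks) - 1:
            new_blocks[p + 1][0] += to_right    # "send" to right neighbor
        else:
            eluted += to_right                  # leaves the column
    return new_blocks, eluted


def nonzero_load(blocks):
    """Rough per-block work estimate: number of cells with material."""
    return [sum(1 for c in b if c != 0.0) for b in blocks]


if __name__ == "__main__":
    column = [1.0, 1.0] + [0.0] * 18
    blocks = split_blocks(column, 4)
    print(nonzero_load(blocks))        # all of the work starts in block 0
    for _ in range(5):
        blocks, _ = parallel_step(blocks)
    print(nonzero_load(blocks))        # work has spread to the right
```

Each block needs only one value from each neighbor per step, so communication is small relative to the local work. The nonzero_load counts make the load-balancing concern concrete: with a static block decomposition, the blocks holding non-zero cells change as the material moves through the column.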