
1999 MCS Divisional Seminars & Colloquia


CUMULVS and Globus: Opening New Doors for Visualization, Computational Steering, and Fault Tolerance of High-Performance Scientific Simulations

James Arthur Kohl
Oak Ridge National Laboratory
Hosted by Nicholas Karonis

10:30 AM, September 21, 1999
Building 221, Room A-216


Abstract

Scientific simulation continues to proliferate as an alternative to expensive physical prototypes and laboratory experiments. Such software-based research and development provides a cost-effective means of exploring a wide range of input datasets and variations in physical parameters. Combined with ubiquitous network connectivity, these online experiments also provide a platform for collaboration with remotely located researchers, a feat not possible with traditional physical prototypes or experiments.

Much infrastructure is required to enable the development of these advanced computer simulations. Teams of scientists need to observe the ongoing progress of a simulation and share in its control. The user environment must withstand or recover from system faults and failures. Handling these issues efficiently requires expertise in computer science and a level of effort that the application scientist is typically unwilling to expend.

The CUMULVS system (Collaborative User-Migration, User Library for Visualization and Steering) provides an infrastructure for interacting with parallel and distributed simulations on the fly. Using CUMULVS, each member of a geographically distributed research team can dynamically attach their own front-end viewer program to the same running simulation. With their viewers, they can collaboratively monitor and control the simulation via interactive visualization and computational steering functions. The visual feedback from the simulation can provide the insight needed to alter the course of the computation and steer it toward the desired solution. CUMULVS also provides a simple user-directed checkpointing mechanism that periodically saves the state of the simulation program, for task migration or recovery from failures. Given semantic information provided by the application, CUMULVS can migrate and restart tasks across heterogeneous system architectures.

CUMULVS is being ported to work with the Globus system, on top of the Nexus communication substrate. This will increase the usefulness and applicability of CUMULVS by making it available to a larger user base. Nexus offers a rich and powerful interface for high-performance message passing, and a callback mechanism for fault tolerance. The data management offered by the Metacomputing Directory Service (MDS) in Globus can be used for application discovery. Together, CUMULVS, Nexus, and MDS will add capabilities to Globus to support state-of-the-art simulation science.

