Status: November 1998

Projects updated January 1999

Notes of the 1st International Workshop on
 
 

"Desktop Access to Remote Resources"

Java Grande Working Group on Concurrency/Applications

http://www.javagrande.org

and

Argonne National Laboratory, MCS Division

http://www.mcs.anl.gov
 
 
 
 

Meeting Web Site and Location

http://www-fp.mcs.anl.gov/~gregor/datorr

Argonne National Laboratory

Building 221, Room A-261

October 8-9, 1998
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Editorial Contact: Gregor von Laszewski, gregor@mcs.anl.gov

The list of participants and contributors can be found on the Datorr website and

will be included upon completion of the report.




Table of Contents
 
 

Table of Contents *

Meeting Notes to the "1st Workshop on Desktop Access to Remote Resources" *
Working Group Summaries *

Working Group 1: *
User/Application view of the System Architecture *
Definitions: *
User Scenario: *
Working Group 2: *
The Interface to desktop access: the Metacomputing API *
Fundamental Abstractions *
Services *
The Service API s *
Working Group 3: *
Secure Desktop Access to Remote Resources *

Project Descriptions *

Akenti – A Distributed Access Control System *
Arcade: A Web-Java Based Framework for Distributed Computing *
CIF *
Common Component Architecture Forum *
Condor *
CPU-Usage Monitoring with DSRT Sytem *
CUMULVUS *
Distributed Resource Management for DisCom2 *
DiscWorld *
EveryWare *
Globus: The Globus Grid Computing Toolkit *
GlobusJ *
Habanero *
Harness: Dynamic Reconfiguration and Virtual Machine Management in the Harness Metacomputing System *
IceT *
JINI *
Llava *
Legion System Architecture *
NCSA Workbench Project *
Netsolve *
Ninja *
Northrhine-Westphalian Metacomputer Taskforce *
PAWS *
PARDIS *
POEMS *
Sweb *
Teraweb *
UCLA Java Effort For Collaborative Computing *
UNICORE: Uniform Access to Computing Resources *
WebSubmit: A Web Interface to Remote High-Performance Computing Resources *
WebFlow: Web Based Metacomputing *


 
 

Meeting Notes to the "1st Workshop on Desktop Access to Remote Resources"

On October 8 and 9, 1998 the "1st Workshop on Desktop Access to Remote Resources" was held at Argonne National Laboratory in conjunction with the "Java Grande Working Group on Concurrency/Applications".

The term "remote resource" is defined to include:

This first meeting concentrated on issues related to compute resources. The goal of this meeting was to

In the first part of the meeting, short project presentations including Condor, Globus, UNICORE, Websubmit, Arcade, Webflow, Tango, DisCom2, Akenti, IGP, and others, gave an overview of the projects in order to initiate discussions. These discussions were continued during working group sessions in the second half of the meeting. Three working groups were defined according to the following topics:

The working groups tried to identify issues related to the design of an architecture, which makes a desktop access to remote resources possible. This included identifying a list of services, which is planned to be developed in order to enable a seamless desktop access. The working groups identified four types of user interfaces, which are related to the "users" accessing a metacomputing "Grid."

invoke a Grid based problem solving environment.

The interface-working group identified the need for fundamental abstractions like tasks and jobs, resources, events, file names and object handles. It was determined that services like resource services, accounting, notification and event services, transaction services, logging services, keep-alive services, collaboration services, and execution services have to be defined.

The user requirements group focused on identifying services that are needed by the users. The user requirements were ultimately included into the services identified by the interface group. A separation between the graphical user interfaces and the services to support the desktop access to remote resources were found to be important. Besides the development of component building tools such as Arcade, Gecco, and Webflow, the participants viewed the development of shell commands as very important.

The third group focused primarily on issues related to a "secure desktop access to remote resources." The security group had a short second meeting two weeks after the workshop to discuss issues related to secure data exchange, access control, authentication, as well as the need for simple administration.

During the workshop, it was decided to produce a report to be distributed at SC98 and available as a technical report from Argonne National Laboratory. The report contains a summary of the working group results, as well as, a collection of one-page descriptions of related projects. We expect the report to evolve over time, so that a distribution via WWW seems appropriate.

In order to help exchanging ideas, a mailing list (datorr@mcs.anl.gov) is established and a WWW is available (http://www-fp.mcs.anl.gov/~gregor/datorr). The WWW page will contain future announcements of this group, and is intended as a collection of resources including the online proceedings of the workshop (currently, a set of slides presented at the meeting).

Two follow up meetings are already scheduled. The first meeting takes place as "Birds of the Feather Meeting at SC98", while a second meeting on the 28th and 29th of January, 1999 takes place at Sandia National Laboratory, Albuquerque. In the next meeting, we would like to give other projects the opportunity to present their work. Future presentations may focus on how components developed by the project can be reused by other projects and how an integration of technologies between projects can be achieved. Nevertheless, the main focus of the meeting should be the discussions in the working groups.
 
 
 
 

Working Group Summaries

Working Group 1:

User/Application view of the System Architecture

In this section we examine the requirements for distributed access to remote resources from the perspective of the users. In particular we describe the various steps that a user might take in setting up and executing an application which requires both remote and local resources. Note that we do not discuss the requirements for supporting multiple users

collaborating to solve a single problem, rather we focus on a single user only. We first present some definitions and then describe a scenario from the user's perspective:

Definitions:

Resource:

A resource, represented by a "resource object", includes both hardware and software resources that might be required by the user for the application. These include, but are not limited to:

- compute engines, ranging from workstations to supercomputers

- networks

- storage devices

- visualization hardware

- instruments, e.g., wind tunnels, telescopes, etc.

- databases

- application codes and binaries

- libraries

- compilers

- debuggers

- performance monitors

- visualization tools
 
 

User:

We distinguish between at least two different types of users:

administrators/super users and end users. The former are responsible for the overall maintenance and control of the system while the latter utilize the resources to solve their problems. In this section we focus on the end users only. Each user will be represented in the system by a "user object" which will contain all information directly related to the user.
 
 

User workspace:

Each user of the system will have a personal environment which will represent a logical view of the global resources available to the users. This workspace will include both real and virtual resources which the user can access. The former includes resources which the user has direct control over, e.g., the desktop on which he/she is logged on, his/her programs, data etc. In addition to these resources, the

user may have access to shared resources such as other workstations, shared file systems, visualization equipment etc. Going beyond the local environment, the user may have access to other remote resources, such as supercomputers. The user's workspace represents an agglomeration of all such resources available to the user.

Task:

A task, represented by a "task object", can be defined hierarchically as follows:

A simple_task consists of the following:

Here "action" is a single unit of work.

Examples of a simple_task are:


A task is, then,


Job:

A job is an instantiation of a task and is represented by a "job_object." In other words, a job is generated by binding the resources required to execute the task including data resources for input and output and computing resources for executing the actions embodied by the task.
 

User Scenario:

Consider a scenario in which a user is trying to execute an application to solve the problem at hand. We describe here a set of phases/stages that the user must go through in order to accomplish the task. We assume that the problem at hand requires a complex set of

actions executed on several local and remote resources. In the ensuing description, we focus on the services required by the user at each stage of the whole process and do not discuss interface issues. We

presume multiple different front-ends can (and will) be built which rely on the underlying services to allow access to local and remote resources. Also, note that we focus here only a single user ignoring the issues of collaboration and cooperation that may arise if multiple

users have to coordinate their actions in order to solve the problem.

Login to the framework:

Before a user can start using the resources of the system, he/she has to login. This requires a authentication process which will establish the identity and the credentials of the user This process would also generate a user object to represent the user and store any information pertinent to the user. We assume that the user will

have a single system-wide "id" which will be used for later authorization to access local and remote resources. This may require mapping from the id to a local id to utilize resources under separate jurisdictional domains.

Managing the user workspace:

The user requires mechanisms to manage his/her personal workspace. Thus the user may want to add new resources, delete resources which are no longer available and may even want to monitor the current status of the resources. Such monitoring can be envisioned to be continuous, on demand or even at user defined significant events.

Task construction:

A task is a complex entity which needs to be specified before it can be utilized to solve the problem at hand. This process is mainly an interface issue, i.e., what kind of interface is available to the user for setting up the task. For example, a script-based system would allow a

user to specify all the parts on a textual basis. On the other hand, a visual system would allow cf-users to drag and drop icons representing pre-defined tasks and connect them graphically to indicate dependencies. Essentially this construction requires the ability to peruse and load pre-defined tasks from the personal workspace. The result of

this process is a task object which hierarchically represents the actions to be performed, their input/output requirements and the dependencies between these actions.

Job submission:

Once a task has been created, it can be submitted for

execution. The major issue here is resources allocation. Some resources maybe specified explicitly by the user. For example, the user may bind a specific data file to an input, or may specify that a particular code be executed on a specific computing resource. However, we presume that a major portion of the resource allocation, in particular, the computing resources will be automatically allocated by the underlying system. Thus, we envision a matchmaker" which attempts to find the "best" match between the requirements and constraints of the task to the capabilities, constraints and the status of the resources in the personal workspace of the user. Several issues arise in this process. The definition of "best" maybe dependent on the user/task and maybe a part of the requirements. For example, some jobs maybe optimized for throughput while others maybe optimized for performance. Also, the match making process maybe completely static, i.e., all resources are chosen before the job starts executing. On the other hand, the process maybe more dynamic in the sense that the resources for a sub_task are bound only when the task is ready for execution. The latter would allow the system to take into account the dynamic nature of the workspace. Since, in general, finding an exact match may be difficult, that underlying system may provide a negotiation capability which would allow the user (or an agent process) to make intelligent trade-offs in the resource allocation process.

The result of the job submission would be a "job object" which would encompass all the information about the job. It is the underlying system's responsibility to manage the execution of the job including the management of the intermediate data. A data-flow paradigm could be used to determine when a action needs to started on designated set of resources. Starting a action on a remote resource may require authorization for the user to utilize the resource. Similarly, an accounting module may keep track of the amount of resources used.

Job Management:

As the job progresses, the underlying system needs to provide support for management of the jobs. This includes monitoring the execution status of the job, i.e., which sub_tasks have been completed and which are waiting. The user should also have the capability of suspending/restarting, checkpointing, and killing the jobs.

In addition to monitoring the execution progress of the job, the user may want to monitor the intermediate data being produced by the sub_tasks. Based on this, the user may want to steer the computation. Such steering may take different forms. For example, at the outermost level, the user may replace the execution code to be used for a task in order to control the fidelity of the simulation. Similarly, the user may want to change the arguments input to a code. At finer level, the user may want to interact directly with an executing code changing the intermediate data values being used by the code. Thus, the underlying system has to support multiple levels of steering support.

In the above paragraphs, we have described the various steps a typical user would take in order to construct and execute a job. We have also tried to indicate the support that the underlying system has to provide at each stage. This is a preliminary document. A much more in depth study is required in order to provide a detailed set of requirements from the user's perspective.
 

Working Group 2:

The Interface to desktop access: the Metacomputing API

Each metacomputing system provides a set of services that constitute the components of the system architecture. While these services differ slightly from one architecture to another, there are enough shared characteristics that it is possible to abstract them and provide a "standard" API from which desktop user tools may be constructed. In fact, there are four types of user interfaces:

  1. The End User who wishes to submit a single job, multiple instances of a job, access a collaboration or invoke a Grid based problem solving environment.
  2. The System Administrator who is responsible for the installation of grid tools and the addition of new resources into the metacomputing system
  3. The developer of applications that take advantage of advanced grid services
  4. The owners of the metacomputing resources that comprise the Grid.

In the paragraphs that follow we describe the common services and data abstractions and the associated APIs needed to build the desktop tools for these classes of users. Note that we have not identified all the services nor all interactions between services that go into building a metacomputing system. Rather we have focused on those services for which there are user interface interactions and dependencies. Furthermore we most certainly have omitted or oversimplified in many cases. Suggestions are always welcome.

Fundamental Abstractions

There are many basic data type classes that go into the design of a metacomputing system. There are, however, four core data types that are fundamental to submitting and managing jobs on the Grid.

Services

The following are the core services that must be abstracted by the APIs.

Resource Services.

There are three basic resource services.

Java Jini has a complete and sophisticated resource discovery system. Globus MDS provides the discovery system in that environment. In Condor, the matchmaker service proved a different model for resource discovery.

Accounting.

Accounting services can be derived from resource monitoring, but they are so important that we list them separately.

Notification and Event Services.

Resources such as compute services and networks come and go from the system. As the system changes state events are generated and logged. In addition, Jobs change state as they progress (or fail). Jobs can be preempted or rescheduled as systems fail or resources change state. In addition, tools such as performance monitors can be driven by event services. Distributed debuggers also require event monitoring. Desktop applications can register to be notified of events of many different types.

Transaction Services

Basic transaction services assure consistency in distributed systems. From the perspective of desktop access to the grid, tasks such as job submission require consistency: when a job is submitted, we need to know that one and not two or zero instances were created. When a special resource is leased for a job and later released this requires a reliable transaction.

Jini uses a classical two-phase commit protocol. Other systems provide similar services.

Logging Services

Not all desktop applications can be assumed to be actively monitoring events. In some cases, the application, when invoked would consult a log file for a record of events and transactions. The logging service provides a standard protocol for recording and storing event and system logs that can be access by other tools.

Keep-Alive Service

Some systems, such as directory services and file systems must be kept alive. When their agent processes die, it is essential that they be restarted. Keep-Alive services periodically check to make sure that critical functions are still operational.

Collaboration

Collaboration Services provide the backbone with which multiple, simultaneous users can share desktop tools.

Communication services

Application need to communicate with each other, the filespace as well as user tools. For example, standard error and I/O streams as well as specialized communication protocols such as Globus Nexus must be provided as a service that is available to the desktop system.
 
 

"Do it" Services

The basic task of interacting with a job in the system takes several different forms. The simple transaction of running a job is one. In other cases, more complex interaction is required and uses shell like services. For example, pausing, killing a job. In other cases the job execution can be managed through scripting interfaces. In both cases, these are command interpreters which provide the desktop user fine-grained control over execution.

The Service API s

The APIs listed below provide the interfaces to the services that will allow us to build the desktop tools. In many cases these are in one-to-one correspondence with the corresponding system services and in other cases they represent a functional interface that interacts with more than one of the core services. The important point is that different metacomputing environments differ in the way they implement or organize services, but the API should be consistent across different platforms to make it possible to design portable, seamless tools.

Job initiation and execution management API

This API provides the basic mechanisms to start or enqueue a job for execution. It also provides the interface to the basic runtime management of job. For example, killing or suspending a job, checkpointing or serializing a job, modifing a job's running properties.

In addition this API should provide a mechanism for application specific user interface extensions such as attaching a GUI for steering or attaching a debugger.

Event and Notification APIs

Some applications and most system components will generate events. This API allows user interface tools to register as listeners or to receive notification. In addition for more passive tools, there is a need to inquire about past events or to see event logs.

Job and Resource Status and Monitoring API

Given a handle to an abstract job, what is its status? Given a resource object, what is its status?

This interface also provides for tools that provide dynamic performance monitoring.

Security

The security interface is defined by the security subgroup and user requirements group.

Transaction Service API

This interface is a standard, two-phase commit protocol such as provided by Jini.

Informational

There are three parts to the informational API.

Working Group 3:

Secure Desktop Access to Remote Resources

The notes of this working group are attached at the end of this paper.
 
 
 
 

Project Descriptions
 
 

Akenti – A Distributed Access Control System

Srilekha S. Mudumbai, William Johnston, Mary R. Thompson, Abdeliah Essiari, Gary Hoo, Keith Jackson

http://www-itg.lbl.gov/Akenti

Lawrence Berkeley National Laboratory, Berkeley, CA 94720

Introduction

Akenti is an access control system designed to address the issues raised in managing access to distributed resources that are controlled by multiple remote stakeholders. Akenti enables stakeholders securely to create and to distribute instructions authorizing access to their resources. Akenti makes access control decisions based on a set of digitally signed documents that represent the authorization instructions and the relevant user characteristics. Public-key infrastructure and secure message protocols provide confidentiality, message integrity, and user identity authentication, during and after the access decision process.

Akenti Access Control Fundamentals

The resources that Akenti controls may be information, processing or communication capabilities, or physical systems such as scientific instruments. The access that is sought may be to obtain information from the resource, to modify the resource, or to cause that resource to perform certain functions. Remote access to a resource is typically provided by a network-based server acting as a proxy for the resource. A user gains access to the resource via a client program. The client participates in the following authentication and verification steps before gaining access to the resource.

Components of the Model

Authority file: This is information stored on the resource server associated with each resource. It consists of relatively static information that specifies who can provide access information and where this information is stored and what X.509 Certificates Authorities are trusted.

Identity (X.509) certificates: These certificates bind an entity’s name to a public key. They are stored in LDAP (Lightweight Directory Access Protocol) directory servers from which Akenti obtains them to verify a subject’s identity.

Use-condition certificates: These are signed documents, remotely created and stored by resource stakeholders, that specify the conditions ("policy") for access to the resource.

Attribute certificates: These are signed documents, remotely created and stored by a third party trusted by the stakeholder, that certify that a user possesses a specific attribute (for example, membership in a named group, completion of a certain training course, or membership in an organization).

Implementation Status

Web Server

Currently, Akenti is used by the SSL-enabled Apache web server in order to provide policy based access control. The Apache server’s native access control mechanism has been replaced by an interface to the Akenti policy engine. This server is being used to provide access control of experimental results for the DOE Diesel Combustion Collaboratory

CORBA ORB

Akenti access control has been added to a CORBA ORB (Orbix including the SSL communication protocol). The DOE MMC Collaboratory is planning to use this ORB to launch a microscope control server. An ORB defines a callout function that is invoked whenever a request is initiated. This callout function is used to invoke the Akenti policy engine that identifies stakeholders and acquires, validates and analyzes the various certificates The result of this analysis will either permit the requested access or deny it, depending on whether or not the user satisfies the use conditions.

Future Work

Future work will include deploying a mobile agent system with Akenti access control; integrating Akenti access control with Globus resource management software in a secure compute server and a network bandwidth broker; and integrating Akenti access control with Sandia’s PRE CORBA-based remote invocation system.
 
 

Arcade: A Web-Java Based Framework for Distributed Computing

Arcade is a Web/Java based framework designed to provide support for a team of discipline experts to collaboratively design, execute and monitor multidisciplinary applications on a distributed heterogeneous network of workstations and parallel machines. This framework is suitable for applications such as the multidisciplinary design optimization of an aircraft. Such applications in general consist of multiple heterogeneous modules executing on independent distributed resources while interacting with each other to solve the overall design problem.

The Arcade architecture consists of three-tiers:

First Tier The first tier provides the user interface to the system. It comprises of the application specification interface, the resource allocation and execution management interface, and the monitoring interface.

The application design interface provides both visual and textual (script-based) mechanisms for the hierarchical specification of execution modules and their dependencies. In addition to normal executable modules, the system supports hierarchical modules, SPMD modules for MPI-based codes, and control structure modules, e.g., conditional and loops.

The resource allocation and execution interface provides support for specifying the hardware resources required for the execution of the application. The resources are currently specified explicitly by the user. We envision a system where the resources are automatically allocated based on the current and predicted loads of the system and the characteristics of the application. The system will also allow the users to choose the input/output files and any command line arguments for the modules prior to starting the execution.

The monitoring and steering interface will allow users to monitor both the progress of the execution and the intermediate data flowing between the module. It will also allow the users to steer the overall computation by modifying data values and also replacing execution modules to control the fidelity of the computations.

Middle Tier The middle tier consists of logic to process the user input and to interact with application modules running on a heterogeneous set of machines. The overall design is a client-server-based architecture in which the Interface Server interacts with the front-end client to provide the information and services as needed by the client.

A Java Project object internally represents each application. The Project object, consisting of a vector of module objects, is the central object in our framework. All the information related to the application, both static and dynamic is stored within this object. When the user requests the execution of an application, the User Interface Server passes the corresponding Project object to the Execution Controller (EC). It is the EC’s responsibility to manage the execution and the interaction by firing up user modules on the specified resources as and when dictated by the dependencies specified by the user.

Third Tier The third tier consists of Resource Controllers (RC) and User Application Modules. Each active resource in the execution environment is managed by a RC which is responsible for launching the User Application Modules on the resource and also for interacting with the Execution Controller in order to keep track of the executing applications.

The main advantage of a three-tier system is that the client or the front-end becomes very thin, thus making it feasible to run on low-end machines. Also, since most of the logic is embodied in the middle tier, the RCs can be kept lightweight thus keeping the additional loads on the executing machines to a minimum also.

The overall goal is to design an environment which is easy to use, easily accessible, portable and provides support through all phases of the development and execution of multi-disciplinary applications. We plan to leverage off of commodity technologies, such as the Web and Java, to implement various parts of the environment. The current prototype of the system is capable of executing distributed applications on a network of resources in a single domain interacting with each other through files. We are currently expanding the system to incorporate many new facilities, including multi-domain execution and collaboration support in all phases of the design and execution of the overall application. More information on the Arcade system can be found at http://www.icase.edu/arcade.

This research is being supported by the National Aeronautics and SpaceAdministration under NASA Contract No. NAS1-19480.

 

CIF -- DOE2000 Collaboratory Interoperability Framework Project

http://www-fp.mcs.anl.gov/cif/

The DOE2000 Collaboratory Infrastructure Framework project is investigating the software technologies required to support the development of DOE collaboratories and the interoperation of collaboratory components and tools. The goal of the project is to improve software quality, reduce duplication of effort, enhance interoperability, and promote interlab cooperation by developing a common software infrastructure that can be shared by many projects. This infrastructure will provide essential distributed computing functions such as resource location, data transport, security, and multicast services.

Common Component Architecture Forum (CCA Forum)

http://www.acl.lanl.gov/cca-forum/

The Objective of the Common Component Architecture Forum (CCA Forum) is to define a minimal set of standard features that a High-Performance Component Framework has to provide, or can expect, in order to be able to use components developed within different frameworks. Such standard will promote interoperability between components developed by different teams across different institutions. This document will explain the motivation and assumptions underlying our work and monitor our progress.

 

CUMULVUS

http://acts.nersc.gov/cumulvs/main.html

CUMULVS (Collaborative User Migration User Library for Visualization and Steering) is a software framework that enables programmers to incorporate fault-tolerance, interactive visualization and computational steering into existing parallel programs. The CUMULVS software consists of two libraries -- one for the application program, and one for the visualization and steering front-end (called the "viewer").

CPU-Usage Monitoring with DSRT Sytem

Webpage: http://dast.nlanr.net/Projects/monitor.html

Klara Nahrstedt, CS dept, UIUC <klara@cs.uiuc.edu>

Kai Chen, NLANR/DAST, NCSA <roland@nlanr.net>

Roland Geisler, NLANR/DAST, NCSA <kchen@nlanr.net>

This is the project is a joint project is in its initial stage. The plan is to produce a detail design at the end of this year, and start coding from next year. It is supposed to finish by May 1999.

Description

DSRT is a dynamic soft real time system sitting on top of a general purpose UNIX system with real time extension. It supports several service classes for multimedia applications based on their CPU usage time pattern [1]. This project is to design and implement a distributed CPU-usage monitoring tool with DSRT system under the context of the Globus environment. Currently Heartbeat Monitor (HBM) is running as a process monitoring tool in the Globus infrastructure. It collects some basic status information for a process, such as active, blocked and abnormal [2]. We plan to take the architecture of HBM, extend its functionality to work with DSRT system, and report more status information for a real time process. Currently we plan to get the following information from the DSRT system for different service classes:

PCPT (Constant Processing Time): Period, Peak Processing Time;

VPT (Variable Processing Time): Period, Sustainable Processing Time, Peak Processing Time;

OCPT (One-Time Constant Processing Time): Start Time, Interval, Peak Processing Time;

ACPT (Aperiodic Constant processing Time): Peak Process Rate.

References:

[1] Hao-hua Chu and Klara Nahrstedt, CPU Service Classes for Multimedia Applications, Technical Report UIUCDCS-R-98-2068, UILU-ENG-98-1730, Department of Computer Science, University of Illinois at Urbana Champaign, August, 1998.

[2] Globus Heartbeat Monitor, http://www.globus.org, October, 1998.
 
 

Distributed Resource Management for DisCom2

The Distance Computing and Distributed Computing (DisCom2) program is a DOE multi-lab program that complements the mission of DOE's Accelerated Strategic Computing Initiative (ASCI). The ASCI program was established to create the computational modeling and simulation capabilities essential to shift from nuclear and nonnuclear test-based methods of assessment and certification to computation-based stewardship of the enduring nuclear stockpile. The DisCom2 program will accelerate the ability of the Defense Programs complex to remotely access the high-end and distributed computing resources by creating a simulation intranet. Distance computing emphasizes remote access to the ASCI-class supercomputers for capability computations, where the goal is to provide maximum resources to a single computation. Distributed computing emphasizes remote access to other distributed resources for capacity computations, where the goal is to provide some resources to the maximum number of computations.

The Distributed Resource Management (DRM) project will provide the capabilities and services needed to access and manage the high-end simulation resources dispersed throughout the DP complex. The simulation intranet includes high-performance storage, network, and visualization resources that must be managed along with the computing resources. Other remote resources envisioned include databases and utility executables such as file format conversion routines. To achieve its goal, the DRM project is divided into several tasks:

Condor

Miron Livny, miron@cs.wisc.edu

http://www.cs.wisc.edu/condor

http://www.cs.wisc.edu/condor/publications.html

Building on the results of the Remote-Unix (RU) project and as a continuation of our work in the area of Distribute Resource Management (DRM), the Condor project started at the Computer Sciences Department of the University of Wisconsin-Madison in 1988. Following the spirit of its predecessors, the project has been focusing on customers with large computing needs and environments with large collections of heterogeneous distributed resources. For more than a decade, we have been developing, implementing, and deploying, software tools that

can effectively harness the capacity of hundreds of distributively owned workstations. The workstations can be scattered throughout the globe and may be owned by different individuals, groups, or institutions. Using our software, the Condor resource management environment, scientists and engineers have been simultaneously and transparently exploiting the capacity of computing resources they are not even aware exists. Condor provided them with a single point

of access to computing resources scattered throughput their department, college, university or even the entire country.

For many experimental scientists, scientific progress and quality of research are strongly linked to computing throughput. In other words, most scientists are less concerned about the instantaneous performance of the environment (typically measured in Floating Point Operations per Second (FLOPS)). Instead, what matters to them is the amount of computing they can harness over a month or a year --- they measure the performance of a computing environment in

units of scenarios per day, wind patterns per week, instructions sets per month, or crystal configurations per year. Floating point operations per second has long been the principal metric used in High Performance Computing (HPC) efforts to evaluate their systems. The computing community has devoted little attention to environments that can deliver large amounts of processing capacity over long periods of time. We refer to such environments as High Throughput Computing (HTC) environments. We first introduced the distinction between HPC and HTC in a seminar at the NASA Goddard Flight Center in July of 1996 and a month later at the European Laboratory for Particle Physics (CERN). In June of 1997 HPCWire published an interview with M. Livny on High Throughput Computing. A detailed discussion of the salient characteristics of a HTC environment can be found in our contribution in "The Grid - Blueprint for a New Computing Infrastructure," I. Foster and C. Kesselman editors).

The key to HTC is effective management and exploitation of all available computing resources. Since the computing needs of most scientists can be satisfied these days by commodity CPUs and memory, high efficiency is not playing a major role in a HTC environment. The main challenge a typical HTC environment faces is how to maximize the amount of resources accessible to its customers. Distributed ownership of computing resources is the major obstacle such an environment has to overcome in order to expand the pool of resources it can draw from. Recent trends in the cost/performance ratio of computer hardware have placed the control (ownership) over powerful computing resources in the hands of individuals and small groups. These distributed owners are willing to include their resources in a HTC environment only when they are convinced that their needs will be addressed and their rights protected. The needs of these owners have been always a key factor in our research, development and implementation activity.

Condor is based on a novel approach to HTC that follows a layered Resource Allocation and Management software architecture. Depending on the characteristics of the application, the customer, and/or the environment, different layers are employed to provide a comprehensive set of Resource Management services. Condor was originally designed to operate in a Unix environment. For the last two years we have been engaged in an effort to port Condor to the NT environment. Our goal has been to develop a fully integrated HTC resource management system that can effectively harness the resources of computing environments that consists of both Unix and NT resources. We have recently added to Condor support for Symmetric Multi Processors (SMP) and interfaces to HPC batch schedulers.

The Condor project is an ongoing Research and Development project in which research in distributed resource management is performed in the framework of a team effort in which production software is development, deployment and supported. Scientists and engineers from a wide range of disciplines have employed Condor in their daily work and have been collaborating with us on porting their applications to benefit form the computing power offered by Condor and on enhancing the capabilities of Condor. Without these collaborations, Condor would have been a much less effective resource management system.
 
 
DiscWorld 

http://www.dhpc.adelaide.edu.au/projects/dhpci/index.html

This project is developing a middleware infrastructure for distributed high-performance computing applications.

The Distributed Information Systems Control World (DISCWorld) is a smart middleware system designed to integrate processing and storage resources across wide area heterogeneous networks, exploiting broadband communications where available.

Metacomputing has come to mean the ``integration of distributed computing resources so that a user connected to a single platform can enjoy the functionality and performance of the whole system with some degree of transparency''. The term implies more than distributed computing or clustered computing and often involves a ``meta-level'' of software above the individual operating systems of the component hosts that provides the glue to enable transparent access for users. The term metacomputing usually implies interactions across computing resources that would otherwise be uncoupled at the operating systems level, and often also implies interactions across wide areas. A number of metacomputing environments and software packages have been developed recently by other researchers, each addressing different aspects of the problem. Our Distributed Information Systems Control World (DISCWorld) system for wide-area, service-based, high-performance metacomputing, and review it in the context of the current research issues.

This project provides a framework for many of the threads of research in high performance and distributed computing that we are carrying out both in the DHPC Group and under the OLDA program of the ACSys CRC. The sub-projects include:

Parallel Computing and Cluster Development
Networks Evaluation and Benchmarking
Distributed Storage Systems Management Software

 
 

EveryWare

http://nws.npaci.edu/EveryWare/

A Toolkit for Building Adaptive Computational Grid Programs

The goal of performance-oriented distributed computing is to harness widely dispersed, independently controlled resources into a "Computational Grid" that supplies execution cycles the way a power company supplies electrical power. Since the resources are independently managed by their individual owners, it is not possible to assume that a uniform software infrastructure will be installed at every potential execution site. EveryWare is a set of software tools designed to allow an application to leverage resources opportunistically, based on the services they support. By using the EveryWare toolkit, a single application can combine simulatneously combine the useful features

offered by

Java applet technology

to form a temporary virtual Computational Grid for its own use. Each of these infrastructures offers services that can be combined by an application profitably. EveryWare allows the application to gauge and harness the power that is available to it at each potential execution site, based on the type of the resource at hand, and the software

infrastructure it supports.
 
 

EveryWare Components

Everyware consists of five services:

which are implemented to use the the advantageous features of what ever infrastructure is present. For example, if the Globus GRAM and GASS facilities are available, EveryWare uses them for process management. If Condor is present, EveryWare uses the submission interface to launch application processes. The low-level software primitives are written to be extremely portable so that new infrastructures and architectures can be incorporated quickly.

To gauge the performance value of a particular resource, an

EveryWare application can collect performance information

"on-the-fly" and use dynamic prediction techniques to pick a best resource and recognize emerging performance trends. These facilities work at the application level so that the overheads introduced by any resident infrastructure are considered.

Scheduling is a service provided by EveryWare in the form of ORAnGS. Application components my employ a scheduling service to organize themselves into autonomous groups. Unlike other schedulers, however, ORAnGS are servers rather than controllers. Applications request scheduling service from them as opposed to submitting themselves as slaves to a master scheduler.
 
 

EveryWare also allows application components to register themselves with a distributed, adaptive state synchronization facility. As state propagates through the system, components are notified of state updates automatically.
 
 

Why use EveryWare?

Computational Grids are dynamic environments. Not only does the performance vary as a function of resource load (due to contention) but resources are added and removed from the Grid continuously. It is, therefore, not possible to assume that a consistent Grid-wide infrastructure will be in place ubiquitously. New, experimental architectures will be available long before sophisticated Grid services are ported to them. Operating system and software revision levels are often incompatible. Resource owners may wish to temporarily contribute their resources, but not to install and maintain new infrastructure to do so. EveryWare allows a single application to combine low-level "vanilla" operating systems facilities (like those provided by Unix) with sophisticated and robust metacomputing facilities (Globus, Condor, Legion) and ultra-portable execution environments (Java). The goal is to allow the application to use a computational resource regardless of what infrastructure it supports.
 
 

Is EveryWare for Everyone?

An EveryWare application claims and discards resources dynamically, based on their perceived performance value. If ORAnGS supplies scheduling control, the application dynamically molds itself to the changing conditions. Not all application classes will be able to achieve high-performance levels using global resources. For those applications that require coarse-grained, loose synchronization and can migrate state efficiently, EveryWare is an appropriate option.
 
 

Globus: The Globus Grid Computing Toolkit

What is Globus? Globus is a research and development project that targets the software infrastructure required to develop usable high-performance wide area computing applications. Globus research targets key challenges that arise when developing innovative applications in wide area, multi-institutional environments, such as communication, scheduling, security, information, data access, and fault detection. Globus development is focused on the Globus toolkit, a set of services designed to support the development of innovative and high-performance networking applications. These services make it possible, for example, for applications to locate suitable computers in a network and then apply them to a particular problem, or to organize communications effectively in tele-immersion systems. Globus services are used both to develop higher-level tools (e.g., the CAVERNsoft teleimmersion system, the WebFlow problem solving environment) and directly in applications (e.g., collaborative data analysis).

What is the Globus Toolkit? The Globus toolkit provides services for security, resource location, resource allocation, resource management, information, communication, fault detection, network performance measurement, remote data access, and distributed code management. These services are implemented by a set of servers that run on (typically high-end) computing resources and client-side application programming interfaces (APIs). Globus services are designed to hide low-level details of resource characteristics (e.g., architecture, scheduler type, communication protocols, storage system architecture) while allowing applications to discover and exploit low-level information when this is important for performance.

How does this relate to seamless computing? Globus client-side APIs can be used to construct desktop applications that authenticate once, then locate resources in a network, determine properties of those resources, request allocations on those resources, transfer data among those resources, initiate computations on those resources, communicate with those computations, and monitor their execution. These desktop applications may be written in C or C++ or, thanks to the development of Java versions of many Globus APIs, in Java.

Who is involved in Globus? Globus research and development is centered at Argonne National Laboratory and the University of Southern California’s Information Sciences Institute, but also includes many partner institutions. Many partners form part of the Globus Ubiquitous Supercomputing Testbed Organization, an informal consortium of institutions interested in high-performance distributed computing.

For more information see www.globus.org or contact info@globus.org
 
 

GlobusJ

Globus J is a collection of reusable Java components which allow access to a selected set of Globus services including Job submission, remote file staging. A simple graph based editor allows creating a dataflow graph of jobs submitted to the Globus metacomputing toolkit. These components are not distributed with the Globus metacomputing toolkit due to their experimental nature. A small subset of components is expected to be released after SC98.
 
 

Habanero

Ed Grossman, Terry McLaren, Larry Jackson

jackson@ncsa.uiuc.edu tmclaren@ncsa.uiuc.edu egrossma@ncsa.uiuc.edu

NCSA Habanero is a Java-based framework for synchronous collaboration. Habanero makes it possible for scientists who are not physically co-located to work on projects together in real time. They may access, analyze, and visualize scientific data, discuss issues, view documents and images, and run scientific computation and analysis tools collaboratively. Habanero also provides the ability for instructors to conduct classes, run help sessions, or distribute pre-recorded classes for students to take over the Internet.

The Habanero package includes a server, a client, several collaborative applications (Hablets), and persistence support modules. When Habanero is running, the server component keeps track of the active collaborative sessions under its control, maintains information about active participants, tracks the active Hablets, and manages event traffic in the system, and enforces each Hablet's "rules of operation". All Habanero clients display the same thing to all participants in a given session, all at the same time (synchronously). Habanero sessions may be recorded by any participant and played back. Third party developers may construct Hablets for any purpose by using the Habanero Developer's API. Current and future development centers on

Persistence: Persistent sessions may be created, used, stored, searched, suspended, and resumed from their previous state. Participant information becomes persistent also. Both session and participant information may be accessed from a directory service (e.g., LDAP). New sessions can be constructed from subsets of existing sessions (branching) or created from the intersection of portions of existing sessions (merging).

Blended Asynchronous Collaboration: This work includes asynchronous session membership, event filtering to external destinations such as email and Automated Assistant Agents, event receipt and handling from approved external Automated Assistant Agents, and notification.

Collaborative Distributed Session Replay: This provides the capability to play back captured sessions from an active session.

Jini Services: This includes the construction of Jini Services and service framework useful to the NCSA Habanero project, and the extension of the Habanero framework to use the Jini services. Useful services include persistence, directory/lookup services, event monitoring with notification, and email/ftp/news services.

3rd Party Developer Aid: Applet-to-Hablet construction, hablet customizing, and hablet bean (Java Bean) construction wizards will be developed.
 
 

Harness: Dynamic Reconfiguration and Virtual Machine Management in the Harness Metacomputing System

Mauro Migliardi, Jack Dongarra, Al Geist, and Vaidy Sunderam

om@mathcs.emory.edu

Harness is an experimental metacomputing system based upon the principle of dynamically reconfigurable networked computing frameworks. Harness supports reconfiguration not only in terms of the computers and networks that comprise the virtual machine, but also in the capabilities of the VM itself. These characteristics may be modified under user control via a Java based on demand "plug-in" mechanism that is the central feature of the system.

The overall goals of the Harness project are to investigate and develop three key capabilities within the framework of a heterogeneous computing environment:

Early experience with small example programs show that our system is able:

IceT

Paul Gray, gray@mathcs.emory.edu

Math/Computer Science Department, Emory University, Atlanta, GA

150 North Decatur Building

1784 N. Decatur Rd.

Atlanta, GA 30322

http://www.mathcs.emory.edu/~gray,
http://www.mathcs.emory.edu/icet

At the present, the respective programming models, tools and environments associated with Internet programming and parallel, high-performance distributed computing have remained detached from one another. The focus of the IceT project has been to bring together the

common and unique attributes of these areas, the result of which is a confluence of technologies and a parallel programming environment with several novel characteristics. The resulting combination of technologies provides users with a parallel, multi-user, distributed programming environment; upon which processes and data are allowed to migrate and to be transfered throughout owned and unowned resources, under security measures imposed by owners of the local resources.

The IceT environment provides users the ability to dynamically configure a distributed environment to suit the specific needs of the application, not the other way around. Process management is provided using IceT's unique soft-installation abilities which give native codes significant Java-like portability and dynamism.

The traditional paradigms associated with message-passing have been extended in IceT to include object- and process-passing. As a result, the environment is able to support nomadic processes and is able to dynamically load components into the environment to support the particular needs of the application.

In IceT, individual environments of multiple users are able to be merged together to form larger resource pools, over which processes may be mutually installed using IceT's soft-installation mechanism. Several examples which have shown proof of concept include the soft-installation of C-based MPI processes on remote environments (along with the supporting MPI library calls) and the dynamic installation of Fortran-based PVM computations running cooperatively across distinct networks and filesystems, supported by IceT's process management and message-passing API.
 
 

JINI

http://java.sun.com/products/jini

Overview of the SUN Jini Technology & its Application in PSEs

Jini is a platform-independent, Java-based mechanism for providing services to clients over the network. Services are components for accessing and utilizing hardware and software

resources over a network. A service can correspond to a network device such as a computer or printer, a software resource such as a database, or a higher-level composite functionality.

Jini uses and extends the existing Java programming language base. It extends the scope of Java from a single machine to a network of machines.

Jini is a componentized system. Its components function outside the scope of Jini, but together, they provide a platform for distributed computing. The components include

The Jini Lookup Service, Discovery Protocols, and Schema together allow services to describe themselves, advertise their existence, and make themselves available for use over the network. Each Jini service registers with a Jini lookup service and provides a proxy. Later, when a client makes a service request, the Jini lookup service will determine whether it has a matching service available. If a match is found, the Jini lookup service sends a copy of the matching service's proxy to the client. The client then uses the proxy to access the service.

The Event handling interface extends the JavaBeans event model for use in a distributed computing environment.

Leasing allows resource allocation and other operations to occur on a timed basis. If the "timer" is allowed to expire, the resource is released, or the operation is cancelled.

The transaction interface ensures that either all of the operations in a given operation set occur atomically, or none of them occur. It uses a two-phase commit mechanism.

JavaSpaces supports the flow of objects between participants in a distributed system, and in doing so, supplies distributed object persistence. Clients of a JavaSpace export objects and events for other clients to use and consume as needed.

The Jini technology is potentially very useful in the construction of scientific workbenches and other problem solving environments (PSEs). First, Jini is runs on any platform that supports a Java VM. Second, Jini's service-based architecture can give PSEs great flexibility. A PSE could use any service it could discover and access, so its functionality is easily made dynamic. Then, services developed for one purpose could be used elsewhere, so redundant development is reduced. Finally, Jini could be used to make existing legacy programs visible and usable over the network.

A Java-based PSE framework could be developed that supports the development of PSE services and provides a bridge to other distributed computing environments such as Globus. A Jini-to-Globus bridge would allow the use of both Globus and Jini services in the same PSE. An workbench-service construction kit (Java packages) would provide abstractions of different kinds of workbench services and service components. The workbench construction kit would also include components for interfacing with non-Java code. And, the PSE framework would contain generic services useful in PSE development, such as Visualization, Notification, Sequencing/Scripting, Data Format Translation, Collaboration, and Persistence.
 
 

Llava

http://www.tc.cornell.edu/UserDoc/SP/Llava_info.html

Llava is a java applet used to monitor the status of the SP and the job queue through a secure interface. Llava was developed by Bill Stysiack and Dave Lifka. It is based on the X-Windows tool called Llama developed by Andy Pierce of IBM.

Ligature: Component Architecture for High-Performance Computing

Kate Keahey,  kate@acl.lanl.gov

http://www.acl.lanl.gov/ligature/

Ligature is ACL's new research project in component architecture development. The project has two goals: (1) to develop a set of abstractions and techniques for high-performance scientific component programming and (2) to provide and process performance information within that environmetn to enable performance-guided design.

Legion System Architecture

http://www.cs.virginia.edu/~legion

Legion is an object-based system that empowers classes and metaclasses with system-level responsibility.

Philosophy

Legion users will require a wide range of services in many different dimensions, including security, performance, and functionality. No single policy or static set of policies will satisfy every user, so, whenever possible, users must be able to decide what trade-offs are necessary and desirable. Several characteristics of the Legion architecture reflect and support this philosophy.

The Model

Legion objects are independent, logically address- space-disjoint active objects that communicate with one another via non-blocking method calls that may be accepted in any order by the called object. Each method has a signature that describes the parameters and return value, if any, of the method. The complete set of method signatures for an object fully describes that object’s interface, which is determined by its class. Legion class interfaces can be described in an interface description language (IDL), several of which will be supported by Legion.

Legion implements a three-level naming system. At the highest level, users refer to objects using human- readable strings, called context names. Context objects map context names to LOIDs (Legion object identifiers), which are location-independent identifiers that include an RSA public key. Since they are location independent, LOIDs by themselves are insufficient for communication; therefore, a LOID is mapped to an LOA (Legion object address) for communication. An LOA is a physical address (or set of addresses in the case of a replicated object) that contains information to allow other objects to communicate with the object (e.g., an <IP address, port number> pair).

Legion will contain too many objects to simultaneously represent all of them as active processes. Therefore, Legion requires a strategy for maintaining and managing the representations of these objects on persistent storage. A Legion object can be in one of two different states, active or inert. An inert object is represented by an OPR (object persistent representation), which is a set of associated bytes that exists in stable storage somewhere in the Legion system. The OPR contains state information that enables the object to move to an active state. An active object runs as a process that is ready to accept member function invocations; an active object’s state is typically maintained in the address space of the process (although this is not strictly necessary).

Core objects

Several core object types implement the basic system-level mechanisms required by all Legion objects. Like classes and metaclasses, core objects are replaceable system components; users (and in some cases resource controllers) can select or implement appropriate core objects.

Summary

Legion specifies functionality and interfaces, not implementations. Legion 1.4 provides useful default implementations of class objects and of all the core system objects, but users are never required to use our implementations. In particular, users can select (or build their own) class objects, which are empowered by the object model to select or implement system-level services. This feature of the system enables object services (e.g. creation, scheduling, security) to be appropriate for the object types on which they operate, and eliminates Legion’s dependence on a single implementation for its success.

This work is partially supported by DARPA (Navy) contract #N66001 96- C-8527, DOE grant DE-FD02-96ER25290, DOE contract Sandia LD- 9391, Northrup-Grumman (for the DoD HPCMOP/PET program), DOE D459000-16-3C and DARPA (GA) SC H607305A.

Ninja

http://ninja.cs.berkeley.edu

M. Welsh, mdw@cs.berkeley.edu

The Ninja project aims to develop a software infrastructure to support the next generation of Internet-based applications. Central to the Ninja approach is the concept of a service, an Internet-accessible application (or set of applications) which is scalable (able to support many thousands of concurrent users), fault-tolerant (able to mask faults in the underlying server hardware), and highly-available (resilient to network and hardware outages). Examples of current and future Ninja services include an Internet stock-trading system; a ``universal inbox'' used to access e-mail, voice mail, pages, and other personal correspondence; and the Ninja Jukebox, which provides real-time streaming audio data served from a collection of music CDs scattered about the network.

In some sense, the current World Wide Web is a service itself; Ninja intends to build upon and expand the notion of Web-based services by providing composability (the ability to automatically aggregate multiple services together into a single entity), customizability (the ability for users to inject code into the system to customize a service's behavior), and accessibility (the ability to access the service from a wide range of devices, including PCs, workstations, cellphones, and Personal Digital Assistants). The end goal of the Ninja project is to enable the development of a menagerie of Internet-based services which are interoperable and immediately accessible across the spectrum of user devices ranging from PCs and workstations to cellphones and Personal Digital Assistants. For example, one should be able to check one's e-mail simply by calling a special number from a cellphone, or equivalently by sitting down at any Internet-connected PC
in the world.

Ninja was initiated at the UC Berkeley Computer Science Division in Spring of 1997, and is currently in the process of developing and prototyping the architecture. Several software packages and demonstrations have been released, as well as several publications and presentations.

For more information about the project, please see the Ninja Project Website.

NCSA Workbench Project

Mary Pietrowicz, Duane Searsmith, Ed Grossman, Larry Jackson

maryp@ncsa.uiuc.edu, d-sears1@uiuc.edu, egrossma@ncsa.uiuc.edu, jackson@ncsa.uiuc.edu

The NCSA Workbench project proposes to provide a system that is capable of specifying, finding, and running useful software components and services from the desktop. Our proposed solution will utilize Java-based object encapsulation technology developed by SUN (JavaBeans and Jini) to provide a platform-independent, consistent Java interface to a time-varying set of resources and services on the Grid.

The Jini technology works as follows:

  1. A Jini service registers with a Jini Lookup Service and provides a proxy.
  2. When a client requests a service, the Lookup Service returns the proxy.
  3. The client uses the proxy to communicate with the service.

Jini is a componentized system. Its components function outside the scope of Jini, but together, the parts form a platform for distributed computing. Its main components include Jini Lookup & Discovery, Jini Schema, Distributed Events, Distributed Leasing, Distributed Transactions, and JavaSpaces.

NCSA Workbenches currently are implemented with Web Servers and CGI scripts. While CGI technology has been very helpful in making scientific computation codes available via the web, CGI is not a dynamic, object-oriented approach to the problem. Each individual workbench is static, and the scripts are hand-maintained. The CGI approach does not support networked services and does not allow a convenient mechanism of sharing resources and services among different workbenches. Jini can help provide dynamic access to resources over the network.

We plan to build an interface to Globus (front-end extensions to the Globus system), workbench framework that is needed to support the needed services, and a set of generic services useful in the development of workbenches. Then, we plan to use the resulting framework and services in one or more testbeds.

The purpose of the Jini-Globus interface is to provide platform-independent access to Globus services and resources from the desktop. A bridge between the Jini Lookup Service and the Globus resource management subsystem will allow Globus services to register with Jini. Then, a desktop client can request a service from Jini, and the proxy returned will interact with Globus to provide the service. Globus will require front-end APIs so that Jini proxies (written in Java) can communicate with Globus services easily. A Globus interface toolkit will provide a generic class hierarchy for building interfaces to Globus services. We will explore using this toolkit to build specific interfaces to other existing Globus services such as Security, Information, Health and Status, etc. The Workbench framework would provide abstractions of different kinds of workbench services and service components. The framework would include components for interfacing with non-Java code and would contain generic services useful in the development of workbenches. Some of the services we plan to explore include visualization, notification, sequencing/planning, data format translation, collaboration, and persistence.

This project is in the early stages and its continuation is contingent upon the availability of sufficient funding.

Netsolve

http://www.cs.utk.edu/netsolve/

NetSolve is a client-server application that enables users to solve complex scientific problem remotely. The system allows users to access both hardware and software computational resources distributed across a network. NetSolve searches for computational resources on a network, chooses the best one available, and using retry for fault-tolerance solves a problem,
and returns the answers to the user. A load-balancing policy is used by the NetSolve system to ensure good performance by enabling the system to use the computational resources available as efficiently as possible.

Some goals of the NetSolve project are:

ease-of-use for the user
efficient use of the resources, and
the ability to integrate any arbitrary software component as a resource into the NetSolve
system.

Interfaces in Fortran, C, Matlab, and Java have been designed and implemented which enable users to access and use NetSolve more easily. An agent based design has been implemented to ensure efficient use of system resources.

One of the key characteristics of any software system is versatility. In order to ensure the success of NetSolve, the system has been designed to incorporate any piece of software with relative ease. There are no restrictions on the type of software that can be integrated into the system.

Northrhine-Westphalian Metacomputer Taskforce

Joern Gehring, joern@uni-paderborn.de

http://www.uni-paderborn.de/pc2/nrw-mc/ (currently mostly in german)

Abstract

Combining high-speed networks and high-performance supercomputers offers great potential for new solutions both in research and development. The concept of distributed supercomputing (metacomputing) can be employed to combine forces in order to improve competitiveness of industry and research. However, there is still much to be done and many new concepts have to be invented for a working metacomputer. Therefore, in 1995 eight research

institutions in Northrhine-Westphalia have founded the metacomputer task force in order to develop a county-wide metacomputer. Since 1996 the ministry of science and research supports the group to achieve its goal of providing the power of connected supercomputers to a broad variety of researches and developers.

One Page Description:

Multimedia, telecommunication, and metacomputing are ey-technologies of the future. Their availability will be critical for the competitiveness of local research and industry. These technologies represent a general trend of synergy between communication, information technology (hard- and software), entertainment, and mass media. The concept of Metacomputing continues the idea of making availability of resources independent from the consumers location. It offers the chance to use the available hardware more economically and to extend accessibility to a wider range of users. The Northrhine-Westphalian Metacomputing Task force was founded in 1995 to support ongoing work and to find new solutions in this promising field. Up to know, the consortium consists of the universities of Aachen, Bochum, Dortmund, Cologne, and Paderborn plus the research institutes GMD (St. Augustin), DLR (Cologne), and FZJ (Julich). The group focuses on cooperative, multi-site, and multi-location research on the field of metacomputing within the county of Northrhine-Westphalia (NRW). We expect major advances in research through the combined efforts of researches from engineering, natural science, mathematics, and computer science.

Among the main tasks of the initiative are:

We explicitly emphasize on the idea of coordinated simultaneous use of the available computer hardware. This means, that the NRW-metacomputer not only provides a

Homogeneous user interface for all available resources, but also supports scheduling and execution of multi-site applications. In 1996 five projects have been started to work on this goal. Each project is a cooperation of at least two partners and regular meetings were held to ensure smooth fitting of the developed parts.

The core project aims at developing an infrastructure for high-performance computer management (HPCM). This is the skeleton of the metacomputer that links its subcomponents together. The fundamental idea of HPCM architecture is to define a kind of a three-tier model, consisting of: a graphical user interface, one or more management daemons, and coupling modules on the compute servers

Access to the metacomputer is established by a lightweight client based on a WWW browser. The Java applet loaded while accessing the metacomputer's HTTP address is generating a context specific graphical interface by using user related informations, e.g. his authorization, or the state of his transactions.

The central component of the HPCM architecture is the management daemon that creates the view of a virtual computer. This demon executes on the HPCM server machines with the task to receive the user's requests from the Java applet, to interpret their semantics, and to submit them to the related computing instances. To implement this functionality, the demon needs a global view of the metacomputer's components, i.e. joined users and compute servers. Consequently the MD is the administrative instance of the metacomputer which is responsible to maintain the

metacomputer related data. Due to the fact that the management demon is accessing resources on behalf of the user, there is a need for a security policy. The authentication scheme must also support a single sign-on environment, to encapsulate the computing instances from the user.

A coupling module interfaces with each compute server. The operating system of each participating platform needs to be adapted to the specified communication protocol and its semantics. This is done as a daemon process on the compute server. It translates the abstract management daemon requests to the underlying management software, e.g. NQS, NQE, DQS, CCS, LSF or CODINE. The advantage of this architecture is the high degree of system independence.

The infrastructure developed in HPCM is embedded in a DCE/DFS environment which has been defined and setup by another project. This environment tackles the problems arising with increasing security demands and distributed management of a large number of metacomputer-users and -projects. Additionally it ensures that input and output data of metacomputer jobs can move freely between the participating sites without having the users to take extra care of the current location of their data. The architecture of this system was designed according to the results of a preceding feasibility study that was performed as a stand-alone project.

Efficient use of the available resources is ensured by a dedicated scheduling module that has been developed in the fourth project. This scheduler cooperates with the HPCM

infrastructure for determining best resources and execution time for every job that is submitted to the metacomputer. A dedicated resource description language and a special query syntax have been defined in order to allow complex resource requests to be optimally fulfilled by the metacomputer.

The progress and quality of the developed software is continuously verified by a couple of end-user applications that form the fifth project. These are real world applications with

Different communication demands. Therefore we can assure that the metacomputer is no academical experiment but meets the needs of a wide variety of users.

References:

A. Reinefeld and J. Gehring and M. Brune, "Heterogeneous Message Passing and a Link to Ressource Management", Journal of Supercomputing Special Issue, Vol 11, 1997

V. Sander, "A Metacomputer Architecture Based on Cooperative Ressource Management", HPCN 97, Vienna, Austria V. Sander, "High-Performance Computer Management", HPCN '98, Amsterdam, The Netherlands

D. Paschek and A. Geiger, "Molecular Dynamics Simulations of Liquid Crystals: Examples, Perspectives, Limitations", Symposium on Freestanding Smetic Films 97, Paderborn, Germany
 
 

PAWS 

http://acts.nersc.gov/paws/main.html

PAWS (Parallel Application WorkSpace) provides a framework for coupling parallel applications. PAWS is still under active development, but initial versions are in use by the developers and by projects with ties to the development team. Central to the design of PAWS is the coupling of parallel applications using parallel communication channels. The coupled applications can be running on different machines, and the data structures in each coupled component can have different parallel distributions. PAWS is able to carry out the communication without having to resort to global gather/scatter operations. Instead, point-to-point transfers are performed in which each node sends segments of data to remote nodes directly and in parallel. Furthermore, PAWS uses Nexus to permit communication across heterogeneous architectures. On platforms that support multiple communication protocols, it will attempt to select the fastest available node-to-node protocol. The PAWS framework provides both an interface library for use in component applications and a PAWS "controller" that coordinates the interaction of components. The PAWS controller organizes and registers each of the applications participating in the framework. Through the controller, component applications ("tasks") register the data structures that should be shared with other components. Tasks are created and connectiobns are established between registered data structures via the script interface of the controller. The controller provides for dynamic data connection and disconnection, so that applications can be launched independently of both one another and the controller. PAWS supports C, C++, and Fortran interfaces for applications connecting to the PAWS framework.

PAWS provides basic data classes for data transfer, including arrays of arbitrary dimension which can be distributed in a broad range of patterns. It also allows for the transfer of user defined data types through the provision of function callbacks for data encoding and decoding on each end.

Because of the generality of the PAWS parallel data transfer interface, it is possible to connect existing parallel applications, parallel data post-processors, and parallel visualization software with minimal coding. The PAWS software package consists of the following:

  1. A PAWS Controller (binary executable).
  2. A set of PAWS data transfer types.
  3. A PAWS data transfer library based on Nexus.
  4. A PAWS Application interface library for communicating with the controller

and other applications.

PAWS components may be scripted together with Tcl to allow user control of multiple applications, including their initialization, inter-connection, and termination.

Who Uses PAWS?

PAWS has not yet been publically released, and has only been used by projects working dirrectly with the developers. Within that scope, however, it has already found use in real applications.

 

 

PARDIS

http://www.cs.indiana.edu/hyplan/kksiazek/pardis.html

PARDIS is an environment providing support for building PARallel DIStributed applications. It employs the key idea of the Common Object Request Broker Architecture (CORBA) --- interoperability through meta-language interfaces --- to implement application-level interaction of heterogeneous parallel components in a distributed environment. Addressing interoperability at this level allows the programmer to build metapplications from independently developed and tested components. This approach allows for a high level of component reusability and does not require the components to be reimplemented. Further, it allows PARDIS to take advantage of application-level information, such as distribution of data structures in a data-parallel program.

PARDIS builds on CORBA in that it allows the programmer to construct metapplications without concern for component location, heterogeneity of component resources, or data translation and marshaling in communication between them. However, PARDIS extends the CORBA object model by introducing SPMD objects representing data-parallel computations; these objects are implemented as a collaboration of computing threads capable of directly interacting with PARDIS Object Request Broker (ORB) --- the entity responsible for brokering requests between clients and servers. This capability ensures request delivery to all the computing threads of a parallel application and allows the ORB to transfer distributed arguments directly (if possible in parallel) between the client and the server. Single objects, always associated with only one computing thread, are also supported and can be collocated with SPMD objects. In addition, PARDIS contains programming support for concurrency by allowing non-blocking invocation returning distributed or non-distributed futures, and allowing asynchronous processing on the server's side.

Publications:

  1. Katarzyna Keahey and Dennis Gannon, Developing and Evaluating Abstractions for Distributed Supercomputing, Cluster Computing, May 1998, vol. 1, no 1.
  2. Katarzyna Keahey and Dennis Gannon, PARDIS: CORBA-based Architecture for Application-Level PARallel DIStributed Computation, Proceedings of Supercomputing '97, November 1997.
  3. Katarzyna Keahey and Dennis Gannon, PARDIS: A Parallel Approach to CORBA, Proceedings of the 6th IEEE International Symposium on High Performance Distributed Computing (best paper award), August 1997.

 

POEMS

http://www.cs.utexas.edu/users/poems/poems.html

Jim Browne - University of Texas at Austin
Vikram Adve - Rice University
Rajive Bagrodia - University of California at Los Angeles
Elias Houstis - Purdue University
Olaf Lubeck - Los Alamos National Laboratory
Pat Teller - University of Texas at El Paso
Mary Vernon - University of Wisconsin-Madison

The POEMS project will create and demonstrate a capability for prediction of the end-to-end performance of parallel/distributed implementations of large scale adaptive applications. POEMS modeling capability will span applications, operating systems including parallel I/O, and architecture. Effort will focus on the areas where there is little convention wisdom such execution behaviors of adaptive algorithms on multi-level memory hierarchies and parallel I/O operations.

POEMS will provide:

A language for composing models from component models. Derivation of models of applications as data flow graphs from HPF programs,

POEMS development will be driven by modeling a full-scale LANL ASCI application code executing on an ASCI architecture.

This version of POEMS focuses on high performance computational applications and architectures. But POEMS technology can be applied to other large complex dynamic computer/communication systems as as GloMo and Quorum.

Sweb -- The Virtual Workshop Companion: A Web Interface for Online Labs

http://www.tc.cornell.edu/~susan/webnet98/

Susan Mehringer, Cornell Theory Center, Cornell University, Ithaca, NY USA, susan@tc.cornell.edu
David Lifka, Cornell Theory Center, Cornell University Ithaca, NY USA, lifka@tc.cornell.edu

Abstract: The Virtual Workshop is a Web-based set of modules on high performance computing.
One interesting new technique we have been refining over the past year is to securely issue lab
exercise commands directly through the web page. This paper describes the interface itself, how
this technique is incorporated into online lab exercises, participant evaluation, and future plans.

UCLA Java Effort For Collaborative Computing

UCLA Department of Mathematics

CAM (Computational and Applied Mathematics)

Chris Anderson (http://www.math.ucla.edu/~anderson)

Purpose of the effort: Develop infrastructure to support collaborative research computing

Goals:

Approach:

We are using Java to create an infrastructure for distributed computing as well as to construct tools that assist in creating wrappers for C++ and Fortran based components.

Results:

Reports/documents associated with our use of Java for scientific/technical computing are available electronically at http://www.math.ucla.edu/~anderson/JAVAclass/JavaSTC.html

Software is available at http://www.math.ucla.edu/~anderson/JAVAclass/CAMJava.html

Teraweb

http://www.arc.umn.edu/structure/

No information as of yet. Please follow the WWW link or contact  Joel Neisen <neisen@networkcs.com>

UNICORE: Uniform Access to Computing Resources

http://www.fz-juelich.de/unicore

e-mail: D.Erwin@fz-juelich.de

UNICORE functions: UNICORE is designed to provide seamless and secure access to distributed computing resources using the World Wide Web. It offers the following

The UNICORE architecture: UNICORE has a three tier architecture.

A web browser on the user’s Unix or Windows desktop supporting Java Applets and X.509 certificates constitutes tier one. A signed applet, the Job Preparation Agent (JPA), creates an Abstract Job Object (AJO).

Each UNICORE site operates a gateway running an https server and a security servlet where the user’s certificate is authenticated and mapped to the local Unix userid. Sites may specify an additional Security Object to meet more stringent security requirements. The AJO is interpreted by a Network Job Supervisor (NJS) which incarnates the job for the local Resource Management System (RMS) or forwards sub-jobs to other UNICORE sites. NJS synchronizes the execution of interdependent jobs at multiple sites and manages data transfers between sites.

Tier three, the Resource Management Systems at each execution server and the local user administration remains unchanged.

The UNICORE job model: The Abstract Job Object (AJO) is the basis for the uniform specification of a job as a collection of interdependent operations to be carried out on different platforms at collaborating sites. The object oriented structure of the AJO provides a specification which is independent of hardware architecture, system interfaces, and site specific policies. The object oriented design makes the AJO fully extensible to incorporate additional functions over time. It also allows for multiple implementations of UNICORE.

The UNICORE project and partners: UNICORE is funded as a two year project by the German Ministry for Education and Research (BMBF). The target is to develop a prototype by July 1999. The project is structured in three overlapping phases: Phase one defines the architecture and implements job creation and submission to one UNICORE site and one execution platform. Phase two implements job control and data staging and adds support for additional execution platforms. Phase three includes distribution of parts of a jobs to different sites, the synchronization of the interdependent jobs and the automatic transfer of data between sites. Phase one has been successfully completed and demonstrated.
 

The implementation of UNICORE is primarily done by two German software companies Genias GmbH and Pallas GmbH. German HPC centers and leading German Universities together with the German weather bureau (DWD), ECMWF and fecit in the UK, are responsible for design, test, and operation of UNICORE. Major HPC vendors, Fujitsu, IBM NEC, SNI, SGI support UNICORE and provide know-how and manpower. Hitachi and Sun are joining the UNICORE consortium.
 
 

WebSubmit: A Web Interface to Remote High-Performance Computing Resources

Ryan McCormack, John Koontz, and Judith Devaney

Information Technology Laboratory

National Institute of Standards and Technology

{john.koontz, judith.devaney, ryan.mccormack}@nist.gov

http://www.itl.nist.gov/div895/sasg/

http://www.itl.nist.gov/div895/sasg/websubmit/websubmit.html

WebSubmit is a Web browser-based interface to heterogeneous collections of remote high-performance computing resources. It makes these resources easier to use by replacing a constantly changing range of unfamiliar, command-driven queuing systems and application environments, with a single, seamless graphical user interface based on HTML forms. To do this WebSubmit provides what is essentially a Web-based implementation of telnet: direct access over the Web to the individual users' accounts on

the systems in question.

To secure these accesses against unauthorized use we use a strong form of authentication based on the Secure Sockets Layer (SSL) protocol and Digital Certificates. This lets certified users registered with a WebSubmit server connect to the server over SSL encrypted lines, and, when authenticated by the server, interact with a set of application modules. Each of these application module is implemented as an HTML form, generated dynamically with a CGI script. The application module form is filled out and submitted to the WebSubmit server, where it is processed by a second CGI script. This second script processes the request and executes the desired tasks in the user's account on the specified remote system. For this it uses the Secure Shell protocol, which provides an encrypted line from the server to the user's account on the remote system. The processing script also generates a Web page reporting the results of the immediate interaction. However, if the task was to submit a job, the user must use additional modules to follow the progress of the job and retrieve its output.

The combined system is secure, flexible, and extensible. It does not require the Web server to be located on the high-performance computing resources. Its modularity facilitates developing and maintaining interfaces. And, since the application modules for similar tasks on different computing resources use the same interface (as much as possible), they present a uniform aspect that facilitate usage.

Two different classes of application modules are currently available under WebSubmit:

The system-independent modules available are a UNIX command facility, a simple file editor, and an interface for file transfer. System-dependent application modules are currently available for three major batch queuing systems (LoadLeveler, LSF, and NQS). The more generic modules allow users to submit jobs and monitor them. There are also some specialized submission modules for some of these systems. These support particular kinds of submitted jobs, e.g., MPI jobs or runs with the commercial Gaussian application.

To aid users in repeating tasks and deriving new ones, we provide a session library facility that allows users to save their module parameters in named sets. There is also an on-line help facility to provide explanations of elements in each module's interface.

Although our application is access to high-performance computing resources, plainly WebSubmit can be adapted to other forms of Web-based remote execution of user-owned tasks.

WebSubmit is written in the open source scripting language Tcl, by Scriptics. Our own Tcl code consists of the set of CGI scripts plus a collection of supporting libraries. We also use the cgi.tcl library, by NIST's Don Libes. We currently support execution from Unix-based browsers to Unix computing resources.
 
 

WebFlow: Web Based Metacomputing

Tomasz Haupt, Erol Akarsu, Geoffrey Fox

Northeast Parallel Architectures Center at Syracuse University

Programming tools that are simultaneously sustainable, highly functional, robust and easy to use have been hard to come by in the HPCC arena. Following our High Performance commodity computing (HPcc) strategy that builds HPCC programming tools on top of the new software infrastructure being developed for the commercial web and distributed object areas. This leverage of a huge industry investment naturally delivers tools with the desired properties with the one (albeit critical) exception that high performance is not guaranteed. Our approach automatically gives the user access to the full range of commercial capabilities (e.g. databases and compute servers), pervasive access from all platforms and natural incremental enhancement as the industry software juggernaut continues to deliver software systems of rapidly increasing power. We add high performance to commodity systems using multi-tier architecture with GLOBUS metacomputing toolkit as the backend of a middle tier of commodity web and object servers.

More specifically, we developed WebFlow - a platform independent, three-tier system. The visual authoring tools implemented in the front end integrated with the middle tier network of servers based on the industry standards and following distributed object paradigm, facilitate seamless integration of commodity software components. The high performance backend tier is implemented using GLOBUS toolkit: we use MDS (metacomputing directory services) to identify resources, GRAM (globus resource allocation manager) to allocate resources including mutual, SSL based authentication, and GASS (global access to secondary storage) for a high performance data transfer. Conversely, the WebFlow can be regarded as a high level, visual user interface and job broker for GLOBUS.

The visual HPDC framework delivered by this project offers an intuitive Web browser based interface and a uniform point of interactive control for a variety of computational modules and applications, running at various places on different platforms and networks. New applications can be composed dynamically from reusable components just by clicking on visual module icons, dragging them into the active WebFlow editor area, and linking by drawing the required connection lines. The modules are executed using GLOBUS optimized components combined with the pervasive commodity services where native high performance versions are not available. For instance today one links GLOBUS controlled programs to WebFlow (Java connected) Windows NT and database executables. This not only makes construction of a meta-application much easier task for an end user, but also allows combining this state of art HPCC environment with commercial software, including packages available only on Intel-based personal computers.

Our prototype WebFlow system is given by a mesh of Java enhanced Web Servers [Apache], running servlets that manage and coordinate distributed computation. This management is currently implemented in terms of the three servlets: Session Manager, Module Manager, and Connection Manager. These servlets are URL addressable and can offer dynamic information about their services and current state. Each of them can also communicate with each other through sockets. Servlets are persistent and application independent. Although our prototype implementation of the WebFlow proved to be very successful, we are not satisfied with this to large extend custom solution. Pursuing HPcc goals, we would prefer to base our implementation on the emerging standards for distributed objects, and takes the full advantage of the possible leverages realized by employing commercial technologies. Consequently, we plan to adopt CORBA as the base distributed object model.

This short note describes only the major design features of the WebFlow. For more detailed information, please refer to the WebFlow home page at

http://www.npac.syr.edu/users/haupt/WebFlow/demo.html