White paper available in PostScript.
The goal of the WARMstones project is to allow realistic comparisons between scheduling algorithms for wide-area distributed systems (also known as metasystems or grid systems). To date, researchers working on scheduling algorithms have used simple, often unrealistic, models of distributed systems and have largely limited their validation to theoretical ("big-O") analysis.
WARMstones will build on technology developed in earlier research projects, including the Legion and MESSIAHS systems. Our approach will be simulation based, with a mapping generator feeding a core simulation engine.
The mapping generator will take as input a set of program representations (a benchmark suite, though the set may contain just a single application to be benchmarked), a set of system descriptions (which, again, may contain just a single system), and a toolkit-based description of the scheduling algorithm. The mapping generator will then execute the scheduling algorithm on each (program, system) combination from the input data, producing one mapping for each such pair.
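A minimal sketch of this step, with hypothetical stand-ins for the inputs: the `Program` and `System` classes, and the trivial `round_robin` scheduler, are illustrative only — real schedulers would be assembled from the toolkit described above.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class Program:       # hypothetical program representation
    name: str
    tasks: int       # number of schedulable tasks

@dataclass
class System:        # hypothetical system description
    name: str
    processors: int

def round_robin(program, system):
    """Trivial stand-in scheduler: assign task i to processor i mod P."""
    return {t: t % system.processors for t in range(program.tasks)}

def generate_mappings(programs, systems, scheduler):
    """Run the scheduler on every (program, system) pair,
    producing one mapping per pair."""
    return {(p.name, s.name): scheduler(p, s)
            for p, s in product(programs, systems)}

mappings = generate_mappings(
    [Program("fft", 4)], [System("cluster", 2)], round_robin)
# mappings[("fft", "cluster")] == {0: 0, 1: 1, 2: 0, 3: 1}
```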
These mappings will be passed to the core simulator, which will simulate the execution of each application on its target system. This is an embarrassingly parallel step, and at a minimum each instantiation of the simulator can be run on a separate processor. We will investigate functional decomposition of the simulator by incorporating existing simulation packages. This approach should yield higher fidelity, increased concurrency, and the ability to build on earlier work.
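Because each mapping can be simulated independently, the simulations can be dispatched concurrently. The sketch below uses threads for simplicity; in practice each simulator instance would run on a separate processor or machine. The `simulate` cost model (one time unit per task, makespan = heaviest processor load) is purely illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(item):
    """Stand-in for the core simulator: estimate a makespan for one
    (program, system) mapping under a toy cost model."""
    key, mapping = item
    loads = {}
    for task, proc in mapping.items():
        loads[proc] = loads.get(proc, 0) + 1
    return key, max(loads.values())

# One independent simulation per (program, system) mapping.
mappings = {("fft", "cluster"): {0: 0, 1: 1, 2: 0, 3: 1},
            ("fft", "grid"):    {0: 0, 1: 0, 2: 0, 3: 0}}
with ThreadPoolExecutor() as pool:
    results = dict(pool.map(simulate, mappings.items()))
# results[("fft", "cluster")] == 2; results[("fft", "grid")] == 4
```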
The Interface-Based Intrusion Detection (IBID) project examined purely local intrusion detection and network filtering. We built a rule-based system based on the Berkeley Packet Filter (BPF) language, and built a virtual network interface card to test the filtering on dual-processor machines, with one processor dedicated to network I/O and intrusion detection (i.e., the functionality that would go into the NIC).
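To illustrate the rule-based approach, here is a minimal sketch of predicate-style filtering in the spirit of BPF. The packet field names (`proto`, `dst_port`, `flags`) and the two rules are illustrative assumptions, not the actual BPF instruction set or the IBID rule language.

```python
# Each rule is a predicate over decoded packet header fields;
# a packet is dropped if any rule matches.
RULES = [
    # block inbound telnet
    lambda pkt: pkt["proto"] == "tcp" and pkt["dst_port"] == 23,
    # flag an illegal SYN+FIN combination (port-scan signature)
    lambda pkt: pkt["proto"] == "tcp"
                and "SYN" in pkt["flags"] and "FIN" in pkt["flags"],
]

def filter_packet(pkt):
    """Return True if the packet passes all rules (i.e., is delivered)."""
    return not any(rule(pkt) for rule in RULES)

filter_packet({"proto": "tcp", "dst_port": 23, "flags": {"SYN"}})   # blocked
filter_packet({"proto": "udp", "dst_port": 53, "flags": set()})     # delivered
```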
The HOSS project is building operating systems software for clusters and heterogeneous distributed systems. We will be implementing the software on two clusters, the Centurion cluster at the University of Virginia, and the Orange Grove cluster at Syracuse University.
This work ties into past projects on distributed resource management. The current state-of-the-art for grid scheduling allocates processing resources, but does not adequately schedule communications resources. It is common to simply assume that the required resources are available (the Network Weather Service is a project that monitors available resources, but does not provide allocation mechanisms).
Therefore, we need to provide mechanisms that allocate network resources while taking into account quality-of-service (QoS) considerations such as bandwidth, latency, error rate, and jitter. Routers from companies such as Lucent, Avici, Juniper, and Cisco can make allocations of network resources, so we need to build software that takes advantage of these capabilities and makes them available to higher software layers, such as metacomputing systems (e.g., Legion and Globus).
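The core of such allocation software is admission control: a request for bandwidth is granted only if the link can still honor it. The sketch below shows this idea under stated assumptions; the `Link` class and the capacity figures are illustrative, not an actual router API.

```python
class Link:
    """Toy per-link bandwidth reservation table for QoS admission control."""

    def __init__(self, capacity_mbps):
        self.capacity = capacity_mbps
        self.reserved = 0.0

    def admit(self, bandwidth_mbps):
        """Reserve bandwidth for a flow if the link can honor the request."""
        if self.reserved + bandwidth_mbps <= self.capacity:
            self.reserved += bandwidth_mbps
            return True
        return False

link = Link(100)
link.admit(60)   # first flow admitted
link.admit(50)   # rejected: would exceed capacity
link.admit(40)   # fits in the remaining 40 Mbps
```

Higher layers (e.g., a metacomputing scheduler) would consult such reservations along an entire path before committing to a mapping, rather than assuming the resources are available.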