Dynamic Load Balancing in Heterogeneous Network of Multiprocessing and Multicomputer Systems
Objective:
The objective of this research is to develop new scheduling techniques for high-performance computing applications on a network of heterogeneous systems composed of workstations and parallel machines. The performance gains and speed-ups achieved by these techniques will be measured using KIVA-3, a transient, multidimensional, arbitrary-mesh, finite-difference CFD program used for internal combustion engine simulations.
Description:
Because the cost/performance ratio of workstations has dropped dramatically while their capabilities have grown, the idea of clustering arose: pooling their computational power to build low-cost parallel machines. A cluster, as defined by Pfister [1], is "a parallel or distributed system that consists of a collection of interconnected whole computers, that is utilized as a single unified computing resource".
The key to good clustering is a suite of scheduling algorithms that both minimizes the execution time of a given job and maximizes the utilization of system resources by load balancing the cluster. This suite accepts the jobs submitted by cluster users and allocates the necessary resources to them. To achieve its goals, the scheduler must keep track of all cluster resources, their capabilities, and their utilization. It must also support both parallel and non-parallel jobs, and it must work on a heterogeneous cluster composed of both small workstations and large parallel machines.
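The bookkeeping described above can be illustrated with a minimal sketch. It is not the scheduler developed in this project: the `Node` and `Scheduler` names, the `speed` field, and the placement rule (pick the node that would be least utilized after placement) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One cluster resource (hypothetical fields, for illustration only)."""
    name: str
    processors: int      # capability: number of processors on this machine
    speed: float         # capability: relative per-processor speed
    busy: int = 0        # utilization: processors currently allocated

    def free(self) -> int:
        return self.processors - self.busy


@dataclass
class Scheduler:
    nodes: List[Node] = field(default_factory=list)

    def submit(self, procs_needed: int) -> Optional[Node]:
        """Place a job that needs `procs_needed` processors on one node.

        A non-parallel job asks for 1 processor, a parallel job for more.
        Among nodes with enough free processors, choose the one that would
        be least utilized after placement; ties go to the faster node.
        Returns None when no node can currently hold the job.
        """
        candidates = [n for n in self.nodes if n.free() >= procs_needed]
        if not candidates:
            return None
        best = min(
            candidates,
            key=lambda n: ((n.busy + procs_needed) / n.processors, -n.speed),
        )
        best.busy += procs_needed
        return best

    def utilization(self) -> Dict[str, float]:
        """Fraction of processors in use on each node."""
        return {n.name: n.busy / n.processors for n in self.nodes}


# Hypothetical heterogeneous cluster: two workstations and one parallel machine.
cluster = Scheduler([
    Node("workstation-a", processors=1, speed=1.0),
    Node("workstation-b", processors=2, speed=1.5),
    Node("parallel-machine", processors=16, speed=2.0),
])
serial_job = cluster.submit(1)      # non-parallel job
parallel_job = cluster.submit(8)    # parallel job
too_large = cluster.submit(16)      # cannot be placed right now
```

A real scheduler would also queue jobs that cannot be placed and release processors when jobs finish; both are omitted here to keep the sketch short.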
Network Configuration:
Progress:
We chose to start by developing a technique for parallel loop scheduling on a shared-memory parallel machine and comparing it with existing techniques [2] [3]. The fluid dynamics simulation software KIVA-3 will be used for testing, performance evaluation, and comparison between the different scheduling techniques.
The following steps have been completed:
- KIVA has been compiled and runs in parallel on the shared-memory machine.
- Previously developed techniques have been investigated.
- An algorithm for dynamic scheduling of parallel loops on the Origin 2000 has been developed and is being implemented.
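For orientation, the sketch below shows how the chunk sizes evolve under two existing loop scheduling schemes: factoring, the subject of [2], and the scheme of Tzen and Ni [3], commonly known as trapezoid self-scheduling. The function names, the factor of 2 used in factoring, the first chunk of N/(2P) with a last chunk of 1 in the trapezoid scheme, and the rounding choices are assumptions of this sketch rather than details given on this page.

```python
import math
from typing import List


def factoring_chunks(n_iters: int, n_procs: int) -> List[int]:
    """Chunk sizes under factoring, assuming a factor of 2.

    Each batch hands out n_procs equal chunks that together cover about
    half of the iterations still remaining.
    """
    chunks: List[int] = []
    remaining = n_iters
    while remaining > 0:
        size = max(1, math.ceil(remaining / (2 * n_procs)))
        for _ in range(n_procs):
            if remaining == 0:
                break
            take = min(size, remaining)
            chunks.append(take)
            remaining -= take
    return chunks


def trapezoid_chunks(n_iters: int, n_procs: int, last: int = 1) -> List[int]:
    """Chunk sizes under trapezoid self-scheduling.

    Sizes decrease linearly from a first chunk of about N/(2P) toward
    `last`; the final chunk is clamped to the iterations left over.
    """
    first = max(last, math.ceil(n_iters / (2 * n_procs)))
    n_chunks = math.ceil(2 * n_iters / (first + last))
    step = (first - last) // (n_chunks - 1) if n_chunks > 1 else 0
    chunks: List[int] = []
    remaining = n_iters
    size = first
    while remaining > 0:
        take = min(max(size, last), remaining)
        chunks.append(take)
        remaining -= take
        size -= step
    return chunks


factoring_example = factoring_chunks(100, 4)
trapezoid_example = trapezoid_chunks(1000, 4)
```

Both schemes start with large chunks to keep scheduling overhead low and end with small ones so that processors finish at nearly the same time; they differ in whether the sizes fall in batches or linearly.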
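The algorithm under development is not described on this page, so the following is only a generic sketch of what dynamic scheduling of a parallel loop means at run time: workers repeatedly claim the next chunk of iterations from a shared counter. The chunk rule used here (remaining iterations divided by the number of workers, as in guided self-scheduling) and the name `run_self_scheduled` are assumptions for illustration.

```python
import math
import threading
from typing import Callable, List


def run_self_scheduled(
    n_iters: int, n_workers: int, body: Callable[[int], None]
) -> List[int]:
    """Run body(i) for every i in range(n_iters) with dynamic chunking.

    Returns how many iterations each worker ended up executing.
    """
    lock = threading.Lock()
    next_index = 0                 # first iteration not yet claimed
    done = [0] * n_workers

    def worker(wid: int) -> None:
        nonlocal next_index
        while True:
            # Claim a chunk under the lock so that no two workers overlap.
            with lock:
                remaining = n_iters - next_index
                if remaining == 0:
                    return
                size = math.ceil(remaining / n_workers)
                start = next_index
                next_index += size
            # Execute the claimed iterations outside the lock.
            for i in range(start, start + size):
                body(i)
            done[wid] += size

    threads = [
        threading.Thread(target=worker, args=(w,)) for w in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done


squares = [0] * 50


def store_square(i: int) -> None:
    squares[i] = i * i


counts = run_self_scheduled(50, 4, store_square)
```

In CPython, threads illustrate the claiming logic rather than a real speed-up for CPU-bound loops; on a shared-memory machine the same pattern would typically be expressed with native threads or compiler directives.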
Glossary:
Multiprocessing System | A system that has more than one processor and is able to distribute tasks to run in parallel among them. |
Multicomputer System | A system composed of more than one computer, each with one or more processors, that is able to distribute tasks to run in parallel among them. |
Multitasking | The execution of multiple applications or threads so that they appear to run simultaneously. |
Multithreading | Multitasking among the threads of the same process or application. |
Process | One or more threads together with the resources allocated to them, including a virtual address space. Each instance of an application runs within its own process. |
Heterogeneous Network | A network composed of different kinds of computers with different platforms and architectures. |
Thread | A single path or entity of execution within one process. |
References:
[1] G. Pfister, In Search of Clusters (Prentice-Hall, Inc., 1995), page 72.
[2] S. F. Hummel, E. Schonberg, and L. E. Flynn, "Factoring: a method for scheduling parallel loops," Communications of the ACM, August 1992.
[3] T. H. Tzen and L. M. Ni, IEEE Transactions on Parallel and Distributed Systems, January 1993.