04 Mar 2021
14:30

via Zoom

PhD Defense Computer Science: Ahmed Hamdy Mohamed Eleliemy

Multilevel Scheduling of Computations on Parallel Large-scale Systems

Scientists are eager to utilize computing resources to execute scientific applications that advance the understanding of various complex phenomena. This eagerness drives rapid technological developments in high performance computing (HPC). Modern HPC systems exhibit rapid growth in the number of cores per computing node and the number of computing nodes per system. As such, modern HPC systems offer additional levels of hardware parallelism at the core, node, and system levels. Each level requires and employs techniques for appropriate scheduling of the computational work at the respective level. These scheduling techniques work separately, without coordination, and each is designed to achieve its own performance targets. Currently, the absence of coordination between schedulers at different levels is an open research problem. In many cases, independent scheduling decisions degrade applications’ performance and lead to inefficient resource usage on contemporary HPC systems. To address this problem, this doctoral dissertation poses the following research question: How can scheduling exploit the multiple levels of hardware parallelism of modern HPC systems to enhance scientific applications’ performance and increase utilization of HPC resources?

Understanding the relation between different scheduling levels is crucial for answering the aforementioned research question. However, it is challenging due to (1) the absence of methods, models, and tools to examine and analyze the interaction and the mutual impact of these scheduling levels, and (2) the different nature and performance targets of each of these scheduling levels. This doctoral dissertation addresses these challenges in the context of two specific HPC scheduling classes: queuing-based job scheduling at the batch level and dynamic loop self-scheduling (DLS) at the application level. One of the main contributions of this doctoral dissertation is the proposal and evaluation of a multilevel scheduling (MLS) prototype that addresses the problem by bridging the schedulers at different scheduling levels. The MLS prototype aims to (1) decrease applications’ execution time and (2) increase system utilization. The MLS prototype employs two novel scheduling approaches introduced in this doctoral dissertation. The first approach is the distributed chunk calculation approach (DCA) and its hierarchical version (HDCA). The second approach is the resourceful coordination approach (RCA).

At the application level, DCA and HDCA address the scalability challenge of existing DLS implementations. We apply DCA and HDCA to several state-of-the-art DLS techniques and show how they reduce applications’ execution time, thereby fulfilling the MLS prototype’s first target. At the batch level, RCA enables application schedulers to share their allocated but idle computing resources with other applications through a batch system. The significance of RCA is that it combines the advantages of node sharing and dynamic resource management. It offers efficient resource sharing and avoids shrinkage and expansion operations on the application side. RCA allows batch systems to reassign computing resources once they become free, thereby fulfilling the MLS prototype’s second target. By employing DCA and RCA, the MLS prototype answers the aforementioned research question and shows a creative and useful way of exploiting the multilevel parallelism of modern HPC systems through scheduling.

This doctoral dissertation advances the state of the art by demonstrating the usefulness and the performance potential of coordinated scheduling decisions at different levels. Moreover, it introduces methods and tools that allow the HPC community to analyze the mutual impact of scheduling decisions at different scheduling levels.

The talk is open to members of the university.

For the access link, please contact Prof. Florina Ciorba (florina.ciorba@unibas.ch).

