Workflow Scheduling Algorithms in the Grid

Thumbnail Image
Dong, Fangpeng
Grid , Scheduling Algorithm , Workflow , DAG , Rescheduling
The development of wide-area networks and the availability of powerful computers as low-cost commodity components are changing the face of computation. These progresses in technology make it possible to utilize geographically distributed resources in multiple owner domains to solve large-scale problems in science, engineering and commerce. Research on this topic has led to the emergence of Grid computing. To achieve the promising potentials of tremendous distributed resources in the Grid, effective and efficient scheduling algorithms are fundamentally important. However, scheduling problems are well known for their intractability, and many of instances are in fact NP-Complete. The situation becomes even more challenging in the Grid circumstances due to some unique characteristics of the Grid. Scheduling algorithms in traditional parallel and distributed systems, which usually run on homogeneous and dedicated resources, cannot work well in the new environments. This work focuses on workflow scheduling algorithms in the Grid scenario. New challenges are discussed, previous research in this realm is surveyed, and novel heuristic algorithms addressing the challenges are proposed and tested. The proposed algorithms contribute to the literature by taking the following factors into account when a schedule for a DAG-based workflow is produced: predictable performance fluctuation and non-deterministic performance model of Grid resources, the computation and data staging co-scheduling, the clustering characteristic of Grid resource distribution, and the ability to reschedule according to performance change after the initial schedule is made. The performance of proposed algorithms are tested and analyzed by simulation under different workflow and resource configurations.
External DOI