Abstract
A Task Graph (TG) models a parallel program as a set of subtasks that can be executed simultaneously on different processing elements and that exchange data via an interconnection network. The dependencies between subtasks are described by a Directed Acyclic Graph (DAG). Because of these characteristics, scheduling a TG requires dedicated or uninterruptible resources. Moreover, scheduling a TG by itself results in low resource utilization because of the dependencies among the subtasks. To address both problems, we propose a scheduling approach for TGs that uses advance reservation in a cluster environment. To improve resource utilization, we further propose interweaving one or more TGs within the same reservation block and/or backfilling with independent jobs.
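The utilization problem described in the abstract can be illustrated with a small sketch. The code below is not the paper's algorithm: it is a minimal greedy list scheduler, assuming zero communication cost and identical processing elements, that places a four-task TG on a two-element reservation block, lists the idle gaps the dependencies leave behind, and fills those gaps first-fit with independent jobs. All names (`list_schedule`, `idle_gaps`, `backfill`, the tasks `A` to `D`, the jobs `J1` and `J2`) are hypothetical and chosen only for illustration.

```python
def list_schedule(durations, deps, num_pe):
    """Greedy list scheduling of a DAG (assumption: no communication cost).

    durations: {task: runtime}; deps: {task: set of predecessor tasks}.
    Returns {task: (pe, start, end)}.
    """
    schedule = {}
    pe_free = [0] * num_pe          # time at which each PE becomes free
    remaining = set(durations)
    while remaining:
        # A task is ready once all of its predecessors have been scheduled.
        ready = sorted(t for t in remaining
                       if all(p in schedule for p in deps.get(t, ())))
        if not ready:
            raise ValueError("dependency cycle: not a DAG")
        for task in ready:
            data_ready = max((schedule[p][2] for p in deps.get(task, ())),
                             default=0)
            # Pick the PE giving the earliest start; ties go to the lowest index.
            start, pe = min((max(pe_free[i], data_ready), i)
                            for i in range(num_pe))
            end = start + durations[task]
            schedule[task] = (pe, start, end)
            pe_free[pe] = end
            remaining.remove(task)
    return schedule


def idle_gaps(schedule, num_pe):
    """Return idle intervals (pe, start, end) up to the makespan."""
    makespan = max(end for _, _, end in schedule.values())
    gaps = []
    for pe in range(num_pe):
        cursor = 0
        for _, start, end in sorted(v for v in schedule.values() if v[0] == pe):
            if start > cursor:
                gaps.append((pe, cursor, start))
            cursor = end
        if cursor < makespan:
            gaps.append((pe, cursor, makespan))
    return gaps


def backfill(gaps, jobs):
    """First-fit placement of independent jobs (name, runtime) into idle gaps."""
    gaps = list(gaps)
    placed = {}
    for name, runtime in jobs:
        for i, (pe, start, end) in enumerate(gaps):
            if end - start >= runtime:
                placed[name] = (pe, start, start + runtime)
                gaps[i] = (pe, start + runtime, end)  # shrink the used gap
                break
    return placed


# Hypothetical diamond-shaped TG: A precedes B and C, which both precede D.
durations = {"A": 2, "B": 3, "C": 1, "D": 2}
deps = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}

schedule = list_schedule(durations, deps, num_pe=2)
gaps = idle_gaps(schedule, num_pe=2)
filled = backfill(gaps, [("J1", 4), ("J2", 2)])
```

In this example the TG alone keeps the two processing elements busy for 8 of the 14 reserved time units, and the two independent jobs fit exactly into the remaining idle intervals on the second element. Interweaving a second TG into the same block would follow the same idea, with the added constraint that its own dependencies must also be respected.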
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sulistio, A., Schiffmann, W., Buyya, R. (2006). Advanced Reservation-Based Scheduling of Task Graphs on Clusters. In: Robert, Y., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing - HiPC 2006. HiPC 2006. Lecture Notes in Computer Science, vol 4297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11945918_12
DOI: https://doi.org/10.1007/11945918_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68039-0
Online ISBN: 978-3-540-68040-6
eBook Packages: Computer Science (R0)