Algorithms for tree-shaped task partition and allocation on heterogeneous multiprocessors

He, Suna; Wu, Jigang; Wei, Bing; Wu, Jiaxin

doi:10.1007/s11227-023-05186-3

Published: 22 March 2023

Volume 79, pages 13210–13240, (2023)

The Journal of Supercomputing

Abstract

An application consisting of a set of dependent distributed tasks is generally modeled as a directed acyclic graph or an out-tree. Tree-shaped task graphs are widely applied in a variety of computational domains, including electronic structure computations and sparse matrix factorization. Efficient algorithms for tree-shaped task partition and allocation can largely determine the performance of heterogeneous computing systems, as most relevant publications have pointed out. This paper presents efficient algorithms for partitioning and allocating tree-shaped tasks on heterogeneous multiprocessor systems with limited memory to improve task-parallel computing. The proposed main algorithm consists of two stages: partition and allocation. In the partition stage, an algorithm partitions a task tree into multiple subtrees by iteratively partitioning the subtrees on the critical path of the quotient tree. In the allocation stage, two algorithms are proposed to minimize the task tree’s execution time. One preferentially allocates the largest subtree of the whole tree, and the other preferentially allocates the subtree located on the quotient tree’s critical path. Experimental results show that the proposed algorithms significantly improve on the latest existing work in terms of average makespan, both on randomly generated trees and on a real-world dataset. On the real-world dataset, the average makespan of existing work is approximately $6.28\times 10^8$, whereas that of the proposed algorithm is approximately $2.13\times 10^8$, a reduction of 64.33%. On randomly generated trees, the average makespan of existing work is approximately 8973, whereas that of the proposed algorithm is approximately 4040, a reduction of 54.96%.
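The abstract refers to the critical path of the quotient tree without defining it on this page. The following is a minimal sketch of that notion, not the authors' algorithm. It assumes that each part of the partition is a connected subtree, that the cost of a part is the sum of its task weights on a unit-speed processor, and that communication costs are ignored. The function names `quotient_tree` and `critical_path` and the example tree are hypothetical.

```python
from collections import defaultdict


def quotient_tree(parent, weight, part):
    """Collapse each part of a partitioned task tree into one quotient node.

    parent: task -> parent task (None for the root)
    weight: task -> execution cost (assumed: unit-speed processor)
    part:   task -> label of the subtree (part) that contains the task
    """
    q_weight = defaultdict(int)
    q_parent = {}
    for node, w in weight.items():
        q_weight[part[node]] += w
    for node, p in parent.items():
        if p is None:
            q_parent[part[node]] = None      # part that contains the root
        elif part[p] != part[node]:
            q_parent[part[node]] = part[p]   # tree edge crossing two parts
    return dict(q_weight), q_parent


def critical_path(q_weight, q_parent):
    """Return (length, path) of the heaviest root-to-leaf path."""
    children = defaultdict(list)
    root = None
    for q, p in q_parent.items():
        if p is None:
            root = q
        else:
            children[p].append(q)

    def longest(q):
        best_len, best_path = 0, []
        for c in children[q]:
            length, path = longest(c)
            if length > best_len:
                best_len, best_path = length, path
        return q_weight[q] + best_len, [q] + best_path

    return longest(root)


# Hypothetical 7-task tree split into four parts A, B, C, D.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
weight = {0: 2, 1: 3, 2: 1, 3: 4, 4: 1, 5: 2, 6: 2}
part = {0: "A", 2: "A", 1: "B", 3: "B", 4: "B", 5: "C", 6: "D"}

q_weight, q_parent = quotient_tree(parent, weight, part)
length, path = critical_path(q_weight, q_parent)
print(length, path)  # 11 ['A', 'B']
```

In this example part B (weight 8) lies on the critical path, so under the stated assumptions it would be the natural candidate for further partitioning or for preferential allocation.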

Data availability

The assembly tree dataset was generated from a set of sparse matrices, which can be obtained from the University of Florida Sparse Matrix Collection: http://www.cise.ufl.edu/research/sparse/matrices/.

Acknowledgements

Part of this work was presented at the 2021 IEEE International Conference on Parallel & Distributed Processing with Applications, Sept. 30 – Oct. 3, 2021, New York, USA.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grants 62072118 and 62202108, by the Huangpu International Sci & Tech Cooperation Foundation of Guangzhou, China under Grant No. 2021GH12, and by the Guangdong Natural Science Foundation under Grant Nos. 2023A1515011230, 2023A1515030183 and 2021B1515120010.

Author information

Authors and Affiliations

School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006, China
Suna He, Jigang Wu, Bing Wei & Jiaxin Wu

Contributions

S.H., J.W. and B.W. conceived of the presented idea. J.W. encouraged S.H. to investigate the critical path and supervised the findings of this work. S.H. carried out the experiment and wrote the main manuscript text. All authors discussed the results and revised the manuscript.

Corresponding author

Correspondence to Jigang Wu.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

He, S., Wu, J., Wei, B. et al. Algorithms for tree-shaped task partition and allocation on heterogeneous multiprocessors. J Supercomput 79, 13210–13240 (2023). https://doi.org/10.1007/s11227-023-05186-3

Accepted: 09 March 2023
Published: 22 March 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s11227-023-05186-3
