Abstract
Markov decision processes provide a formal framework for a computer to make decisions autonomously and intelligently when the effects of its actions are not deterministic. This formalism has had tremendous success in many disciplines; however, its implementation on platforms with scarce computing capabilities and power, as happens in robotics or autonomous driving, is still limited. High-performance accelerator hardware and parallelized software can help solve this computationally complex problem efficiently under these constraints. In particular, in this work we compare off-line-tuned static and dynamic heterogeneous scheduling strategies with adaptive ones for executing value iteration (a core procedure in many decision-making methods, such as reinforcement learning and task planning) on a low-power heterogeneous CPU+GPU SoC that uses only 10–15 W. Our experimental results show that CPU+GPU heterogeneous strategies considerably reduce the computation time and energy required: they can be up to 54% faster and 57% more energy-efficient than a multicore (TBB) implementation, and up to 61% faster and 65% more energy-efficient than a GPU-only (OpenCL) implementation. Additionally, we explore the impact of raising the abstraction level of the programming model to ease the programming effort. To that end, we compare the TBB+OpenCL and TBB+oneAPI implementations of our heterogeneous schedulers, observing that the oneAPI versions require up to \(5\times\) less programming effort and incur only 3–8% overhead if the scheduling strategy is selected carefully.
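For readers unfamiliar with value iteration, the following is a minimal sequential sketch in Python. It is not the TBB, OpenCL, or oneAPI implementation evaluated in this work. The function name `value_iteration`, the `transitions` layout, the discount factor, the tolerance, and the two-state toy problem are all assumptions made for illustration. Each sweep updates every state from the values of the previous sweep only, so the per-state updates within a sweep are independent of each other; that independence is what makes this kind of loop a natural candidate for parallel execution.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: transitions[state][action] is a list of
# (probability, next_state, reward) outcomes for taking that action.
Transitions = Dict[int, Dict[int, List[Tuple[float, int, float]]]]


def q_value(outcomes, values, gamma):
    """Expected discounted return of one action given current value estimates."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9,
                    tol: float = 1e-6, max_sweeps: int = 10_000):
    """Repeat Bellman backups until the largest per-state change is below tol."""
    values = {s: 0.0 for s in transitions}
    for _ in range(max_sweeps):
        # Every state is updated from the previous sweep's values only,
        # so the iterations of this loop do not depend on each other.
        new_values = {
            s: max(q_value(outcomes, values, gamma) for outcomes in actions.values())
            for s, actions in transitions.items()
        }
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            break
    # Greedy policy: in each state, pick the action with the best expected return.
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in transitions.items()
    }
    return values, policy


# Toy problem (illustrative only): state 1 is absorbing with zero reward.
# In state 0, action 0 stays put; action 1 reaches state 1 with probability 0.8
# and earns reward 1, otherwise it stays in state 0 with no reward.
toy_mdp: Transitions = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 1, 0.0)]},
}

values, policy = value_iteration(toy_mdp)
print(values, policy)
```

In the toy problem the fixed point for state 0 satisfies V(0) = 0.8 + 0.2 · 0.9 · V(0), so the sketch should converge to roughly 0.8 / 0.82 for that state and select action 1 there.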









Acknowledgements
This work is a result of the research project TIN2016-80920-R, funded by the Spanish Government. It has also been supported by Junta de Andalucía under research projects UMA18-FEDERJA-108, UMA18-FEDERJA-113, and TEP-2279.
About this article
Cite this article
Constantinescu, DA., Navarro, A., Corbera, F. et al. Efficiency and productivity for decision making on low-power heterogeneous CPU+GPU SoCs. J Supercomput 77, 44–65 (2021). https://doi.org/10.1007/s11227-020-03257-3
DOI: https://doi.org/10.1007/s11227-020-03257-3