Definition
Hierarchical reinforcement learning (HRL) decomposes a reinforcement learning problem into a hierarchy of subproblems or subtasks such that higher-level parent tasks invoke lower-level child tasks as if they were primitive actions. A decomposition may have multiple levels of hierarchy. Some or all of the subproblems can themselves be reinforcement learning problems. When a parent task is formulated as a reinforcement learning problem, it is commonly formalized as a semi-Markov decision problem, because its actions are child tasks that persist for an extended period of time. The advantage of hierarchical decomposition is a reduction in computational complexity, provided the overall problem can be represented more compactly and reusable subtasks can be learned or provided independently. While the solution to an HRL problem is optimal given the constraints of the hierarchy, there is no guarantee in general that the decomposed solution is an optimal solution to the original reinforcement learning problem.
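The semi-Markov formulation can be sketched with a small example. The corridor task, the option set, and all names and parameters below are hypothetical illustrations, not from this entry: SMDP Q-learning treats a temporally extended option as a single action and discounts the backed-up value by gamma**k when the option lasts k primitive steps.

```python
import random

# Minimal SMDP Q-learning sketch on a toy 1-D corridor (hypothetical example).
# States 0..N-1; reaching the goal state yields reward 1. "Options" are
# temporally extended actions: each repeats a primitive step several times,
# so the Q-update discounts by gamma**k, where k is the option's duration --
# the defining feature of the semi-Markov decision problem formulation.

N = 8
GOAL = N - 1
GAMMA, ALPHA, EPS = 0.9, 0.5, 0.2

# Each option is (step, duration): repeat the primitive step `duration` times.
OPTIONS = [(+1, 1), (-1, 1), (+1, 3)]  # right, left, and a 3-step "macro" right

def run_option(state, option):
    """Execute an option to completion; return (next_state, reward, duration)."""
    step, duration = option
    reward, k = 0.0, 0
    for _ in range(duration):
        state = min(max(state + step, 0), N - 1)
        k += 1
        if state == GOAL:
            reward += GAMMA ** (k - 1)  # reward discounted back to option start
            break
    return state, reward, k

def smdp_q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    Q = {(s, o): 0.0 for s in range(N) for o in range(len(OPTIONS))}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # Epsilon-greedy choice over options (not primitive actions).
            if rng.random() < EPS:
                o = rng.randrange(len(OPTIONS))
            else:
                o = max(range(len(OPTIONS)), key=lambda a: Q[(s, a)])
            s2, r, k = run_option(s, OPTIONS[o])
            target = 0.0 if s2 == GOAL else max(
                Q[(s2, a)] for a in range(len(OPTIONS)))
            # SMDP Q-update: discount the bootstrap target by gamma**k.
            Q[(s, o)] += ALPHA * (r + GAMMA ** k * target - Q[(s, o)])
            s = s2
    return Q

Q = smdp_q_learning()
```

Because the return is discounted per primitive step, the macro-action and the chain of primitive steps converge to the same value here; the macro simply reaches the goal in fewer decisions, which is the computational saving the hierarchy buys.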
Recommended Reading
Ashby R (1956) Introduction to cybernetics. Chapman & Hall, London
Barto A, Mahadevan S (2003) Recent advances in hierarchical reinforcement learning. Spec Issue Reinf Learn Discret Event Syst J 13:41–77
Dayan P, Hinton GE (1992) Feudal reinforcement learning. In: Advances in neural information processing systems 5 NIPS conference, Denver, 2–5 Dec 1991. Morgan Kaufmann, San Francisco
Dietterich TG (2000) Hierarchical reinforcement learning with the MAXQ value function decomposition. J Artif Intell Res 13:227–303
Digney BL (1998) Learning hierarchical control structures for multiple tasks and changing environments. In: From animals to animats 5: proceedings of the fifth international conference on simulation of adaptive behaviour, SAB 98, Zurich, 17–21 Aug 1998. MIT, Cambridge
Ghavamzadeh M, Mahadevan S (2002) Hierarchically optimal average reward reinforcement learning. In: Sammut C, Hoffmann A (eds) Proceedings of the nineteenth international conference on machine learning, Sydney. Morgan Kaufmann, San Francisco, pp 195–202
Hauskrecht M, Meuleau N, Kaelbling LP, Dean T, Boutilier C (1998) Hierarchical solution of Markov decision processes using macro-actions. In: Fourteenth annual conference on uncertainty in artificial intelligence, Madison, pp 220–229
Hengst B (2008) Partial order hierarchical reinforcement learning. In: Australasian conference on artificial intelligence, Auckland, Dec 2008. Springer, Berlin, pp 138–149
Jonsson A, Barto A (2006) Causal graph based decomposition of factored MDPs. J Mach Learn Res 7:2259–2301
Kaelbling LP (1993) Hierarchical learning in stochastic domains: preliminary results. In: Proceedings of the tenth international conference on machine learning. Morgan Kaufmann, San Mateo, pp 167–173
Konidaris G, Barto A (2009) Skill discovery in continuous reinforcement learning domains using skill chaining. In: Bengio Y, Schuurmans D, Lafferty J, Williams CKI, Culotta A (eds) Advances in neural information processing systems 22, Vancouver, pp 1015–1023
McGovern A (2002) Autonomous discovery of abstractions through interaction with an environment. In: SARA. Springer, London, pp 338–339
Moore A, Baird L, Kaelbling LP (1999) Multi-value functions: efficient automatic action hierarchies for multiple goal MDPs. In: Proceedings of the international joint conference on artificial intelligence, Stockholm. Morgan Kaufmann, San Francisco, pp 1316–1323
Parr R, Russell SJ (1997) Reinforcement learning with hierarchies of machines. In: NIPS, Denver
Puterman ML (1994) Markov decision processes: discrete stochastic dynamic programming. Wiley, New York
Ryan MRK, Reid MD (2000) Using ILP to improve planning in hierarchical reinforcement learning. In: Proceedings of the tenth international conference on inductive logic programming, ILP 2000, London. Springer, London
Singh S (1992) Reinforcement learning with a hierarchy of abstract models. In: Proceedings of the tenth national conference on artificial intelligence, San Jose
Sutton RS, Precup D, Singh SP (1999) Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif Intell 112(1–2):181–211
Watkins CJCH (1989) Learning from delayed rewards. PhD thesis, King’s College
© 2017 Springer Science+Business Media New York
Hengst, B. (2017). Hierarchical Reinforcement Learning. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_363
Print ISBN: 978-1-4899-7685-7
Online ISBN: 978-1-4899-7687-1