Abstract
In a warehouse environment, tasks appear dynamically. Consequently, a task management system that matches them with the workforce too early (e.g., weeks in advance) is necessarily sub-optimal. Moreover, the rapidly growing action space of such a system poses a significant problem for traditional schedulers. Reinforcement learning, however, is well suited to problems that require sequential decisions towards a long-term, often remote, goal. In this work, we address a problem with a hierarchical structure: task scheduling by a centralised agent in a dynamic multi-agent warehouse environment, and the execution of such a schedule by decentralised agents with only partial observability thereof. We propose to use deep reinforcement learning to solve both the high-level scheduling problem and the low-level multi-agent problem of schedule execution. The topic and contribution are relevant to both the reinforcement learning and operations research communities and are directed towards future real-world industrial applications.
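The two-level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: simple greedy rules stand in for the learned deep reinforcement learning policies, partial observability is assumed to mean a limited local view of other agents' positions on a grid, and every name (`Task`, `schedule`, `local_observation`, `act`, `run`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Pos = Tuple[int, int]


@dataclass
class Task:
    task_id: int
    location: Pos


def manhattan(a: Pos, b: Pos) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def schedule(tasks: List[Task], agent_positions: Dict[str, Pos]) -> Dict[str, Task]:
    """High-level, centralised step: assign each pending task to a free agent.

    Assumption: a nearest-agent greedy rule stands in for a learned scheduler.
    """
    free = dict(agent_positions)
    assignment: Dict[str, Task] = {}
    for task in tasks:
        if not free:
            break
        nearest = min(free, key=lambda a: manhattan(free[a], task.location))
        assignment[nearest] = task
        del free[nearest]
    return assignment


def local_observation(agent: str, positions: Dict[str, Pos], radius: int = 1) -> Dict[str, Pos]:
    """Partial observability (assumed): only agents within `radius` cells are seen."""
    own = positions[agent]
    return {
        other: p
        for other, p in positions.items()
        if other != agent and max(abs(p[0] - own[0]), abs(p[1] - own[1])) <= radius
    }


def act(own: Pos, goal: Pos, visible: Dict[str, Pos]) -> Pos:
    """Low-level, decentralised step: move one cell towards the goal, or wait if blocked."""
    dx = (goal[0] > own[0]) - (goal[0] < own[0])
    dy = 0 if dx else (goal[1] > own[1]) - (goal[1] < own[1])
    nxt = (own[0] + dx, own[1] + dy)
    return own if nxt in visible.values() else nxt


def run(tasks: List[Task], positions: Dict[str, Pos], max_steps: int = 20):
    """Schedule once, then let each agent execute its task from local observations."""
    positions = dict(positions)
    assignment = schedule(tasks, positions)
    done: List[int] = []
    for _ in range(max_steps):
        for agent, task in list(assignment.items()):
            visible = local_observation(agent, positions)
            positions[agent] = act(positions[agent], task.location, visible)
            if positions[agent] == task.location:
                done.append(task.task_id)
                del assignment[agent]
        if not assignment:
            break
    return done, positions


done, final_positions = run(
    [Task(0, (2, 0)), Task(1, (0, 3))],
    {"a": (0, 0), "b": (0, 1)},
)
```

In this sketch the scheduler sees all agent positions, while each executing agent decides only from its own position, its goal, and whatever falls inside its local view. Replacing `schedule` and `act` with trained policies would keep the same interface.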
Acknowledgements
This work was supported by Fundação para a Ciência e a Tecnologia under project UIDB/50021/2020 and scholarship 2020.05360.BD.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Carvalho, D., Sengupta, B. (2022). Hierarchically Structured Scheduling and Execution of Tasks in a Multi-agent Environment. In: Marreiros, G., Martins, B., Paiva, A., Ribeiro, B., Sardinha, A. (eds) Progress in Artificial Intelligence. EPIA 2022. Lecture Notes in Computer Science, vol 13566. Springer, Cham. https://doi.org/10.1007/978-3-031-16474-3_2
DOI: https://doi.org/10.1007/978-3-031-16474-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16473-6
Online ISBN: 978-3-031-16474-3
eBook Packages: Computer Science (R0)