Abstract
Previous studies have demonstrated agile and robust locomotion for quadrupedal robots over challenging terrain. However, the bipedal locomotion mode of quadruped robots remains largely unverified. This paper explores adapting a learning framework originally designed for quadrupedal robots to blind locomotion in bipedal mode. We leverage a framework that combines Adversarial Motion Priors with a teacher-student policy to imitate a reference trajectory while traversing rough terrain. We transfer and evaluate this learning framework on a quadruped robot in bipedal mode, aiming for stable walking on both flat and complex terrain. Simulation results demonstrate that the trained policy enables the quadruped robot to traverse both flat ground and challenging terrain, including stairs and uneven surfaces.
This work was supported by the Royal Society [grant number RG\R2\232409] and the UKRI Future Leaders Fellowship [grant number MR/V025333/1]. For a visual overview of our framework and results, please refer to the video at https://youtu.be/JYD1RlrQRWM.
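For readers unfamiliar with the Adversarial Motion Priors (AMP) formulation the abstract builds on, the sketch below illustrates the standard AMP-style reward: a discriminator scores state transitions against reference-motion data, and its output is mapped to a bounded style reward that is combined with a task reward during reinforcement-learning training. This is a minimal PyTorch sketch of the published AMP formulation, not the authors' implementation; the network sizes and reward weights are illustrative assumptions.

```python
# Minimal sketch (assumption, not the authors' code) of the standard AMP-style
# reward. A discriminator D scores state transitions (s, s') against
# reference-motion transitions; its output is mapped to a bounded style reward
# and mixed with a task reward for policy optimization (e.g. PPO).

import torch
import torch.nn as nn

class AMPDiscriminator(nn.Module):
    """Scores (s, s') transitions; trained to output +1 on reference data, -1 on policy data."""
    def __init__(self, transition_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(transition_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s: torch.Tensor, s_next: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([s, s_next], dim=-1)).squeeze(-1)

def amp_style_reward(disc: AMPDiscriminator, s, s_next) -> torch.Tensor:
    """r_style = max(0, 1 - 0.25 * (D(s, s') - 1)^2), bounded in [0, 1]."""
    d = disc(s, s_next)
    return torch.clamp(1.0 - 0.25 * (d - 1.0) ** 2, min=0.0)

def total_reward(r_task, r_style, w_task: float = 0.5, w_style: float = 0.5):
    """Weighted sum of task and style terms (the weights here are illustrative)."""
    return w_task * r_task + w_style * r_style
```

In a teacher-student setup such as the one described in the abstract, a reward of this form would typically drive the privileged teacher policy, with the blind student later distilled from it using proprioceptive observations only; that distillation step is not shown here.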
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Peng, T., Bao, L., Humphreys, J., Delfaki, A.M., Kanoulas, D., Zhou, C. (2025). Learning Bipedal Walking on a Quadruped Robot via Adversarial Motion Priors. In: Huda, M.N., Wang, M., Kalganova, T. (eds) Towards Autonomous Robotic Systems. TAROS 2024. Lecture Notes in Computer Science, vol 15052. Springer, Cham. https://doi.org/10.1007/978-3-031-72062-8_11
DOI: https://doi.org/10.1007/978-3-031-72062-8_11