Movement Generation and Control Conference Paper 2021

DeepQ Stepper: A framework for reactive dynamic walking on uneven terrain

Reactive stepping and push recovery for biped robots are often restricted to flat terrain because of the difficulty of computing capture regions for nonlinear dynamic models. In this paper, we address this limitation by using reinforcement learning to approximately learn the 3D capture region for such systems. We propose a novel 3D reactive stepper, the DeepQ stepper, that computes optimal step locations for walking at different velocities using the 3D capture regions approximated by the action-value function. We demonstrate the ability of the approach to learn stepping with a simplified 3D pendulum model and with full robot dynamics. Further, the stepper achieves higher performance when it learns approximate capture regions while taking into account the entire dynamics of the robot, which are often ignored in existing reactive steppers based on simplified models. The DeepQ stepper can handle non-convex terrain with obstacles, walk on restricted surfaces like stepping stones, and recover from external disturbances at a constant computational cost.
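
The idea of choosing a step location from an action-value function can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_step`, the 3x3 grid of candidate offsets, the placeholder scores, and the convention that a higher value is better are all assumptions made for illustration (a cost-based formulation would take the minimum instead). The sketch shows how a terrain mask can exclude invalid footholds while the selection still costs a single pass over a fixed set of candidates.

```python
import numpy as np

def select_step(q_values, candidates, feasible):
    """Return the feasible candidate step with the highest action value.

    q_values   : (N,) scores, one per candidate step (assumed: higher is better)
    candidates : (N, 3) candidate step locations
    feasible   : (N,) boolean mask, False where the terrain forbids a foothold
    """
    if not np.any(feasible):
        raise ValueError("no feasible step location")
    # Infeasible candidates can never win the argmax.
    masked = np.where(feasible, q_values, -np.inf)
    best = int(np.argmax(masked))
    return candidates[best], best

# Hypothetical 3x3 grid of step offsets (metres) relative to the stance foot.
xs = np.array([-0.1, 0.0, 0.1])
ys = np.array([0.1, 0.2, 0.3])
candidates = np.array([(x, y, 0.0) for x in xs for y in ys])

# Placeholder scores standing in for the output of a trained network.
q_values = np.array([0.1, 0.4, 0.2, 0.3, 0.9, 0.5, 0.2, 0.6, 0.1])

# Terrain mask: the unconstrained best candidate (index 4) is not a valid foothold.
feasible = np.ones(len(candidates), dtype=bool)
feasible[4] = False

step, idx = select_step(q_values, candidates, feasible)
```

Because the candidate set has a fixed size, the selection takes the same amount of work regardless of how the mask is shaped, which is one way to read the constant computational cost mentioned in the abstract.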

Author(s): Avadesh Meduri and Majid Khadiv and Ludovic Righetti
Year: 2021
Month: June
Bibtex Type: Conference Paper (conference)
Event Name: The 2021 International Conference on Robotics and Automation (ICRA 2021)
Event Place: Xi’an, China
State: Published
URL: https://arxiv.org/pdf/2010.14834.pdf
Digital: True
Electronic Archiving: grant_archive

BibTeX

@conference{meduri2021deepq,
  title = {DeepQ Stepper: A framework for reactive dynamic walking on uneven terrain},
  abstract = {Reactive stepping and push recovery for biped robots are often restricted to flat terrain because of the difficulty of computing capture regions for nonlinear dynamic models. In this paper, we address this limitation by using reinforcement learning to approximately learn the 3D capture region for such systems. We propose a novel 3D reactive stepper, the DeepQ stepper, that computes optimal step locations for walking at different velocities using the 3D capture regions approximated by the action-value function. We demonstrate the ability of the approach to learn stepping with a simplified 3D pendulum model and with full robot dynamics. Further, the stepper achieves higher performance when it learns approximate capture regions while taking into account the entire dynamics of the robot, which are often ignored in existing reactive steppers based on simplified models. The DeepQ stepper can handle non-convex terrain with obstacles, walk on restricted surfaces like stepping stones, and recover from external disturbances at a constant computational cost.},
  month = jun,
  year = {2021},
  slug = {meduri2021deepq},
  author = {Meduri, Avadesh and Khadiv, Majid and Righetti, Ludovic},
  url = {https://arxiv.org/pdf/2010.14834.pdf},
  month_numeric = {6}
}