Learning Spring Mass Locomotion: Guiding Policies with a Reduced-Order Model

October 21, 2020 · Entered Twilight · 🏛 IEEE Robotics and Automation Letters

🌅 TWILIGHT: Old Age
Predates the code-sharing era – a pioneer of its time

"Last commit was 5.0 years ago (โ‰ฅ5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, AslipTestTSMatching.py, AslipTestVaryVel.py, LICENSE, RL_controller_aslip.py, RL_controller_aslip_keyboard.py, RL_controller_aslip_nodelta.py, RL_controller_aslip_radio_only.py, RL_controller_aslip_single_speed.py, RL_controller_aslip_trajinput.py, RL_controller_aslip_trajinput_TS_log_switch.py, _RL_controller_aslip_nodelta.py, apex-logo.png, apex.py, cassie, cassie_top_white.png, deprecated, eval_perturb.py, hardware_logs, plotData.py, plots, post_TS_test_log.py, post_VaryVel_test_log.py, post_process_log.py, post_process_visualize.py, readme.md, renderpol.py, rl, setup.py, testTS_logs, testVaryVel_logs, test_reference_traj.py, tools, trained_models, vis_perturb.py

Authors: Kevin Green, Yesh Godse, Jeremy Dao, Ross L. Hatton, Alan Fern, Jonathan Hurst
arXiv ID: 2010.11234
Category: cs.RO (Robotics)
Citations: 59
Venue: IEEE Robotics and Automation Letters
Repository: https://github.com/osudrl/ASLIP-RL (⭐ 8)
Last checked: 1 month ago
Abstract
In this paper, we describe an approach to achieve dynamic legged locomotion on physical robots which combines existing methods for control with reinforcement learning. Specifically, our goal is a control hierarchy in which the highest-level behaviors are planned through reduced-order models, which describe the fundamental physics of legged locomotion, and lower-level controllers utilize a learned policy that can bridge the gap between the idealized, simple model and the complex, full-order robot. The high-level planner can use a model of the environment and be task-specific, while the low-level learned controller can execute a wide range of motions, so that it applies to many different tasks. In this letter we describe this learned dynamic walking controller and show that a range of walking motions from reduced-order models can be used as the command and primary training signal for learned policies. The resulting policies do not attempt to naively track the motion (as a traditional trajectory-tracking controller would) but instead balance immediate motion tracking with long-term stability. The resulting controller is demonstrated on a human-scale, unconstrained, untethered bipedal robot at speeds up to 1.2 m/s. This letter builds the foundation of a generic, dynamic, learned walking controller that can be applied to many different tasks.
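The hierarchy the abstract describes (a reduced-order spring-mass plan on top, a learned policy below that is conditioned on the plan rather than tracking it blindly) can be sketched in a few lines. The Python below is a minimal illustration under assumed names and constants, not the ASLIP-RL implementation: `slip_reference`, `policy_observation`, the Euler integrator, and all gains are hypothetical stand-ins.

```python
import numpy as np

def slip_reference(com_pos, com_vel, dt=0.0005, k=2000.0, m=32.0, l0=1.0):
    """One explicit-Euler step of a planar spring-mass (SLIP) stance model.

    com_pos is the center of mass relative to the stance foot; the leg acts
    as a radial spring of stiffness k and rest length l0. All constants are
    illustrative placeholders.
    """
    l = np.linalg.norm(com_pos)
    # Spring force along the leg plus gravity on the point mass.
    force = k * (l0 - l) * (com_pos / l) + np.array([0.0, -9.81 * m])
    com_vel = com_vel + (force / m) * dt
    com_pos = com_pos + com_vel * dt
    return com_pos, com_vel

def policy_observation(robot_state, reference_window):
    """Build the learned policy's input: proprioception concatenated with a
    short window of upcoming reduced-order reference states, so the policy
    is conditioned on the planned motion instead of on a fixed clock."""
    return np.concatenate([robot_state, np.asarray(reference_window).ravel()])

# Roll the toy model forward to produce a reference window for the policy.
pos, vel = np.array([0.0, 0.95]), np.array([0.8, 0.0])
window = []
for _ in range(4):
    pos, vel = slip_reference(pos, vel)
    window.append(np.concatenate([pos, vel]))
obs = policy_observation(np.zeros(40), window)  # 40-dim robot state is a placeholder
```

In a setup like this, tracking error against the reference would enter only through the training reward, which is what lets the learned policy trade immediate tracking accuracy for long-term stability rather than tracking the reduced-order motion naively.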
Community shame: Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt – Robotics