Learning Spring Mass Locomotion: Guiding Policies with a Reduced-Order Model

October 21, 2020 · Entered Twilight · 🏛 IEEE Robotics and Automation Letters

🌅 TWILIGHT: Old Age
Predates the code-sharing era – a pioneer of its time

"Last commit was 5.0 years ago (โ‰ฅ5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, AslipTestTSMatching.py, AslipTestVaryVel.py, LICENSE, RL_controller_aslip.py, RL_controller_aslip_keyboard.py, RL_controller_aslip_nodelta.py, RL_controller_aslip_radio_only.py, RL_controller_aslip_single_speed.py, RL_controller_aslip_trajinput.py, RL_controller_aslip_trajinput_TS_log_switch.py, _RL_controller_aslip_nodelta.py, apex-logo.png, apex.py, cassie, cassie_top_white.png, deprecated, eval_perturb.py, hardware_logs, plotData.py, plots, post_TS_test_log.py, post_VaryVel_test_log.py, post_process_log.py, post_process_visualize.py, readme.md, renderpol.py, rl, setup.py, testTS_logs, testVaryVel_logs, test_reference_traj.py, tools, trained_models, vis_perturb.py

Authors: Kevin Green, Yesh Godse, Jeremy Dao, Ross L. Hatton, Alan Fern, Jonathan Hurst
arXiv ID: 2010.11234
Category: cs.RO (Robotics)
Citations: 59
Venue: IEEE Robotics and Automation Letters
Repository: https://github.com/osudrl/ASLIP-RL (⭐ 8)
Last checked: 1 month ago
Abstract
In this paper, we describe an approach to achieve dynamic legged locomotion on physical robots which combines existing methods for control with reinforcement learning. Specifically, our goal is a control hierarchy in which the highest-level behaviors are planned through reduced-order models, which describe the fundamental physics of legged locomotion, and lower-level controllers utilize a learned policy that can bridge the gap between the idealized, simple model and the complex, full-order robot. The high-level planner can use a model of the environment and be task-specific, while the low-level learned controller can execute a wide range of motions, so that it applies to many different tasks. In this letter we describe this learned dynamic walking controller and show that a range of walking motions from reduced-order models can be used as the command and primary training signal for learned policies. The resulting policies do not attempt to naively track the motion (as a traditional trajectory-tracking controller would) but instead balance immediate motion tracking with long-term stability. The resulting controller is demonstrated on a human-scale, unconstrained, untethered bipedal robot at speeds up to 1.2 m/s. This letter builds the foundation of a generic, dynamic, learned walking controller that can be applied to many different tasks.
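The hierarchy the abstract describes (a reduced-order spring-mass plan on top, a learned policy below that is conditioned on the plan rather than tracking it blindly) can be sketched in a few lines. The Python below is a minimal illustration under assumed names and constants, not the ASLIP-RL implementation: `slip_reference`, `policy_observation`, the Euler integrator, and all gains are hypothetical stand-ins.

```python
import numpy as np

def slip_reference(com_pos, com_vel, dt=0.0005, k=2000.0, m=32.0, l0=1.0):
    """One explicit-Euler step of a planar spring-mass (SLIP) stance model.

    com_pos is the center of mass relative to the stance foot; the leg acts
    as a radial spring of stiffness k and rest length l0. All constants are
    illustrative placeholders.
    """
    l = np.linalg.norm(com_pos)
    # Spring force along the leg plus gravity on the point mass.
    force = k * (l0 - l) * (com_pos / l) + np.array([0.0, -9.81 * m])
    com_vel = com_vel + (force / m) * dt
    com_pos = com_pos + com_vel * dt
    return com_pos, com_vel

def policy_observation(robot_state, reference_window):
    """Build the learned policy's input: proprioception concatenated with a
    short window of upcoming reduced-order reference states, so the policy
    is conditioned on the planned motion instead of on a fixed clock."""
    return np.concatenate([robot_state, np.asarray(reference_window).ravel()])

# Roll the toy model forward to produce a reference window for the policy.
pos, vel = np.array([0.0, 0.95]), np.array([0.8, 0.0])
window = []
for _ in range(4):
    pos, vel = slip_reference(pos, vel)
    window.append(np.concatenate([pos, vel]))
obs = policy_observation(np.zeros(40), window)  # 40-dim robot state is a placeholder
```

In a setup like this, tracking error against the reference would enter only through the training reward, which is what lets the learned policy trade immediate tracking accuracy for long-term stability rather than tracking the reduced-order motion naively.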
Community shame: Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt – Robotics