Improvisation through Physical Understanding: Using Novel Objects as Tools with Visual Foresight
April 11, 2019 ยท Declared Dead ยท ๐ Robotics: Science and Systems
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Annie Xie, Frederik Ebert, Sergey Levine, Chelsea Finn
arXiv ID
1904.05538
Category
cs.RO: Robotics
Cross-listed
cs.AI,
cs.LG
Citations
90
Venue
Robotics: Science and Systems
Last Checked
3 months ago
Abstract
Machine learning techniques have enabled robots to learn narrow, yet complex tasks and also perform broad, yet simple skills with a wide variety of objects. However, learning a model that can both perform complex tasks and generalize to previously unseen objects and goals remains a significant challenge. We study this challenge in the context of "improvisational" tool use: a robot is presented with novel objects and a user-specified goal (e.g., sweep some clutter into the dustpan), and must figure out, using only raw image observations, how to accomplish the goal using the available objects as tools. We approach this problem by training a model with both a visual and physical understanding of multi-object interactions, and develop a sampling-based optimizer that can leverage these interactions to accomplish tasks. We do so by combining diverse demonstration data with self-supervised interaction data, aiming to leverage the interaction data to build generalizable models and the demonstration data to guide the model-based RL planner to solve complex tasks. Our experiments show that our approach can solve a variety of complex tool use tasks from raw pixel inputs, outperforming both imitation learning and self-supervised learning individually. Furthermore, we show that the robot can perceive and use novel objects as tools, including objects that are not conventional tools, while also choosing dynamically to use or not use tools depending on whether or not they are required.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Robotics
๐
๐
Old Age
R.I.P.
๐ป
Ghosted
ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras
R.I.P.
๐ป
Ghosted
VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator
R.I.P.
๐ป
Ghosted
ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM
R.I.P.
๐ป
Ghosted
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
R.I.P.
๐ป
Ghosted
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
๐ป
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
๐ป
Ghosted