TY - JOUR
T1 - Self-Supervised Learning for Precise Pick-and-Place Without Object Model
AU - Berscheid, Lars
AU - Meissner, Pascal
AU - Kröger, Torsten
N1 - Acknowledgement: We would like to thank Tamim Asfour for his helpful suggestions and discussions.
PY - 2020/7/1
Y1 - 2020/7/1
N2 - Flexible pick-and-place is a fundamental yet challenging task in robotics, in particular because defining even a simple target pose usually requires an object model. In this work, the robot instead learns to pick and place objects using planar manipulation according to a single demonstrated goal state. Our primary contribution lies in combining robot learning of manipulation primitives, commonly estimated by fully convolutional neural networks, with one-shot imitation learning. To this end, we define the place reward as a contrastive loss between real-world measurements and a task-specific noise distribution. Furthermore, we design our system to learn in a self-supervised manner, enabling real-world experiments with up to 25,000 pick-and-place actions. The trained robot places known objects with an average placement error of 2.7 (0.2) mm and 2.6 (0.8)°. Because our approach does not require an object model, the robot generalizes to unknown objects while keeping a precision of 5.9 (1.1) mm and 4.1 (1.2)°. We further show a range of emerging behaviors: the robot naturally learns to select the correct object in the presence of multiple object types, precisely inserts objects within a peg game, picks screws out of dense clutter, and infers multiple pick-and-place actions from a single goal state.
KW - reinforcement learning
KW - deep learning in grasping and manipulation
KW - imitation learning
UR - http://www.scopus.com/inward/record.url?scp=85089472027&partnerID=8YFLogxK
U2 - 10.1109/LRA.2020.3003865
DO - 10.1109/LRA.2020.3003865
M3 - Letter
VL - 5
SP - 4828
EP - 4835
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
SN - 2377-3766
IS - 3
ER -