TY - JOUR
T1 - Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation
AU - Dai, Tianhong
AU - Arulkumaran, Kai
AU - Gerbert, Tamara
AU - Tukra, Samyakh
AU - Behbahani, Feryal
AU - Bharath, Anil Anthony
PY - 2022/7/7
Y1 - 2022/7/7
N2 - Deep reinforcement learning (DRL) has the potential to train robots to perform complex tasks in the real world without requiring accurate models of the robot or its environment. However, agents trained with these algorithms typically lack the explainability of more traditional control methods. In this work, we use a combination of out-of-distribution generalisation tests and post hoc interpretability methods in order to understand what strategies DRL-trained agents use to perform a reaching task. To do so, we train agents under different conditions, using comparison to better interpret both quantitative and qualitative results; this allows us to not only provide local explanations, but also broad categorisations of behaviour. A key aim of our work is to understand how agents trained with visual domain randomisation (DR)—a technique which allows agents to generalise from simulation-based training to the real world—differ from agents trained without. Our results show that the primary outcome of DR is more robust, entangled representations, accompanied by greater spatial structure in convolutional filters. Furthermore, even with an improved saliency method introduced in this work, we show that qualitative studies may not always correspond with quantitative measures, necessitating the combination of inspection tools in order to provide sufficient insights into the behaviour of trained agents. We conclude with recommendations for applying interpretability methods to DRL agents.
KW - Deep reinforcement learning
KW - Generalisation
KW - Interpretability
KW - Saliency
UR - http://dx.doi.org/10.1016/j.neucom.2022.04.005
DO - 10.1016/j.neucom.2022.04.005
M3 - Article
SN - 0925-2312
VL - 493
SP - 143
EP - 165
JO - Neurocomputing
JF - Neurocomputing
ER -