Achieving efficient interpretability of reinforcement learning via policy distillation and selective input gradient regularization. (April 2023)
- Record Type:
- Journal Article
- Title:
- Achieving efficient interpretability of reinforcement learning via policy distillation and selective input gradient regularization. (April 2023)
- Main Title:
- Achieving efficient interpretability of reinforcement learning via policy distillation and selective input gradient regularization
- Authors:
- Xing, Jinwei
Nagata, Takashi
Zou, Xinyun
Neftci, Emre
Krichmar, Jeffrey L.
- Abstract:
- Although deep Reinforcement Learning (RL) has proven successful in a wide range of tasks, one challenge it faces is interpretability when applied to real-world problems. Saliency maps are frequently used to provide interpretability for deep neural networks. However, in the RL domain, existing saliency map approaches are either computationally expensive and thus cannot satisfy the real-time requirement of real-world scenarios or cannot produce interpretable saliency maps for RL policies. In this work, we propose an approach of Distillation with selective Input Gradient Regularization (DIGR) which uses policy distillation and input gradient regularization to produce new policies that achieve both high interpretability and computation efficiency in generating saliency maps. Our approach is also found to improve the robustness of RL policies to multiple adversarial attacks. We conduct experiments on three tasks, MiniGrid (Fetch Object), Atari (Breakout) and CARLA Autonomous Driving, to demonstrate the importance and effectiveness of our approach.
- Is Part Of:
- Neural networks. Volume 161(2023)
- Journal:
- Neural networks
- Issue:
- Volume 161(2023)
- Issue Display:
- Volume 161, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 161
- Issue:
- 2023
- Issue Sort Value:
- 2023-0161-2023-0000
- Page Start:
- 228
- Page End:
- 241
- Publication Date:
- 2023-04
- Subjects:
- Efficient interpretability -- Interpretable reinforcement learning -- Saliency map
Neural computers -- Periodicals
Neural networks (Computer science) -- Periodicals
Neural networks (Neurobiology) -- Periodicals
Nervous System -- Periodicals
Ordinateurs neuronaux -- Périodiques
Réseaux neuronaux (Informatique) -- Périodiques
Réseaux neuronaux (Neurobiologie) -- Périodiques
Neural computers
Neural networks (Computer science)
Neural networks (Neurobiology)
Periodicals
006.32
- Journal URLs:
- http://www.sciencedirect.com/science/journal/08936080
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.neunet.2023.01.025
- Languages:
- English
- ISSNs:
- 0893-6080
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 6081.280800
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 26310.xml