Towards robust explanations for deep neural networks. (January 2022)

Record Type:: Journal Article
Title:: Towards robust explanations for deep neural networks. (January 2022)
Main Title:: Towards robust explanations for deep neural networks
Authors:: Dombrowski, Ann-Kathrin
Anders, Christopher J.
Müller, Klaus-Robert
Kessel, Pan
Abstract:: Highlights: We investigate how to enhance the resilience of explanations against manipulation. Explanations visualize the relevance of each input feature for the network's prediction. We develop a theoretical framework and derive bounds on the maximal change of an explanation. Based on these insights, we present three different techniques to increase robustness: training with weight decay, smoothing activation functions, and minimizing the Hessian of the network. Application of our methods shows significantly improved resilience of explanations. Abstract: Explanation methods shed light on the decision process of black-box classifiers such as deep neural networks. However, their usefulness can be compromised because they are susceptible to manipulation. With this work, we aim to enhance the resilience of explanations. We develop a unified theoretical framework for deriving bounds on the maximal manipulability of a model. Based on these theoretical insights, we present three different techniques to boost robustness against manipulation: training with weight decay, smoothing activation functions, and minimizing the Hessian of the network. Our experimental results confirm the effectiveness of these approaches.
Is Part Of:: Pattern recognition. Volume 121(2022)
Journal:: Pattern recognition
Issue:: Volume 121(2022)
Issue Display:: Volume 121, Issue 2022 (2022)
Year:: 2022
Volume:: 121
Issue:: 2022
Issue Sort Value:: 2022-0121-2022-0000
Page Start:
Page End:
Publication Date:: 2022-01
Subjects:: Explanation method -- Saliency map -- Adversarial attacks -- Manipulation -- Neural networks,
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
Journal URLs:: http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
DOI:: 10.1016/j.patcog.2021.108194
Languages:: English
ISSNs:: 0031-3203
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 25035.xml