Adversarial attacks on text classification models using layer‐wise relevance propagation. Issue 9 (20th July 2020)
- Record Type:
- Journal Article
- Title:
- Adversarial attacks on text classification models using layer‐wise relevance propagation. Issue 9 (20th July 2020)
- Main Title:
- Adversarial attacks on text classification models using layer‐wise relevance propagation
- Authors:
- Xu, Jincheng
Du, Qingfeng
- Abstract:
- Due to the nested nonlinear structure inside neural networks, most existing deep learning models are treated as black boxes, and they are highly vulnerable to adversarial attacks. On the one hand, adversarial examples shed light on the decision‐making process of these opaque models to interrogate the interpretability. On the other hand, interpretability can be used as a powerful tool to assist in the generation of adversarial examples by affording transparency on the relative contribution of each input feature to the final prediction. Recently, a post‐hoc explanatory method, layer‐wise relevance propagation (LRP), has shown significant value in instance‐wise explanations. In this paper, we attempt to optimize the recently proposed explanation‐based attack algorithms (EAAs) on text classification models with LRP. We empirically show that LRP provides good explanations and benefits existing EAAs notably. Apart from that, we propose an LRP‐based simple but effective EAA, LRPTricker. LRPTricker uses LRP to identify important words and subsequently performs typo‐based perturbations on these words to generate the adversarial texts. The extensive experiments show that LRPTricker is able to reduce the performance of text classification models significantly with infinitesimal perturbations as well as lead to high scalability.
- Is Part Of:
- International journal of intelligent systems. Volume 35:Issue 9(2020)
- Journal:
- International journal of intelligent systems
- Issue:
- Volume 35:Issue 9(2020)
- Issue Display:
- Volume 35, Issue 9 (2020)
- Year:
- 2020
- Volume:
- 35
- Issue:
- 9
- Issue Sort Value:
- 2020-0035-0009-0000
- Page Start:
- 1397
- Page End:
- 1415
- Publication Date:
- 2020-07-20
- Subjects:
- adversarial attacks -- layer‐wise relevance propagation -- text classification
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
006.3
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1098-111X
https://www.hindawi.com/journals/ijis
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/int.22260
- Languages:
- English
- ISSNs:
- 0884-8173
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4542.310500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 13670.xml