TextTricker: Loss-based and gradient-based adversarial attacks on text classification models. (June 2020)

Record Type:: Journal Article
Title:: TextTricker: Loss-based and gradient-based adversarial attacks on text classification models. (June 2020)
Main Title:: TextTricker: Loss-based and gradient-based adversarial attacks on text classification models
Authors:: Xu, Jincheng
Du, Qingfeng
Abstract:: Adversarial examples are generated by adding infinitesimal perturbations to legitimate inputs so that incorrect predictions can be induced in deep learning models. They have received increasing attention recently due to their significant value in evaluating and improving the robustness of neural networks. While adversarial attack algorithms have achieved notable advancements on the continuous data of images, they cannot be directly applied to discrete symbols such as text, where all the semantic and syntactic constraints of the language are expected to be satisfied. In this paper, we propose a white-box adversarial attack algorithm, TextTricker, which supports both targeted and non-targeted attacks on text classification models. Our algorithm can be implemented in either a loss-based way, where word perturbations are performed according to the change in loss, or a gradient-based way, where the expected gradients are computed in the continuous embedding space to restrict the perturbations towards a certain direction. We perform extensive experiments on two publicly available datasets and three state-of-the-art text classification models to evaluate our algorithm. The empirical results demonstrate that TextTricker performs notably better than baselines in attack success rate. Moreover, we discuss various aspects of TextTricker in detail to provide a deep investigation and offer suggestions for its practical use. Highlights: We propose a novel algorithm, …
Is Part Of:: Engineering applications of artificial intelligence. Volume 92(2020)
Journal:: Engineering applications of artificial intelligence
Issue:: Volume 92(2020)
Issue Display:: Volume 92, Issue 2020 (2020)
Year:: 2020
Volume:: 92
Issue:: 2020
Issue Sort Value:: 2020-0092-2020-0000
Page Start:
Page End:
Publication Date:: 2020-06
Subjects:: Adversarial attacks -- Text classification -- The loss-based implementation -- The gradient-based implementation
Engineering -- Data processing -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Ingénierie -- Informatique -- Périodiques
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
Artificial intelligence
Engineering -- Data processing
Expert systems (Computer science)
Periodicals
620.00285
Journal URLs:: http://www.sciencedirect.com/science/journal/09521976
http://www.elsevier.com/journals
DOI:: 10.1016/j.engappai.2020.103641
Languages:: English
ISSNs:: 0952-1976
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3755.704500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 13361.xml