Quantifying the preferential direction of the model gradient in adversarial training with projected gradient descent. (July 2023)
- Record Type:
- Journal Article
- Title:
- Quantifying the preferential direction of the model gradient in adversarial training with projected gradient descent. (July 2023)
- Main Title:
- Quantifying the preferential direction of the model gradient in adversarial training with projected gradient descent
- Authors:
- Bigolin Lanfredi, Ricardo
Schroeder, Joyce D.
Tasdizen, Tolga
- Abstract:
- Highlights: Input gradients of adversarially-trained robust models show a preferred direction. These gradients point more directly to the closest point of inaccurate classes. Generative adversarial networks estimate this direction and its alignment metric. Metric correlates to robustness for models trained with projected gradient descent. Enforcing the proposed alignment increases robustness of models.
Abstract: Adversarial training, especially projected gradient descent (PGD), has proven to be a successful approach for improving robustness against adversarial attacks. After adversarial training, gradients of models with respect to their inputs have a preferential direction. However, the direction of alignment is not mathematically well established, making it difficult to evaluate quantitatively. We propose a novel definition of this direction as the direction of the vector pointing toward the closest point of the support of the closest inaccurate class in decision space. To evaluate the alignment with this direction after adversarial training, we apply a metric that uses generative adversarial networks to produce the smallest residual needed to change the class present in the image. We show that PGD-trained models have a higher alignment than the baseline according to our definition, that our metric presents higher alignment values than a competing metric formulation, and that enforcing this alignment increases the robustness of models.
- Is Part Of:
- Pattern recognition. Volume 139(2023)
- Journal:
- Pattern recognition
- Issue:
- Volume 139(2023)
- Issue Display:
- Volume 139, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 139
- Issue:
- 2023
- Issue Sort Value:
- 2023-0139-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-07
- Subjects:
- Robustness -- Robust models -- Gradient direction -- Gradient alignment -- Deep learning -- PGD -- Adversarial training -- GAN
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2023.109430
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 26769.xml