SPAT: Semantic-Preserving Adversarial Transformation for Perceptually Similar Adversarial Examples

Abstract

Although machine learning models achieve high classification accuracy on benign examples, they are vulnerable to adversarial machine learning (AML) attacks, which generate adversarial examples by adding well-crafted perturbations to benign examples. The perturbations can be enlarged to raise the attack success rate; however, if they are added without considering the semantic or perceptual similarity between the benign and adversarial examples, the attack is easily perceived or detected. There is thus a trade-off between the attack success rate and perceptual similarity. In this paper, we propose a novel Semantic-Preserving Adversarial Transformation (SPAT) framework that achieves an advantageous trade-off between these two metrics. SPAT modifies the optimisation objective of an AML attack to jointly pursue two goals: increasing the attack success rate and maintaining the perceptual similarity between benign and adversarial examples. Our experiments on a variety of datasets, including CIFAR-10, GTSRB, and MNIST, demonstrate that SPAT-transformed AML attacks achieve better perceptual similarity while maintaining attack success rates comparable to those of conventional AML attacks.
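The joint objective described above can be illustrated with a minimal sketch. This is not the paper's method: it uses a toy logistic classifier, plain gradient ascent, and an L2 distance penalty as a stand-in for a perceptual-similarity term; the weight `lam`, the step size, and the step count are all illustrative assumptions.

```python
import numpy as np

def attack(x, w, b, y, lam, steps=100, lr=0.1):
    """Gradient ascent on: logistic_loss(x_adv) - lam * ||x_adv - x||^2.

    lam = 0 recovers an unconstrained attack; lam > 0 adds the
    (illustrative) similarity-preserving penalty that keeps x_adv near x.
    """
    x_adv = x.copy()
    for _ in range(steps):
        z = w @ x_adv + b
        p = 1.0 / (1.0 + np.exp(-z))         # predicted probability of class 1
        grad_loss = (p - y) * w              # gradient of logistic loss w.r.t. input
        grad_sim = 2.0 * lam * (x_adv - x)   # gradient of the L2 similarity penalty
        x_adv += lr * (grad_loss - grad_sim) # ascend the loss, stay close to x
    return x_adv

rng = np.random.default_rng(0)
w = rng.normal(size=5)          # toy model weights
b = 0.0
x = rng.normal(size=5)          # benign example
y = 1.0                         # true label

x_free = attack(x, w, b, y, lam=0.0)  # no similarity penalty
x_reg = attack(x, w, b, y, lam=1.0)   # with similarity penalty
```

With the penalty active, the perturbation reaches an equilibrium where the pull back toward `x` balances the attack gradient, so the adversarial example stays perceptibly closer to the benign input than in the unconstrained case.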

Publication
European Conference on Artificial Intelligence (ECAI)
