Deep neural networks have recently shown impressive classification performance on a diverse set of visual tasks. When deployed in real-world (noise-prone) environments, it is equally important that these classifiers satisfy robustness guarantees: small perturbations applied to the samples should not cause a significant loss in the performance of the predictor. The goal of this article is to discuss the robustness of deep networks to a diverse set of perturbations that may affect the samples in practice, including adversarial perturbations, random noise, and geometric transformations. This article further discusses recent works that build on the robustness analysis to provide geometric insights into the classifier's decision surface, which help in developing a better understanding of deep networks. Finally, we present recent solutions that attempt to increase the robustness of deep networks. We hope this review article will contribute to shedding light on the open research challenges in the robustness of deep networks and stir interest in the analysis of their fundamental properties.
Introduction
With the dramatic increase of digital data and the development of new computing architectures, deep learning has been developing rapidly as a predominant framework for data representation that can contribute to solving very diverse tasks. Despite this success, several fundamental properties of deep neural networks are still not understood and have been the subject of intense analysis in recent years. In particular, the robustness of deep networks to various forms of perturbations has received growing attention due to its importance when applied to visual data. This line of work was largely initiated by the illustration of the intriguing properties of deep networks in [1], which are shown to be particularly vulnerable to very small additive perturbations in the data, even though they achieve impressive performance on complex visual benchmarks [2]. An illustration of the vulnerability of deep networks to small additive perturbations can be seen in Figure 1. A dual phenomenon was observed in [3], where images unrecognizable to the human eye are classified with high confidence by deep neural networks.
\rho(f) = \mathbb{E}_{x \sim \mu}\big[\, \| r^*(x) \| \,\big]. \qquad (3)

It is important to note that in our robustness setting, the perturbed point T_r(x) need not belong to the support of the data distribution. Hence, while the focus of the risk in (1) is the accuracy on typical images (sampled from \mu), the focus of the robustness computed from (2) is instead on the distance to the "closest" image (potentially outside the support of \mu) that changes the label of the classifier. The risk and robustness hence capture two fundamentally different properties of the classifier, as illustrated in "Robustness and Risk: A Toy Example."

FIGURE 2. Here, B denotes the decision boundary of the classifier between classes 1 and 2, and T denotes the set of perturbed versions of x_1 (i.e., T = {T_r(x_1): r ∈ R}), where we recall that R denotes the set of admissible perturbations. The pointwise robustness at x_1 is defined as the smallest perturbation in R that causes x_1 to change class.
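The quantities in (2) and (3) can be estimated numerically for small problems. The snippet below is a minimal, self-contained sketch (not the procedure of any cited reference): it uses a toy linear classifier in place of a deep network, approximates the pointwise robustness for additive perturbations by scanning random directions (which yields an upper bound on the true minimal norm), and averages the resulting norms to estimate the global robustness as in (3). All function names and parameters are illustrative.

import numpy as np

rng = np.random.default_rng(0)

# Toy binary classifier f(x) = sign(w.x + b), standing in for a deep network.
w = rng.normal(size=20)
b = 0.1

def f(x):
    return np.sign(x @ w + b)

def pointwise_robustness(x, n_dirs=500, max_norm=10.0, steps=200):
    """Upper-bound estimate of ||r*(x)||: the smallest norm of an additive
    perturbation that changes the decision of f, found by scanning random
    directions at increasing radii."""
    label = f(x)
    radii = np.linspace(0.0, max_norm, steps + 1)[1:]
    best = np.inf
    for _ in range(n_dirs):
        d = rng.normal(size=x.shape)
        d /= np.linalg.norm(d)
        for rho in radii:
            if f(x + rho * d) != label:
                best = min(best, rho)
                break
    return best

def robustness(samples):
    """Monte Carlo estimate of rho(f) = E_{x~mu}[ ||r*(x)|| ], as in (3)."""
    return np.mean([pointwise_robustness(x) for x in samples])

samples = rng.normal(size=(10, 20))
print("estimated rho(f):", robustness(samples))
# For this linear toy model the exact minimal l2 perturbation at x is
# |w.x + b| / ||w||, which shows how loose the random-direction bound is.
print("exact rho(f):    ", np.mean(np.abs(samples @ w + b)) / np.linalg.norm(w))

For a deep network, the inner search would of course be replaced by a gradient-based attack rather than random directions; the structure of the estimate, however, stays the same.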
FIGURE S1. (a) The images belonging to class 1 (vertical stripe and positive bias) and (b) the images belonging to class 2 (horizontal stripe and negative bias).

FIGURE S2. (a) An example image of class 1. White pixels have value 1 + a, and black pixels have value a. (b) An example image of class -1. White pixels have value 1 - a, and black pixels have value -a. The bias a is set to be very small, in such a way that it is imperceptible.
FIGURE 3. Universal perturbations computed for different deep neural network architectures. The pixel values are scaled for visibility. (a) CaffeNet,
(b) VGG-F, (c) VGG-16, (d) VGG-19, (e) GoogLeNet, and (f) ResNet-152.
FIGURE 4. Examples of natural images perturbed with the universal perturbation and their corresponding estimated labels with GoogLeNet. (a)–(h) Images belonging to the ILSVRC 2012 validation set. (i)–(l) Personal images captured by a mobile phone camera. Estimated labels: (a) Wool, (b) Indian Elephant, (c) Indian Elephant, (d) African Gray, (e) Tabby, (f) African Gray, (g) Common Newt, (h) Carousel, (i) Gray Fox, (j) Macaw, (k) Three-Toed Sloth, (l) Macaw. (Figure used courtesy of [22].)
FIGURE 5. (a) The original image. The remaining images are minimally perturbed images (along with the corresponding estimated labels) that are misclassified by the CaffeNet deep neural network. (b) Adversarial perturbation, (c) random noise, (d) semirandom noise with m = 1,000, (e) universal perturbation, (f) affine transformation. (Figure used courtesy of [17].)
FIGURE 7. The two-dimensional normal cross sections of the decision boundaries for three different classifiers near randomly chosen samples. The section is
spanned by the adversarial perturbation of the data point x (vertical axis) and a random vector in the tangent space to the decision boundary (horizontal axis). The
green region is the classification region of x. The decision boundaries with different classes are illustrated in different colors. Note the difference in range between
the x and y axes. (a) VGG-F (ImageNet), (b) LeNet (CIFAR), (c) LeNet (MNIST). (Figure used with permission from [18].)
Universal perturbations
The vulnerability of deep neural networks to universal (image-agnostic) perturbations studied in [22] sheds light on another aspect of the decision boundary: the correlations between different regions of the decision boundary, in the vicinity of different natural images. In fact, if the orientations of the decision boundary in the neighborhood of different data points were uncorrelated, the best universal perturbation would correspond to a random perturbation. This is refuted in [22], as the norm of the random perturbation required to fool 90% of the images is ten times larger than the norm of universal perturbations. Such correlations in the decision boundary are quantified in [22], as it is shown empirically that normal vectors to the decision boundary in the vicinity of different data points (or, equivalently, adversarial perturbations, due to the orthogonality property in "Geometric Properties of Adversarial Perturbations") approximately span a low-dimensional subspace.

FIGURE 8. The contours of two highly nonlinear functions (a) and (b) with flat boundaries. Specifically, the contours in the green and yellow regions represent the different (positive and negative) level sets of g(x) [where g(x) = g1(x) - g2(x), the difference between the class 1 and class 2 scores]. The decision boundary is defined as the region of the space where g(x) = 0 and is indicated with a solid black line. Note that, although g is a highly nonlinear function in these examples, the decision boundaries are flat.
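The low-dimensionality of this subspace can be probed with a simple numerical experiment. The sketch below is illustrative only and uses a random two-layer network in place of a trained deep model: it computes the gradient of the class-score difference g(x) at many points (the normal to the level set through x, which approximates the adversarial direction near the boundary), stacks the normalized normals, and inspects the decay of their singular values. A fast decay means the normals concentrate in a low-dimensional subspace, in the spirit of the empirical finding of [22]; all names and dimensions here are assumptions for the sake of the example.

import numpy as np

rng = np.random.default_rng(0)
d, h, n = 100, 50, 200          # input dimension, hidden width, number of samples

# Toy two-class score difference g(x) = g1(x) - g2(x), standing in for a trained model.
W1 = rng.normal(size=(h, d)) / np.sqrt(d)
w2 = rng.normal(size=h) / np.sqrt(h)

def g(x):
    return np.tanh(W1 @ x) @ w2

def boundary_normal(x):
    """Gradient of g at x, i.e., the normal to the level set of g through x."""
    return W1.T @ ((1.0 - np.tanh(W1 @ x) ** 2) * w2)

X = rng.normal(size=(n, d))
normals = np.stack([boundary_normal(x) for x in X])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)

# Singular-value decay of the matrix of unit normals: the faster the decay,
# the more the normals (and hence adversarial perturbations) concentrate
# in a low-dimensional subspace.
s = np.linalg.svd(normals, compute_uv=False)
print("fraction of energy in top 10 directions:",
      np.sum(s[:10] ** 2) / np.sum(s ** 2))

With a real network, the same diagnostic would be run on adversarial perturbations (or boundary normals) computed at natural images rather than at random points.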
FIGURE 9. Cross sections of the decision boundary in the vicinity of data point x. (a), (b), and (c) show decision boundaries with high curvature, while
(d) shows the decision boundary along a random normal section (with very small curvature). The correct class and the neighboring classes are colored
in green and orange, respectively. The boundaries between different classes are shown in solid black lines. The x and y axes have the same scale.
FIGURE 10. The link between robustness and curvature of the decision boundary. When the decision boundary is (a) positively curved, small universal perturbations are more likely to fool the classifier. (b) and (c) illustrate the cases of a flat and a negatively curved decision boundary, respectively.