
Deep Learning for Visual Understanding

Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, and Pascal Frossard

The Robustness of Deep Networks


A geometrical perspective

Deep neural networks have recently shown impressive classification performance on a diverse set of visual tasks. When deployed in real-world (noise-prone) environments, it is equally important that these classifiers satisfy robustness guarantees: small perturbations applied to the samples should not yield significant loss to the performance of the predictor. The goal of this article is to discuss the robustness of deep networks to a diverse set of perturbations that may affect the samples in practice, including adversarial perturbations, random noise, and geometric transformations. This article further discusses the recent works that build on the robustness analysis to provide geometric insights on the classifier's decision surface, which help in developing a better understanding of deep networks. Finally, we present recent solutions that attempt to increase the robustness of deep networks. We hope this review article will contribute to shedding light on the open research challenges in the robustness of deep networks and stir interest in the analysis of their fundamental properties.

Introduction
With the dramatic increase of digital data and the development of new computing architectures, deep learning has been developing rapidly as a predominant framework for data representation that can contribute in solving very diverse tasks. Despite this success, several fundamental properties of deep neural networks are still not understood and have been the subject of intense analysis in recent years. In particular, the robustness of deep networks to various forms of perturbations has received growing attention due to its importance when applied to visual data. That path of work has been mostly initiated by the illustration of the intriguing properties of deep networks in [1], which are shown to be particularly vulnerable to very small additive perturbations in the data, even if they achieve impressive performance on complex visual benchmarks [2]. An illustration of the vulnerability of deep networks to small additive perturbations can be seen in Figure 1. A dual phenomenon was observed in [3], where unrecognizable images to the human eye are classified with high confidence by deep neural networks.


The transfer of these deep networks to critical applications that possibly consist in classifying high-stake information is seriously challenged by the low robustness of deep networks. For example, in the context of self-driving vehicles, it is fundamental to accurately recognize cars, traffic signs, and pedestrians, when these are affected by clutter, occlusions, or even adversarial attacks. In medical imaging [4], it is also important to achieve high classification rates on potentially perturbed test data. The analysis of state-of-the-art deep classifiers' robustness to perturbation at test time is therefore an important step for validating the models' reliability to unexpected (possibly adversarial) nuisances that might occur when deployed in uncontrolled environments. In addition, a better understanding of the capabilities of deep networks in coping with data perturbation actually allows us to develop important insights that can contribute to developing yet better systems.

The fundamental challenges raised by the robustness of deep networks to perturbations have led to a large number of important works in recent years. These works study empirically and theoretically the robustness of deep networks to different types of perturbations, such as adversarial perturbations, additive random noise, structured transformations, or even universal perturbations. The robustness is usually measured as the sensitivity of the discrete classification function (i.e., the function that assigns a label to each image) to such perturbations. While robustness analysis is not a new problem, we provide an overview of the recent works that propose to assess the vulnerability of deep network architectures. In addition to quantifying the robustness of deep networks to various forms of perturbations, the analysis of robustness has further contributed to developing important insights on the geometry of the complex decision boundary of such classifiers, which remains hardly understood due to the very high dimensionality of the problems that they address. In fact, the robustness properties of a classifier are strongly tied to the geometry of the decision boundaries. For example, the high instability of deep neural networks to adversarial perturbations shows that data points reside extremely close to the classifier's decision boundary. The study of robustness is, therefore, not only interesting from the practical perspective of the system's reliability but has a more fundamental component that allows "understanding" of the geometric properties of classification regions and derives insights toward the improvement of current architectures.

This overview article has multiple goals. First, it provides an accessible review of the recent works in the analysis of the robustness of deep neural network classifiers to different forms of perturbations, with a particular emphasis on image analysis and visual understanding applications. Second, it presents connections between the robustness of deep networks and the geometry of the decision boundaries of such classifiers. Third, the article discusses ways to improve the robustness in deep network architectures and finally highlights some of the important open problems.

FIGURE 1. An example of an adversarial perturbation in state-of-the-art neural networks. (a) The original image that is classified as a "whale," (b) the perturbed image classified as a "turtle," and (c) the corresponding adversarial perturbation that has been added to the original image to fool a state-of-the-art image classifier [5].

Robustness of classifiers
In most classification settings, the proportion of misclassified samples in the test set is the main performance metric used to evaluate classifiers. The empirical test error provides an estimate of the classifier's risk, defined as the probability of misclassification, when considering samples from the data distribution. Formally, let us define μ to be a distribution defined over images. The risk of a classifier f is equal to

R(f) = \mathbb{P}_{x \sim \mu}\big( f(x) \neq y(x) \big),    (1)

where x and y(x) correspond, respectively, to the image and its associated label. While the risk captures the error of f on the data distribution μ, it does not capture the robustness to small arbitrary perturbations of data points. In visual classification tasks, it is desirable to learn classifiers that achieve robustness to small perturbations of the input; i.e., the application of a small perturbation to images (e.g., additive perturbations on the pixel values or geometric transformation of the image) should not alter the estimated label of the classifier.

Before going into more detail about robustness, we first define some notations. Let X denote the ambient space where images live. We denote by R the set of admissible perturbations. For example, when considering geometric perturbations, R is set to be the group of geometric (e.g., affine) transformations under study. Alternatively, if we are to measure the robustness to arbitrary additive perturbations, we set R = X. For r ∈ R, we define T_r: X → X to be the perturbation operator by r; i.e., for a data point x ∈ X, T_r(x) denotes the image x perturbed by r. Armed with these notations, we define the minimal perturbation changing the label of the classifier, at x, as follows:

r^*(x) = \operatorname*{argmin}_{r \in \mathcal{R}} \|r\|_{\mathcal{R}} \quad \text{subject to} \quad f(T_r(x)) \neq f(x),    (2)

where ||·||_R is a metric on R. For notation simplicity, we omit the dependence of r*(x) on f, R, d, and the operator T. Moreover, when the image x is clear from the context, we will use r* to refer to r*(x). See Figure 2 for an illustration. The pointwise robustness of f at x is then measured by ||r*(x)||_R. Note that larger values of ||r*(x)||_R indicate a higher robustness at x.
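To make the pointwise robustness in (2) concrete, the following sketch estimates it numerically in the restricted case where the admissible perturbations are scalar multiples of a single fixed direction (the setting that reappears in the random-noise regime later in the article). The predict function, the search range, and the tolerance are placeholders and are not part of the original article; the bisection also assumes a single boundary crossing along the ray, so in general it returns one crossing point, an upper bound on the true minimum.

```python
import numpy as np

def robustness_along_direction(predict, x, v, t_max=1e3, tol=1e-3):
    """Estimate the smallest magnitude t such that predict(x + t * v) differs from
    predict(x), by bisection over t. `predict` maps an image array to an integer
    label. Returns np.inf if no label change is found up to t_max."""
    v = v / np.linalg.norm(v)            # unit-norm direction, so t is an l2 norm
    label = predict(x)
    if predict(x + t_max * v) == label:
        return np.inf                     # this direction never changes the label
    lo, hi = 0.0, t_max                   # label unchanged at lo, changed at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predict(x + mid * v) == label:
            lo = mid
        else:
            hi = mid
    return hi                             # approximate ||r*(x)||_2 restricted to span{v}

# Example with a (hypothetical) linear classifier on flattened images:
# predict = lambda z: int(w.ravel() @ z.ravel() + b > 0)
# rho_x = robustness_along_direction(predict, x, np.random.randn(*x.shape))
```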


While this definition of robustness considers the smallest perturbation r*(x) (with respect to the metric ||·||_R) that causes the classifier f to change the label at x, other works have instead adopted slightly different definitions, where a "sufficiently small" perturbation is sought (instead of the minimal one) [7]–[9]. To measure the global robustness of a classifier f, one can compute the expectation of ||r*(x)||_R over the data distribution [1], [10]. That is, the global robustness ρ(f) is defined as follows:

\rho(f) = \mathbb{E}_{x \sim \mu}\big( \|r^*(x)\|_{\mathcal{R}} \big).    (3)

It is important to note that in our robustness setting, the perturbed point T_r(x) need not belong to the support of the data distribution. Hence, while the focus of the risk in (1) is the accuracy on typical images (sampled from μ), the focus of the robustness computed from (2) is instead on the distance to the "closest" image (potentially outside the support of μ) that changes the label of the classifier. The risk and robustness hence capture two fundamentally different properties of the classifier, as illustrated in "Robustness and Risk: A Toy Example."

FIGURE 2. Here, B denotes the decision boundary of the classifier between classes 1 and 2, and T denotes the set of perturbed versions of x_1 (i.e., T = {T_r(x_1): r ∈ R}), where we recall that R denotes the set of admissible perturbations. The pointwise robustness at x_1 is defined as the smallest perturbation in R that causes x_1 to change class.

Robustness and Risk: A Toy Example

To illustrate the general concepts of robustness and risk of classifiers, we consider the simple binary classification task illustrated in Figure S1, where the goal is to discriminate between images representing vertical and horizontal stripes. In addition to the orientation of the stripe that separates the two classes, a very small positive bias is added to the pixels of first-class images and subtracted from the pixels of the images in the second class. This bias is chosen to be very small, in such a way that it is imperceptible to humans; see Figure S2 for example images of classes 1 and 2 with the pixel values, where a denotes the bias.

It is easy to see that a linear classifier can perfectly separate the two classes, thus achieving zero risk (i.e., R(f) = 0). Note, however, that such a classifier only achieves zero risk because it captures the bias but fails to distinguish between the images based on the orientation of the stripe. Hence, despite being zero risk, this classifier is highly unstable to additive perturbation, as it suffices to perturb the bias of the image (i.e., by adding a very small value to all pixels) to cause misclassification. On the other hand, a more complex classifier that captures the orientation of the stripe will be robust to small perturbations (while equally achieving zero risk), as changing the label would require changing the direction of the stripe, which is the most visual (and natural) concept that separates the two classes.

FIGURE S1. (a) The images belonging to class 1 (vertical stripe and positive bias) and (b) the images belonging to class 2 (horizontal stripe and negative bias).

FIGURE S2. (a) An example image of class 1. White pixels have value 1 + a, and black pixels have value a. (b) An example image of class -1. White pixels have value 1 - a, and black pixels have value -a. The bias a is set to be very small, in such a way that it is imperceptible.
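The argument in the box is easy to reproduce numerically. The sketch below is a hypothetical setup (image size, bias value, and the two decision rules are arbitrary illustrative choices, not taken from the article): it generates the two classes of Figure S1 and compares a rule that thresholds the total pixel sum, which exploits only the tiny bias, with a rule based on the stripe orientation. Both attain zero risk on clean samples, but a uniform shift of 2a flips the bias-based rule while leaving the orientation-based one unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n, a = 10, 0.01                                  # image size and (tiny) bias

def sample(cls):
    """Class +1: vertical stripe with bias +a. Class -1: horizontal stripe with bias -a."""
    img = np.zeros((n, n))
    k = rng.integers(n)                          # position of the stripe
    if cls == +1:
        img[:, k] = 1.0
    else:
        img[k, :] = 1.0
    return img + cls * a

# The stripe always contributes a sum of n, so thresholding the total sum at n
# separates the classes purely through the bias term (a "proxy concept").
bias_rule = lambda img: 1 if img.sum() > n else -1
# An orientation-based rule: vertical stripes vary within rows, horizontal within columns.
orient_rule = lambda img: 1 if img.std(axis=0).max() < img.std(axis=1).max() else -1

x = sample(+1)
print(bias_rule(x), orient_rule(x))              # 1 1  -> both rules are zero risk here

x_pert = x - 2 * a                               # imperceptible uniform shift of all pixels
print(bias_rule(x_pert), orient_rule(x_pert))    # -1 1 -> only the bias rule is fooled
```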


Observe that classification robustness is strongly related to support vector machine (SVM) classifiers, whose goal is to maximize the robustness, defined as the margin between support vectors. Importantly, the max-margin classifier in a given family of classifiers might, however, still not achieve robustness (in the sense of high ρ(f)). An illustration is provided in "Robustness and Risk: A Toy Example," where no zero-risk linear classifier—in particular, the max-margin classifier—achieves robustness to perturbations. Our focus in this article is turned toward assessing the robustness of the family of deep neural network classifiers that are used in many visual recognition tasks.

Perturbation forms

Robustness to additive perturbations
We first start by considering the case where the perturbation operator is simply additive; i.e., T_r(x) = x + r. In this case, the magnitude of the perturbation can be measured with the ℓ_p norm of the minimal perturbation that is necessary to change the label of a classifier. According to (2), the robustness to additive perturbations of a data point x is defined as

\min_{r \in \mathcal{R}} \|r\|_p \quad \text{subject to} \quad f(x + r) \neq f(x).    (4)

Depending on the conditions that one sets on the set R that supports the perturbations, the additive model leads to different forms of robustness.

Adversarial perturbations
We first consider the case where the additive perturbations are unconstrained (i.e., R = X). The perturbation obtained by solving (4) is often referred to as an adversarial perturbation, as it corresponds to the perturbation that an adversary (having full knowledge of the model) would apply to change the label of the classifier, while causing minimal changes to the original image.

The optimization problem in (4) is nonconvex, as the constraint involves the (potentially highly complex) classification function f. Different techniques exist to approximate adversarial perturbations. In the following, we briefly mention some of the existing algorithms for computing adversarial perturbations:
■ Regularized variant [1]: The method in [1] computes adversarial perturbations by solving a regularized variant of the problem in (4), given by

\min_{r} \; c\|r\|_p + J(x + r, \tilde{y}, \theta),    (5)

where ỹ is a target label of the perturbed sample, J is a loss function, c is a regularization parameter, and θ denotes the model parameters. In the original formulation [1], an additional constraint is added to guarantee x + r ∈ [0, 1], which is omitted in (5) for simplicity. To solve the optimization problem in (5), a line search is performed over c to find the maximum c > 0 for which the minimizer of (5) satisfies f(x + r) = ỹ. While leading to very accurate estimates, this approach can be costly to compute on high-dimensional and large-scale data sets. Moreover, it computes targeted adversarial perturbations, where the target label is known.
■ Fast gradient sign (FGS) [11]: This solution estimates an untargeted adversarial perturbation by going in the direction of the sign of the gradient of the loss function:

\epsilon \, \operatorname{sign}\big( \nabla_x J(x, y(x), \theta) \big),

where J, the loss function, is used to train the neural network and θ denotes the model parameters. While efficient, this one-step algorithm provides a coarse approximation to the solution of the optimization problem in (4) for p = ∞. (A minimal implementation sketch is given right after this list.)
■ DeepFool [5]: This algorithm minimizes (4) through an iterative procedure, where each iteration involves the linearization of the constraint. The linearized (constrained) problem is solved in closed form at each iteration, and the current estimate is updated; the optimization procedure terminates when the current estimate of the perturbation fools the classifier. In practice, DeepFool provides a tradeoff between the accuracy and efficiency of the two previous approaches [5].
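The sketch below is a minimal version of the fast gradient sign method for a generic PyTorch classifier; the model, labels, and step size eps are placeholders, and clipping to the valid pixel range is left to the caller.

```python
import torch
import torch.nn.functional as F

def fgs_perturbation(model, x, y, eps):
    """Untargeted fast gradient sign perturbation, eps * sign(grad_x J(x, y, theta)).
    `model` is a torch.nn.Module returning class scores, `x` a batch of images,
    `y` the corresponding labels, and `eps` the perturbation magnitude."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)      # J(x, y, theta), the training loss
    loss.backward()
    return eps * x.grad.sign()               # one-step, coarse solution of (4) for p = inf

# usage with a hypothetical model and batch:
# r = fgs_perturbation(model, images, labels, eps=8 / 255)
# adv_images = (images + r).clamp(0, 1)      # optionally keep pixels in [0, 1]
```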
In addition to the aforementioned optimization methods, several other approaches have recently been proposed to compute adversarial perturbations; see, e.g., [9], [12], and [13]. Different from the previously mentioned gradient-based techniques, the recent work in [14] learns a network (the adversarial transformation network) to efficiently generate a set of perturbations with a large diversity, without requiring the computation of the gradients.

Using the aforementioned optimization techniques, one can compute the robustness of classifiers to additive adversarial perturbations. Quite surprisingly, deep networks are extremely vulnerable to such additive perturbations; i.e., small and even imperceptible adversarial perturbations can be computed to fool them with high probability. For example, the average perturbations required to fool the CaffeNet [15] and GoogleNet [16] architectures on the ILSVRC 2012 task [17] are 100 times smaller than the typical norm of natural images [5] when using the ℓ2 norm. The high instability of deep neural networks to adversarial perturbations, which was first highlighted in [1], shows that these networks rely heavily on proxy concepts to classify objects, as opposed to strong visual concepts typically used by humans to distinguish between objects.

To illustrate this idea, we consider once again the toy classification example (see "Robustness and Risk: A Toy Example"), where the goal is to classify images based on the orientation of the stripe. In this example, linear classifiers could achieve a perfect recognition rate by exploiting the imperceptibly small bias that separates the two classes. While this proxy concept achieves zero risk, it is not robust to perturbations: one could design an additive perturbation that is as simple as a minor variation of the bias, which is sufficient to induce data misclassification. On the same line of thought, the high instability of classifiers to additive perturbations observed in [1] suggests that deep neural networks potentially capture one of the proxy concepts that separate the different classes.


Through a quantitative analysis of polynomial classifiers, [10] suggests that higher-degree classifiers tend to be more robust to perturbations, as they capture the "stronger" (and more visual) concept that separates the classes (e.g., the orientation of the stripe in Figure S1 in "Robustness and Risk: A Toy Example"). For neural networks, however, the relation between the flexibility of the architecture (e.g., depth and breadth) and adversarial robustness is not well understood and remains an open problem.

Random noise
In the random noise regime, data points are perturbed by noise having a random direction in the input space. Unlike the adversarial case, the computation of random noise does not require knowledge of the classifier; it is therefore crucial for state-of-the-art classifiers to be robust to this noise regime. We measure the pointwise robustness to random noise by setting R to be a direction sampled uniformly at random from the ℓ2 unit sphere S^{d-1} in X (where d denotes the dimension of X). Therefore, (4) becomes

r_v^*(x) = \operatorname*{argmin}_{r \in \{\alpha v:\, \alpha \in \mathbb{R}\}} \|r\|_2 \quad \text{subject to} \quad f(x + r) \neq f(x),    (6)

where v is a direction sampled uniformly at random from the unit sphere S^{d-1}. The pointwise robustness is then defined as the ℓ2 norm of the perturbation, i.e., ||r_v*(x)||_2.

The robustness of classifiers to random noise has previously been studied empirically in [1] and theoretically in [10] and [18]. Empirical investigation suggests that state-of-the-art classifiers are much more robust to random noise than to adversarial perturbations; i.e., the norm of the noise ||r_v*(x)||_2 required to change the label of the classifier can be several orders of magnitude larger than that of the adversarial perturbation. This result is confirmed theoretically, as linear classifiers in [10] and nonlinear classifiers in [18] are shown to have a robustness to random noise that behaves as

\|r_v^*(x)\|_2 = \Theta\big( \sqrt{d}\, \|r_{\mathrm{adv}}^*(x)\|_2 \big)

with high probability, where ||r*_adv(x)||_2 denotes the robustness to adversarial perturbations [(4) with R = X]. In other words, this result shows that, in high-dimensional classification settings (i.e., large d), classifiers can be robust to random noise, even if the pointwise adversarial robustness of the classifier is very small.

Semirandom noise
Finally, the semirandom noise regime generalizes this additive noise model to random subspaces S of dimension m ≤ d. Specifically, in this perturbation regime, an adversarial perturbation is sought within a random subspace S of dimension m. That is, the semirandom noise is defined as follows:

r_S^*(x) = \operatorname*{argmin}_{r \in S} \|r\|_2 \quad \text{subject to} \quad f(x + r) \neq f(x).    (7)

Note that, when m = 1, this semirandom noise regime precisely coincides with the random noise regime, whereas m = d corresponds to the adversarial perturbation regime defined previously. For this generalized noise regime, a precise relation between the robustness to semirandom and adversarial perturbations exists [18], as it is shown that

\|r_S^*(x)\|_2 = \Theta\Big( \sqrt{d/m}\; \|r_{\mathrm{adv}}^*(x)\|_2 \Big).

This result shows in particular that, even when the dimension m is chosen as a small fraction of d, it is still possible to find small perturbations that cause data misclassification. In other words, classifiers are not robust to semirandom noise that is only mildly adversarial and overwhelmingly random [18]. This implies that deep networks can be fooled by very diverse small perturbations, as these can be found along random subspaces of dimension m ≪ d.
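A simple way to probe the semirandom regime numerically (a heuristic sketch, not the exact procedure analyzed in [18]) is to draw a random m-dimensional subspace S, project a gradient-based adversarial direction onto it, and then reuse the line-search idea sketched earlier to find the smallest label-changing step within S. The helper below only builds the projected search direction; all names are illustrative.

```python
import numpy as np

def semirandom_direction(g, m, seed=0):
    """Project a direction g (e.g., a gradient-based adversarial direction) onto a
    random m-dimensional subspace of R^d and return the unit-norm projection."""
    rng = np.random.default_rng(seed)
    d = g.size
    basis, _ = np.linalg.qr(rng.standard_normal((d, m)))   # d x m orthonormal basis of S
    proj = basis @ (basis.T @ g.ravel())                   # projection of g onto span(S)
    return (proj / np.linalg.norm(proj)).reshape(g.shape)

# Searching along semirandom_direction(g, m) with m << d typically yields perturbations
# whose norm grows only like sqrt(d / m) relative to the fully adversarial one.
```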



Robustness to structured transformations
In visual tasks, it is not only crucial to have classifiers that are robust against additive perturbations as described previously. It is also equally important to achieve invariance to structured nuisance variables such as illumination changes, occlusions, or standard local geometric transformations of the image. Specifically, when images undergo such structured deformations, it is desirable that the estimated label remains the same.

One of the main strengths of deep neural network classifiers with respect to traditional shallow classifiers is that the former achieve higher levels of invariance [19] to transformations. To verify this claim, several empirical works have been introduced. In [6], a formal method is proposed that leverages the generalized robustness definition of (2) to measure the robustness of classifiers to arbitrary transformation groups. The robustness to structured transformations is precisely measured by setting the admissible perturbation space R to be the set of transformations (e.g., translations, rotations, dilation) and the perturbation operator T of (2) to be the warping operator transforming the coordinates of the image. In addition, ||·||_R is set to measure the change in appearance between the original and transformed images. Specifically, ||·||_R is defined to be the length of the shortest path on the nonlinear manifold of transformed images T = {T_r(x): r ∈ R}. Using this approach, it is possible to quantify the amount of change that the image should undergo to cause the classifier to make the wrong decision.

Despite improving the invariance over shallow networks, the method in [6] shows that deep classifiers are still not robust to sufficiently small deformations on simple visual classification tasks. In [20], the authors assess the robustness of face recognition deep networks to physically realizable structured perturbations. In particular, wearing eyeglass frames is shown to cause state-of-the-art face-recognition algorithms to misclassify. In [7], the robustness to other forms of complex perturbations is tested, and state-of-the-art deep networks are shown once again to be unstable to these perturbations. An empirical analysis of the ability of current convolutional neural networks (CNNs) to manage location and scale variability is proposed in [21]. It is shown, in particular, that CNNs are not very effective in factoring out location and scale variability, despite the popular belief that the convolutional architecture and the local spatial pooling provide invariance to such variability. The aforementioned works show that, just as state-of-the-art deep neural networks have been observed to be unstable to additive unstructured perturbations, such modern classifiers are not robust to perturbations even when severely restricting the set of possible transformations of the image.
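As an elementary instance of measuring robustness to a structured transformation group, the sketch below scans rotation angles and reports the smallest rotation that changes a classifier's decision, with scipy.ndimage.rotate playing the role of the warping operator T_r. The predict function, the angle range, and the scan step are placeholders, and the angle itself is used as a (much cruder) substitute for the manifold-based appearance metric of [6].

```python
import numpy as np
from scipy.ndimage import rotate

def min_fooling_rotation(predict, img, max_deg=45.0, step=0.5):
    """Smallest rotation angle (scanned in `step`-degree increments, both directions)
    that changes the label predicted for `img`; None if no angle up to max_deg does."""
    label = predict(img)
    for deg in np.arange(step, max_deg + step, step):
        for angle in (deg, -deg):
            warped = rotate(img, angle, reshape=False, mode="nearest")
            if predict(warped) != label:
                return angle
    return None
```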
most images sampled from the natural images distribu-
Universal additive perturbations tion n. Specifically, by adding this single (image-agnostic)
All of the previous definitions capture different forms of perturbation to a natural image, the label estimated by the
robustness, but they all rely on the computation of data-spe- deep neural network will be changed with high probability.
cific perturbations. Specifically, they consider the necessary In [22], an algorithm is provided to compute such univer-
change that should be applied to specific samples to change the sal perturbations; these perturbations are further shown to
decision of the classifier. More generally, one might be inter- be quasi-imperceptible while fooling state-of-the-art deep
ested to understand if classifiers are also vulnerable to generic networks on unseen natural images with probability edg-
(data and network agnostic) perturbations. The analysis of the ing 80%. Specifically, the , p norm of these perturbations
robustness to such perturbations is interesting from several is at least one order of magnitude smaller than the norm of
perspectives: 1) these perturbations might not require the pre- natural images but causes most perturbed images to be mis-
cise knowledge of the classifier under test, 2) they might cap- classified. Figure 3 illustrates examples of scaled ­universal

(a) (b) (c)

(d) (e) (f)

FIGURE 3. Universal perturbations computed for different deep neural network architectures. The pixel values are scaled for visibility. (a) CaffeNet,
(b) VGG-F, (c) VGG-16, (d) VGG-19, (e) GoogLeNet, and (f) ResNet-152.

IEEE Signal Processing Magazine | November 2017 | 55
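The following sketch conveys the flavor of the iterative procedure of [22] for building a universal perturbation: loop over a set of images, and whenever the current v fails to fool an image, add an image-specific perturbation (computed by any of the attacks discussed earlier, abstracted here as an attack callable) and project v back onto a small ℓ∞ ball. The function names, the radius xi, and the stopping rule are placeholders; the original algorithm includes refinements (e.g., a fooling-rate criterion on held-out data) that are omitted here.

```python
import numpy as np

def universal_perturbation(images, predict, attack, xi=0.04, n_epochs=5):
    """Accumulate a single image-agnostic perturbation v in the spirit of [22].
    `predict(x)` returns a label, `attack(x)` returns a small perturbation r with
    predict(x + r) != predict(x), and xi bounds the l_inf norm of v."""
    v = np.zeros_like(images[0])
    for _ in range(n_epochs):
        for x in images:
            if predict(x + v) == predict(x):       # v does not yet fool this image
                r = attack(x + v)                   # extra image-specific perturbation
                v = np.clip(v + r, -xi, xi)         # project back onto the l_inf ball
    return v
```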


Figure 3 illustrates examples of scaled universal perturbations computed for different deep neural networks, and Figure 4 illustrates examples of perturbed images. When added to the original images, a universal perturbation is quasi-imperceptible but causes most images to be misclassified. Note that adversarial perturbations computed using the algorithms described in the section "Adversarial Perturbations" are not universal across data points, as shown in [22]. That is, adversarial perturbations only generalize mildly to unseen data points, for a fixed norm comparable to that of universal perturbations.

Universal perturbations are further shown in [22] to transfer well across different architectures; a perturbation computed for a given network is also very likely to fool another network on most natural images. In that sense, such perturbations are doubly universal, as they generalize well across images and architectures. Note that this property is shared with adversarial perturbations, as the latter perturbations have been shown to transfer well across different models (with potentially different architectures) [1], [23]. The existence of general-purpose perturbations can be very problematic from a safety perspective, as an attacker might need very little information about the actual model to craft successful perturbations [24].

Figure 5 illustrates a summary of the different types of perturbations considered in this section on a sample image. As can be seen, the classifier is not robust to slight perturbations of the image (for most additive perturbations) and natural geometric transformations of the image.

FIGURE 4. Examples of natural images perturbed with the universal perturbation and their corresponding estimated labels with GoogLeNet: (a) wool, (b) Indian elephant, (c) Indian elephant, (d) African gray, (e) tabby, (f) African gray, (g) common newt, (h) carousel, (i) gray fox, (j) macaw, (k) three-toed sloth, and (l) macaw. (a)–(h) Images belonging to the ILSVRC 2012 validation set. (i)–(l) Personal images captured by a mobile phone camera. (Figure used courtesy of [22].)

FIGURE 5. (a) The original image (estimated label: Pomeranian). The remaining images are minimally perturbed images (along with the corresponding estimated label) that misclassify the CaffeNet deep neural network: (b) adversarial perturbation (marmoset), (c) random noise (marmoset), (d) semirandom noise with m = 1,000 (marmoset), (e) universal perturbation (mosquito net), and (f) affine transformation (Persian cat). (Figure used courtesy of [17].)

Geometric insights from robustness
The study of robustness allows us to derive insights about the classifiers and, more precisely, about the geometry of the classification function acting on the high-dimensional input space. We recall that f: X → {1, ..., C} denotes our C-class classifier, and we denote by g_1, ..., g_C the C probabilities associated to each class by the classifier. Specifically, for a given x ∈ X, f(x) is assigned to the class having a maximal score; i.e., f(x) = argmax_i g_i(x). For deep neural networks, the functions g_i represent the outputs of the last layer in the network (generally the softmax layer).


Note that the classifier f can be seen as a mapping that partitions the input space X into classification regions, each of which has a constant estimated label (i.e., f(x) is constant for each such region). The decision boundary B of the classifier is defined as the union of the boundaries of such classification regions (see Figure 2).

Adversarial perturbations
We first focus on additive adversarial perturbations and highlight their relation with the geometry of the decision boundary. This link relies on the simple observation shown in "Geometric Properties of Adversarial Perturbations." The two geometric properties are illustrated in Figure 6. Note that these geometric properties are specific to the ℓ2 norm.

Geometric Properties of Adversarial Perturbations

Observation
Let x ∈ X and r*_adv(x) be the adversarial perturbation, defined as the minimizer of (4), with p = 2 and R = X. Then, we have the following:
1) ||r*_adv(x)||_2 measures the Euclidean distance from x to the closest point on the decision boundary B.
2) The vector r*_adv(x) is orthogonal to the decision boundary of the classifier, at x + r*_adv(x).

FIGURE 6. r*_adv denotes the adversarial perturbation of x (with p = 2). Note that r*_adv is orthogonal to the decision boundary B and ||r*_adv||_2 = dist(x, B).

The high instability of classifiers to adversarial perturbations, which we highlighted in the previous section, shows that natural images lie very closely to the classifier's decision boundary. While this result is key to understanding the geometry of the data points with regard to the classifier's decision boundary, it does not provide any insights on the shape of the decision boundary. A local geometric description of the decision boundary (in the vicinity of x) is rather captured by the direction of r*_adv(x), due to the orthogonality property of adversarial perturbations (highlighted in "Geometric Properties of Adversarial Perturbations"). In [18] and [25], these geometric properties of adversarial perturbations are leveraged to visualize typical cross sections of the decision boundary at the vicinity of the data points. Specifically, a two-dimensional normal section of the decision boundary is illustrated, where the sectioning plane is spanned by the adversarial perturbation (normal to the decision boundary) and a random vector in the tangent space. Examples of normal sections of decision boundaries are illustrated in Figure 7. Observe that the decision boundaries of state-of-the-art deep neural networks have a very low curvature on these two-dimensional cross sections (note the difference between the x and y axes).

FIGURE 7. The two-dimensional normal cross sections of the decision boundaries for three different classifiers near randomly chosen samples. The section is spanned by the adversarial perturbation of the data point x (vertical axis) and a random vector in the tangent space to the decision boundary (horizontal axis). The green region is the classification region of x. The decision boundaries with different classes are illustrated in different colors. Note the difference in range between the x and y axes. (a) VGG-F (ImageNet), (b) LeNet (CIFAR), (c) LeNet (MNIST). (Figure used with permission from [18].)
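Cross sections such as those of Figure 7 can be generated with a few lines: given an image x and a normalized adversarial direction, draw a random direction, orthogonalize it against the adversarial one, evaluate the classifier's label on a two-dimensional grid around x, and display the resulting label map. The predict function and the grid ranges below are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_boundary_cross_section(predict, x, n_dir, span_n=5.0, span_t=100.0, res=101):
    """Label map on the plane spanned by the adversarial (normal) direction n_dir
    (vertical axis) and a random orthogonal direction (horizontal axis), around x."""
    n_dir = n_dir / np.linalg.norm(n_dir)
    u = np.random.randn(*x.shape)
    u -= np.sum(u * n_dir) * n_dir                     # orthogonalize u against n_dir
    u /= np.linalg.norm(u)
    a_vals = np.linspace(-span_n, span_n, res)
    b_vals = np.linspace(-span_t, span_t, res)
    labels = np.array([[predict(x + a * n_dir + b * u) for b in b_vals] for a in a_vals])
    plt.imshow(labels, origin="lower", aspect="auto",  # colors = classification regions
               extent=(-span_t, span_t, -span_n, span_n))
    plt.xlabel("random tangent direction")
    plt.ylabel("adversarial (normal) direction")
    plt.show()
```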


In other words, these plots suggest that the decision boundary at the vicinity of x can be locally well approximated by a hyperplane passing through x + r*_adv(x) with the normal vector r*_adv(x). In [11], it is hypothesized that state-of-the-art classifiers are "too linear," leading to decision boundaries with very small curvature and further explaining the high instability of such classifiers to adversarial perturbations. To motivate the linearity hypothesis of deep networks, the success of the FGS method (which is exact for linear classifiers) in finding adversarial perturbations is invoked. However, some recent works challenge this linearity hypothesis; for example, in [26], the authors show that there exist adversarial perturbations that cannot be explained with this hypothesis, and, in [27], the authors provide a new explanation based on the tilting of the decision boundary with respect to the data manifold. We stress here that the low curvature of the decision boundary does not, in general, imply that the function learned by the deep neural network (as a function of the input image) is linear, or even approximately linear. Figure 8 shows illustrative examples of highly nonlinear functions resulting in flat decision boundaries. Moreover, it should be noted that, while the decision boundary of deep networks is very flat on random two-dimensional cross sections, these boundaries are not flat on all cross sections. That is, there exist directions in which the boundary is very curved. Figure 9 provides some illustrations of such cross sections, where the decision boundary has large curvature and therefore significantly departs from the first-order linear approximation suggested by the flatness of the decision boundary on random sections in Figure 7. Hence, these visualizations of the decision boundary strongly suggest that the curvature along a small set of directions can be very large and that the curvature is relatively small along random directions in the input space. Using a numerical computation of the curvature, the sparsity of the curvature profile is empirically verified in [28] for deep neural networks, and the directions where the decision boundary is curved are further shown to play a major role in explaining the robustness properties of classifiers. In [29], the authors provide a complementary analysis on the curvature of the decision boundaries induced by deep networks and show that the first principal curvatures increase exponentially with the depth of a random neural network. The analyses of [28] and [29] hence suggest that the curvature profile of deep networks is highly sparse (i.e., the decision boundaries are almost flat along most directions) but can have a very large curvature along a few directions.

FIGURE 8. The contours of two highly nonlinear functions (a) and (b) with flat decision boundaries. Specifically, the contours in the green and yellow regions represent the different (positive and negative) level sets of g(x) [where g(x) = g_1(x) - g_2(x), the difference between the class 1 and class 2 scores]. The decision boundary is defined as the region of the space where g(x) = 0 and is indicated with a solid black line. Note that, although g is a highly nonlinear function in these examples, the decision boundaries are flat.

FIGURE 9. Cross sections of the decision boundary in the vicinity of a data point x. (a), (b), and (c) show decision boundaries with high curvature, while (d) shows the decision boundary along a random normal section (with very small curvature). The correct class and the neighboring classes are colored in green and orange, respectively. The boundaries between different classes are shown in solid black lines. The x and y axes have the same scale.

Universal perturbations
The vulnerability of deep neural networks to universal (image-agnostic) perturbations studied in [22] sheds light on another aspect of the decision boundary: the correlations between different regions of the decision boundary, in the vicinity of different natural images. In fact, if the orientations of the decision boundary in the neighborhood of different data points were uncorrelated, the best universal perturbation would correspond to a random perturbation. This is refuted in [22], as the norm of the random perturbation required to fool 90% of the images is ten times larger than the norm of universal perturbations. Such correlations in the decision boundary are quantified in [22], as it is shown empirically that normal vectors to the decision boundary in the vicinity of different data points (or, equivalently, adversarial perturbations, due to the orthogonality property in "Geometric Properties of Adversarial Perturbations") approximately span a low-dimensional subspace.


It is conjectured that the existence of universal perturbations fooling classifiers for most natural images is partly due to the existence of such a low-dimensional subspace that captures the correlations among different regions of the decision boundary. In fact, this subspace "collects" normals to the decision boundary in different regions, and perturbations belonging to this subspace are therefore likely to fool other data points. This observation implies that the decision boundaries created by deep neural networks are not sufficiently "diverse," despite the very large number of parameters in modern deep neural networks.

A more thorough analysis is provided in [30], where universal perturbations are shown to be tightly related to the curvature of the decision boundary in the vicinity of data points. Specifically, the existence of universal perturbations is attributed to the existence of common directions where the decision boundary is positively curved in the vicinity of most natural images. Figure 10 intuitively illustrates the link between positive curvature and vulnerability to perturbations; the required perturbation to change the label (along a fixed direction v) of the classifier is smaller if the decision boundary is positively curved than if the decision boundary is flat (or negatively curved). With this geometric perspective, universal perturbations correspond exactly to directions where the decision boundary is positively curved in the vicinity of most natural images. As shown in [30], this geometric explanation of universal perturbations suggests a new algorithm to compute such perturbations as well as to explain several properties, such as the diversity and transferability of universal perturbations.

FIGURE 10. The link between robustness and curvature of the decision boundary. When the decision boundary is (a) positively curved, small universal perturbations are more likely to fool the classifier. (b) and (c) illustrate the case of a flat and negatively curved decision boundary, respectively.

Classification regions
The robustness of classifiers is not only related to the geometry of the decision boundary, but it is also strongly tied to the classification regions in the input space X. The classification region associated to class c ∈ {1, ..., C} corresponds to the set of points x ∈ X such that f(x) = c. The study of universal perturbations in [22] has shown the existence of dominant labels, with universal perturbations mostly fooling natural images into such labels. The existence of such dominant classes is attributed to the large volumes of classification regions corresponding to dominant labels in the input space X: in fact, images sampled uniformly at random from the Euclidean sphere aS^{d-1} of the input space X (where the radius a is set to reflect the typical norm of natural images) are classified as one of these dominant labels. Hence, such dominant labels represent high-volume "oceans" in the image space; universal perturbations therefore tend to fool images into such target labels, as these generally result in smaller fooling perturbations. It should be noted that these dominant labels are classifier specific and are not a result of the visual properties of the images in the class.

To further understand the geometrical properties of classification regions, we note that, just like natural images, random images are strongly vulnerable to adversarial perturbations. That is, the norm of the smallest adversarial perturbation needed to change the label of a random image (sampled from X) is several orders of magnitude smaller than the norm of the image itself. This observation suggests that classification regions are "hollow" and that most of their mass occurs at the boundaries. In [28], further topological properties of classification regions are observed; in particular, these regions are shown empirically to be connected. In other words, each classification region in the input space X is made up of a single connected (possibly complex) region, rather than several disconnected regions.

We have discussed in this section that the properties and optimization methods derived to analyze the robustness properties of classifiers allow us to derive insights on the geometry of the classifier. In particular, through visualizations, we have seen that the decision boundaries on random normal sections have very low curvature, while being very curved along a few directions of the input space. Moreover, the high vulnerability of state-of-the-art deep networks to universal perturbations suggests that the decision boundaries of such networks do not have sufficient diversity. To improve the robustness to such perturbations, it is therefore key to "diversify" the decision boundaries of the network and leverage the large number of parameters that define the neural network.


Improving robustness
An important objective of the analysis of robustness is to contribute to the design of better and more reliable systems. We next summarize some of the recent attempts that have been made to render systems more robust to different forms of perturbations.

Improving the robustness to adversarial perturbations
We first describe the methods that have been proposed to construct deep networks with better robustness to adversarial perturbations, following the papers [1], [9] that originally highlighted the vulnerability of these classifiers. The straightforward approach, which consists of adding perturbed images to the training set and fine-tuning the network, has been shown to be mildly effective against newly computed adversarial perturbations [5]. To further improve the robustness, it is natural to consider the Jacobian matrix ∂g/∂x of the model (with g the last layer of the neural network) and ensure that all of the elements in the matrix are sufficiently small. Following this idea, the authors of [31] consider a modified objective function, where a term is added to penalize the Jacobians of the function computed by each layer with respect to the previous layer. This has the effect of learning smooth functions with respect to the input and thus learning more robust classifiers. In [32], a robust optimization formulation is considered for training deep neural networks. Specifically, a minimization-maximization approach is proposed, where the loss is minimized over worst-case examples, rather than only on the original data. That is, the following minimization-maximization training procedure is used to train the network:

\min_{\theta} \sum_{i=1}^{N} \max_{r_i \in \mathcal{U}} J(x_i + r_i, y_i, \theta),    (9)

where θ, N, and U denote, respectively, the parameters of the network, the number of training points, and the set of plausible perturbations, and y_i denotes the label of x_i. The set U is generally set to be the ℓ2 or ℓ∞ ball centered at zero and of sufficiently small radius. Unfortunately, this optimization problem in (9) is difficult to solve efficiently. To circumvent this difficulty, [32] proposes an alternating iterative method where a single step of gradient ascent and descent is performed at each iteration.
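A minimal sketch of the alternating scheme just described, for a generic PyTorch model: each call performs one signed-gradient ascent step to approximate the inner maximization of (9) over an ℓ∞ ball U of radius eps, followed by one descent step on the perturbed examples. The hyperparameters are placeholders, and this is a simplification of the procedure in [32] rather than a faithful reimplementation.

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, eps=0.03):
    """One alternating min-max step on a batch (x, y)."""
    # inner maximization: a single signed-gradient ascent step within the l_inf ball U
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    x_adv = (x_adv + eps * x_adv.grad.sign()).detach()
    # outer minimization: a standard descent step on the worst-case (perturbed) examples
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```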
Note that the construction of robust classifiers using min-max robust optimization methods has been an active area of research, especially in the context of SVM classifiers [33]. In particular, for certain sets U, the objective function of various learning tasks can be written as a convex optimization function, as shown in [34]–[37], which makes the task of finding a robust classifier feasible. In a very recent work inspired by biophysical principles of neural circuits, Nayebi and Ganguli consider a regularizer to push the activations of the network into the saturating regime of the nonlinearity (i.e., the region where the nonlinear activation function is flat) [47]. The networks learned using this approach are shown to significantly improve in terms of robustness on a simple digit recognition classification task, without losing significantly in terms of accuracy. In [38], the authors propose to improve the robustness by using distillation, a technique first introduced in [39] for transferring knowledge from larger architectures to smaller ones. However, [40] shows that, when using more elaborate algorithms to compute perturbations, this approach fails to improve the robustness. In [41], a regularization scheme is introduced for improving the network's sensitivity to perturbations by constraining the Lipschitz constant of the network. In [42], an information-theoretic loss function is used to train stochastic neural networks; the resulting classifiers are shown to be more robust to adversarial perturbations than their deterministic counterpart. The increased robustness is intuitively due to the randomness of the neural network, which maps an input to a distribution of features; attacking the network with a small designed perturbation therefore becomes harder than for deterministic neural networks.

While all of these methods are shown to yield some improvements on the robustness of deep neural networks, the design of robust visual classifiers on challenging classification tasks (e.g., ImageNet) is still an open problem. Moreover, while the previously mentioned methods provide empirical results showing the improvement in robustness with respect to one or a subset of adversarial generation techniques, it is necessary in many applications to design robust networks against all adversarial attacks. To do so, we believe it is crucial to derive formal certificates on the robustness of newly proposed networks, as it is practically impossible to test against all possible attacks, and we see this as an important future work in this area.

Although there is currently no method to effectively (and provably) combat adversarial perturbations on large-scale data sets, several studies [42]–[44] have recently considered the related problem of detectability of adversarial perturbations. The detectability property is essential in real-world applications, as it allows the possibility to raise an exception when tampered images are detected. In [42], the authors propose to augment the network with a detector network, which detects original images from perturbed ones. Using the optimization methods in the section "Adversarial Perturbations," the authors conclude that the network successfully learns to distinguish between perturbed samples and original samples. Moreover, the overall network (i.e., the network and detector) is shown to be more robust to adversarial perturbations tailored for this architecture. In [43], the Bayesian uncertainty estimates in the subspace of learned representations are used to discriminate perturbed images from clean samples.


Finally, as shown in [44], side information such as depth maps can be exploited to detect adversarial samples.

Improving the robustness to geometric perturbations
Just as in the case of adversarial perturbations, one popular way of building more invariant representations to geometric perturbations is through virtual jittering (or data augmentation), where training data are transformed and fed back to the training set. One of the drawbacks of this approach is, however, that the training can become intractable, as the size of the training set becomes substantially larger than the original data set.
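A minimal example of the virtual jittering idea, using torchvision's built-in random affine transformations (the ranges below are arbitrary illustrative choices): instead of enlarging the data set offline, a fresh transformation is sampled for every image at every epoch.

```python
from torchvision import transforms

# sample a small random rotation, translation, and scaling for every training image
augment = transforms.Compose([
    transforms.RandomAffine(degrees=10, translate=(0.1, 0.1), scale=(0.9, 1.1)),
    transforms.ToTensor(),
])

# e.g., torchvision.datasets.CIFAR10("data", train=True, transform=augment, download=True)
```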
In another effort to improve the invariance properties of deep CNNs, the authors in [45] proposed a new module, the spatial transformer, that geometrically transforms the filter maps. Similarly to other modules in the network, spatial transformer modules are trained in a purely supervised fashion. Using spatial transformer networks, the performance of classifiers improves significantly, especially when images have noise and clutter, as these modules automatically learn to localize and unwarp corrupted images. To build robust deep representations, [46] considers instead a new architecture with fixed filter weights. Specifically, a similar structure to CNNs (i.e., a cascade of filtering, nonlinearity, and pooling operations) is considered with the additional requirement of stability of the representation to local deformations, while retaining maximum information about the original data. The scattering network is proposed, where successive filtering with wavelets and pointwise nonlinearities is applied and further shown to satisfy the stability constraints. Note that the approach used to build this scattering network significantly differs from traditional CNNs, as no learning of the filters is involved. It should further be noted that while scattering transforms guarantee that representations built by deep neural networks are robust to small changes in the input, this does not imply that the overall classification pipeline (feature representation and discrete classification) is robust to small perturbations in the input, in the sense of (2). We believe that building deep architectures with provable guarantees on the robustness of the overall classification function is a fundamental open problem in the area.

Summary and open problems
The robustness of deep neural networks to perturbations is a fundamental requirement in a large number of practical applications involving critical prediction problems. We discussed in this article the robustness of deep networks to different forms of perturbations: adversarial perturbations, random noise, universal perturbations, and geometric transformations. We further highlighted close connections between the robustness to additive perturbations and geometric properties of the classifier's decision boundary (such as the curvature). The importance of analyzing the vulnerability of deep neural networks to perturbations therefore goes beyond the practical security implications, as it further reveals crucial geometric properties of deep networks. We hope that this close relation between robustness and geometry will continue to be leveraged to design more robust systems.

Despite the recent and insightful advances in the analysis of the vulnerability of deep neural networks, several challenges remain:
■ It is known that deep networks are vulnerable to universal perturbations due to the existence of correlations between different parts of the decision boundary. Yet, little is known about the elementary operations in the architecture (or learned weights) of a deep network that cause the classifier to be sensitive to such directions.
■ Similarly, the causes underlying the transferability of adversarial perturbations across different architectures are still not understood formally.
■ While the classifier's decision boundary has been shown to have a very small curvature when sectioned by random normal planes, it is still unclear whether this property of the decision boundary is due to the optimization method (i.e., stochastic gradient descent) or rather to the use of piecewise linear activation functions.
■ While natural images have been shown to lie very close to the decision boundary, it is still unclear whether there exist points that lie far away from the decision boundary.
Finally, one of the main goals of the analysis of robustness is to propose architectures with increased robustness to additive and structured perturbations. This is probably one of the fundamental problems that needs special attention from the community in the years to come.

Authors
Alhussein Fawzi (fawzi@cs.ucla.edu) received the M.S. and Ph.D. degrees in electrical engineering from the Swiss Federal Institute of Technology, Lausanne, in 2012 and 2016, respectively. He is now a postdoctoral researcher in the Computer Science Department at the University of California, Los Angeles. He received the IBM Ph.D. fellowship in 2013 and 2015. His research interests include signal processing, machine learning, and computer vision.
Seyed-Mohsen Moosavi-Dezfooli (seyed.moosavi@epfl.ch) received the B.S. degree in electrical engineering from Amirkabir University of Technology (Tehran Polytechnic), Iran, in 2012 and the M.S. degree in communication systems from the École Polytechnique Fédérale de Lausanne (EPFL), Switzerland, in 2014. Currently, he is a Ph.D. degree student in the Signal Processing Laboratory 4 at EPFL under the supervision of Prof. Pascal Frossard. Previously, he was a research assistant in the Audiovisual Communications Laboratory at EPFL.


During the spring and the summer of 2014, he was a research intern with ABB Corporate Research, Baden-Daettwil. His research interests include signal processing, machine learning, and computer vision.
Pascal Frossard (pascal.frossard@epfl.ch) received the M.S. and Ph.D. degrees in electrical engineering from the École Polytechnique Fédérale de Lausanne (EPFL), Switzerland, in 1997 and 2000, respectively. From 2001 to 2003, he was a member of the research staff with the IBM T.J. Watson Research Center, Yorktown Heights, New York, where he was involved in media coding and streaming technologies. Since 2003, he has been a faculty member at EPFL, where he is currently the head of the Signal Processing Laboratory. His research interests include signal processing on graphs and networks, image representation and coding, visual information analysis, and machine learning.

References
[1] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," in Proc. Int. Conf. Learning Representations, 2014.
[2] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[3] A. Nguyen, J. Yosinski, and J. Clune, "Deep neural networks are easily fooled: High confidence predictions for unrecognizable images," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2015, pp. 427–436.
[4] G. Litjens, T. Kooi, B. Ehteshami Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. W. M. van der Laak, B. van Ginneken, and C. I. Sánchez, "A survey on deep learning in medical image analysis," Med. Image Anal., vol. 42, pp. 60–88, 2017.
[5] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard, "DeepFool: A simple and accurate method to fool deep neural networks," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2016, pp. 2574–2582.
[6] A. Fawzi and P. Frossard, "Manitest: Are classifiers really invariant?" in Proc. British Machine Vision Conf., 2015, pp. 106.1–106.13.
[7] A. Fawzi and P. Frossard, "Measuring the effect of nuisance variables on classifiers," in Proc. British Machine Vision Conf., 2016, pp. 137.1–137.12.
[8] A. Fawzi, H. Samulowitz, D. Turaga, and P. Frossard, "Adaptive data augmentation for image classification," in Proc. Int. Conf. Image Processing, 2016, pp. 3688–3692.
[9] B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Srndic, P. Laskov, G. Giacinto, and F. Roli, "Evasion attacks against machine learning at test time," in Proc. Joint European Conf. Machine Learning and Knowledge Discovery in Databases, 2013, pp. 387–402.
[10] A. Fawzi, O. Fawzi, and P. Frossard, "Analysis of classifiers' robustness to adversarial perturbations," Machine Learning, Aug. 2017. [Online]. Available: https://doi.org/10.1007/s10994-017-5663-3
[11] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," in Proc. Int. Conf. Learning Representations, 2015.
[12] A. Rozsa, E. M. Rudd, and T. E. Boult, "Adversarial diversity and hard positive generation," in Proc. IEEE Conf. Computer Vision and Pattern Recognition Workshops, 2016, pp. 25–32.
[13] N. Carlini and D. Wagner, "Towards evaluating the robustness of neural networks," arXiv Preprint, arXiv:1608.04644, 2016.
[14] S. Baluja and I. Fischer, "Adversarial transformation networks: Learning to generate adversarial examples," arXiv Preprint, arXiv:1703.09387, 2017.
[15] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, "Caffe: Convolutional architecture for fast feature embedding," in Proc. ACM Int. Conf. Multimedia, 2014, pp. 675–678.
[16] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2015.
[17] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. Berg, and L. Fei-Fei, "ImageNet large scale visual recognition challenge," Int. J. Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.
[18] A. Fawzi, S.-M. Moosavi-Dezfooli, and P. Frossard, "Robustness of classifiers: From adversarial to random noise," in Proc. Neural Information Processing Systems Conf., 2016, pp. 1632–1640.
[19] H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio, "An empirical evaluation of deep architectures on problems with many factors of variation," in Proc. ACM Int. Conf. Machine Learning, 2007, pp. 473–480.
[20] M. Sharif, S. Bhagavatula, L. Bauer, and M. K. Reiter, "Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition," in Proc. 2016 ACM SIGSAC Conf. Computer and Communications Security, 2016, pp. 1528–1540.
[21] N. Karianakis, J. Dong, and S. Soatto, "An empirical evaluation of current convolutional architectures' ability to manage nuisance location and scale variability," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2016, pp. 4442–4451.
[22] S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard, "Universal adversarial perturbations," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2017.
[23] Y. Liu, X. Chen, C. Liu, and D. Song, "Delving into transferable adversarial examples and black-box attacks," arXiv Preprint, arXiv:1611.02770, 2016.
[24] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. Berkay Celik, and A. Swami, "Practical black-box attacks against deep learning systems using adversarial examples," arXiv Preprint, arXiv:1602.02697, 2016.
[25] D. Warde-Farley, I. Goodfellow, T. Hazan, G. Papandreou, and D. Tarlow, "Adversarial perturbations of deep neural networks," in Perturbations, Optimization, and Statistics. Cambridge, MA: MIT Press, 2016.
[26] S. Sabour, Y. Cao, F. Faghri, and D. J. Fleet, "Adversarial manipulation of deep representations," in Proc. Int. Conf. Learning Representations, 2016.
[27] T. Tanay and L. Griffin, "A boundary tilting perspective on the phenomenon of adversarial examples," arXiv Preprint, arXiv:1608.07690, 2016.
[28] A. Fawzi, S.-M. Moosavi-Dezfooli, P. Frossard, and S. Soatto, "Classification regions of deep neural networks," arXiv Preprint, arXiv:1705.09552, 2017.
[29] B. Poole, S. Lahiri, M. Raghu, J. Sohl-Dickstein, and S. Ganguli, "Exponential expressivity in deep neural networks through transient chaos," in Proc. Advances in Neural Information Processing Systems Conf., 2016, pp. 3360–3368.
[30] S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, P. Frossard, and S. Soatto, "Analysis of universal adversarial perturbations," arXiv Preprint, arXiv:1705.09554, 2017.
[31] S. Gu and L. Rigazio, "Towards deep neural network architectures robust to adversarial examples," arXiv Preprint, arXiv:1412.5068, 2014.
[32] U. Shaham, Y. Yamada, and S. Negahban, "Understanding adversarial training: Increasing local stability of neural nets through robust optimization," arXiv Preprint, arXiv:1511.05432, 2015.
[33] C. Caramanis, S. Mannor, and H. Xu, "Robust optimization in machine learning," in Optimization for Machine Learning, S. Sra, S. Nowozin, and S. J. Wright, Eds. Cambridge, MA: MIT Press, 2012, ch. 14.
[34] H. Xu, C. Caramanis, and S. Mannor, "Robustness and regularization of support vector machines," J. Machine Learning Res., vol. 10, pp. 1485–1510, July 2009.
[35] G. Lanckriet, L. E. Ghaoui, C. Bhattacharyya, and M. I. Jordan, "A robust minimax approach to classification," J. Machine Learning Res., vol. 3, pp. 555–582, Dec. 2003.
[36] C. Bhattacharyya, "Robust classification of noisy data using second order cone programming approach," in Proc. Intelligent Sensing and Information Processing Conf., 2004, pp. 433–438.
[37] T. B. Trafalis and R. C. Gilbert, "Robust support vector machines for classification and computational issues," Optim. Methods Software, vol. 22, no. 1, pp. 187–198, 2007.
[38] N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami, "Distillation as a defense to adversarial perturbations against deep neural networks," in Proc. 2016 IEEE Symp. Security and Privacy, 2016, pp. 582–597.
[39] G. Hinton, O. Vinyals, and J. Dean, "Distilling the knowledge in a neural network," arXiv Preprint, arXiv:1503.02531, 2015.
[40] N. Carlini and D. Wagner, "Defensive distillation is not robust to adversarial examples," arXiv Preprint, arXiv:1607.04311, 2016.
[41] A. A. Alemi, I. Fischer, J. V. Dillon, and K. Murphy, "Deep variational information bottleneck," arXiv Preprint, arXiv:1612.00410, 2016.
[42] J. H. Metzen, T. Genewein, V. Fischer, and B. Bischoff, "On detecting adversarial perturbations," arXiv Preprint, arXiv:1702.04267, 2017.
[43] R. Feinman, R. R. Curtin, S. Shintre, and A. B. Gardner, "Detecting adversarial samples from artifacts," arXiv Preprint, arXiv:1703.00410, 2017.
[44] J. Lu, T. Issaranon, and D. Forsyth, "SafetyNet: Detecting and rejecting adversarial examples robustly," arXiv Preprint, arXiv:1704.00103, 2017.
[45] M. Jaderberg, K. Simonyan, and A. Zisserman, "Spatial transformer networks," in Proc. Advances in Neural Information Processing Systems Conf., 2015, pp. 2017–2025.
[46] A. Nayebi and S. Ganguli, "Biologically inspired protection of deep networks from adversarial attacks," arXiv Preprint, arXiv:1703.09202, 2017.
[47] M. Cisse, A. Courville, P. Bojanowski, E. Grave, Y. Dauphin, and N. Usunier, "Parseval networks: Improving robustness to adversarial examples," in Proc. Int. Conf. Machine Learning, 2017, pp. 854–863.