
SteganoGAN: High Capacity Image Steganography with GANs

Kevin A. Zhang,1 Alfredo Cuesta-Infante,2 Lei Xu,1 Kalyan Veeramachaneni1


1
MIT, Cambridge, MA - 02139, USA
kevz,leix,kalyanv@mit.edu
2
Univ. Rey Juan Carlos, Spain
alfredo.cuesta@urjc.es
arXiv:1901.03892v2 [cs.CV] 30 Jan 2019

Correspondence to: Kevin A. Zhang <kevz@mit.edu>. Preprint.

Abstract

Image steganography is a procedure for hiding messages inside pictures. While other techniques such as cryptography aim to prevent adversaries from reading the secret message, steganography aims to hide the presence of the message itself. In this paper, we propose a novel technique for hiding arbitrary binary data in images using generative adversarial networks, which allow us to optimize the perceptual quality of the images produced by our model. We show that our approach achieves state-of-the-art payloads of 4.4 bits per pixel, evades detection by steganalysis tools, and is effective on images from multiple datasets. To enable fair comparisons, we have released an open source library that is available online at: https://github.com/DAI-Lab/SteganoGAN.

1. Introduction

The goal of image steganography is to hide a secret message inside an image. In a typical scenario, the sender hides a secret message inside a cover image and transmits it to the receiver, who recovers the message. Even if the image is intercepted, no one besides the sender and receiver should be able to detect the presence of a message.

Traditional approaches to image steganography are only effective up to a relative payload of around 0.4 bits per pixel (Pevný et al., 2010). Beyond that point, they tend to introduce artifacts that can be easily detected by automated steganalysis tools and, in extreme cases, by the human eye. With the advent of deep learning in the past decade, a new class of image steganography approaches is emerging (Hayes & Danezis, 2017; Baluja, 2017; Zhu et al., 2018). These approaches use neural networks as either a component in a traditional algorithm (e.g. using deep learning to identify spatial locations suitable for embedding data), or as an end-to-end solution, which takes in a cover image and a secret message and combines them into a steganographic image.

These attempts have proved that deep learning can be used for practical end-to-end image steganography, and have achieved embedding rates competitive with those accomplished through traditional techniques (Pevný et al., 2010). However, they are also more limited than their traditional counterparts: they often impose special constraints on the size of the cover image (for example, (Hayes & Danezis, 2017) requires the cover images to be 32 x 32); they attempt to embed images inside images and not arbitrary messages or bit vectors; and finally, they do not explore the limits of how much information can be hidden successfully. We provide the reader a detailed analysis of these methods in Section 7.

To address these limitations, we propose STEGANOGAN, a novel end-to-end model for image steganography that builds on recent advances in deep learning. We use dense connections, which mitigate the vanishing gradient problem and have been shown to improve performance (Huang et al., 2017). In addition, we use multiple loss functions within an adversarial training framework to optimize our encoder, decoder, and critic networks simultaneously. We find that our approach successfully embeds arbitrary data into cover images drawn from a variety of natural scenes and achieves state-of-the-art embedding rates of 4.4 bits per pixel while evading standard detection tools. Figure 1 presents some example images that demonstrate the effectiveness of STEGANOGAN. The left-most figure is the original cover image without any secret messages. The next four figures contain approximately 1, 2, 3, and 4 bits per pixel worth of secret data, respectively, without producing any visible artifacts.

Figure 1. A randomly selected cover image (left) and the corresponding steganographic images generated by STEGANOGAN at approximately 1, 2, 3, and 4 bits per pixel.

Our contributions through this paper are:

– We present a novel approach that uses adversarial training to solve the steganography task and achieves a relative payload of 4.4 bits per pixel, which is 10x higher than competing deep learning-based approaches with similar peak signal to noise ratios.

– We propose a new metric for evaluating the capacity of deep learning-based steganography algorithms, which enables comparisons against traditional approaches.

– We evaluate our approach by measuring its ability to evade traditional steganalysis tools which are designed to detect whether an image is steganographic or not. Even when we encode > 4 bits per pixel into the image, most traditional steganalysis tools still only achieve a detection auROC of < 0.6.

– We also evaluate our approach by measuring its ability to evade deep learning-based steganalysis tools. We train a state-of-the-art model for automatic steganalysis proposed by (Ye et al., 2017) on samples generated by our model. If we require our model to produce steganographic images such that the detection rate is at most 0.8 auROC, we find that our model can still hide up to 2 bits per pixel.

– We are releasing a fully-maintained open-source library called STEGANOGAN¹, including datasets and pre-trained models, which will be used to evaluate deep learning based steganography techniques, as illustrated in the usage sketch below.

¹ https://github.com/DAI-Lab/SteganoGAN
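For readers who want to try the released library, the following is a minimal usage sketch. The method names and arguments follow our reading of the repository README at the time of writing; treat them as assumptions rather than a stable specification.

```python
# Minimal usage sketch for the released SteganoGAN library.
# Method names and arguments are assumptions based on the repository README.
from steganogan import SteganoGAN

# Load a pre-trained model (the dense architecture described in Section 3).
model = SteganoGAN.load(architecture='dense')

# Hide a text message inside a cover image and write the result to disk.
model.encode('cover.png', 'steganographic.png', 'Hello, SteganoGAN!')

# Recover the hidden message from the steganographic image.
print(model.decode('steganographic.png'))
```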

The rest of the paper is organized as follows. Section 2 briefly describes our motivation for building a better image steganography system. Section 3 presents STEGANOGAN and describes our model architecture. Section 4 describes our metrics for evaluating model performance. Section 5 contains our experiments for several variants of our model. Section 6 explores the effectiveness of our model at avoiding detection by automated steganalysis tools. Section 7 details related work in the generation of steganographic images.

2. Motivation

There are several reasons to use steganography instead of (or in addition to) cryptography when communicating a secret message between two actors. First, the information contained in a cryptogram is accessible to anyone who has the private key, which poses a challenge in countries where private key disclosure is required by law. Furthermore, the very existence of a cryptogram reveals the presence of a message, which can invite attackers. These problems with plain cryptography exist in security, intelligence services, and a variety of other disciplines (Conway, 2003).

For many of these fields, steganography offers a promising alternative. For example, in medicine, steganography can be used to hide private patient information in images such as X-rays or MRIs (Srinivasan et al., 2004) as well as biometric data (Douglas et al., 2018). In the media sphere, steganography can be used to embed copyright data (Maheswari & Hemanth, 2015) and allow content access control systems to store and distribute digital works over the Internet (Kawaguchi et al., 2007). In each of these situations, it is important to embed as much information as possible, and for that information to be both undetectable and lossless to ensure the data can be recovered by the recipient. Most work in the area of steganography, including the methods described in this paper, targets these two goals. We propose a new class of models for image steganography that achieves both these goals.

3. SteganoGAN

In this section, we introduce our notation, present the model architecture, and describe the training process. At a high level, steganography requires just two operations: encoding and decoding. The encoding operation takes a cover image and a binary message, and creates a steganographic image. The decoding operation takes the steganographic image and recovers the binary message.

3.1. Notation

We have C and S as the cover image and the steganographic image respectively, both of which are RGB color images and have the same resolution W × H; let M ∈ {0, 1}^{D×W×H} be the binary message that is to be hidden in C.
Note that D is the upper bound on the relative payload; the actual relative payload is the number of bits that can be reliably decoded, which is given by (1 − 2p)D, where p ∈ [0, 1] is the error rate. The actual relative payload is discussed in more detail in Section 4.

The cover image C is sampled from the probability distribution of all natural images P_C. The steganographic image S is then generated by a learned encoder E(C, M). The secret message M̂ is then extracted by a learned decoder D(S). The optimization task, given a fixed message distribution, is to train the encoder E and the decoder D to minimize (1) the decoding error rate p and (2) the distance between natural and steganographic image distributions dis(P_C, P_S). Therefore, to optimize the encoder and the decoder, we also need to train a critic network C(·) to estimate dis(P_C, P_S).

Let X ∈ R^{D×W×H} and Y ∈ R^{D'×W×H} be two tensors of the same width and height but potentially different depths D and D'; then, let Cat : (X, Y) → Φ ∈ R^{(D+D')×W×H} be the concatenation of the two tensors along the depth axis. Let Conv_{D→D'} : X ∈ R^{D×W×H} → Φ ∈ R^{D'×W×H} be a convolutional block that maps an input tensor X into a feature map Φ of the same width and height but potentially different depth. This convolutional block consists of a convolutional layer with kernel size 3, stride 1, and 'same' padding, followed by a leaky ReLU activation function and batch normalization. The activation function and batch normalization operations are omitted if the convolutional block is the last block in the network.

Let Mean : X ∈ R^{D×W×H} → R^D represent the adaptive mean spatial pooling operation, which computes the average of the W × H values in each feature map of tensor X.

3.2. Architecture

In this paper, we present STEGANOGAN, a generative adversarial network for hiding an arbitrary bit vector in a cover image. Our proposed architecture, shown in Figure 2, consists of three modules: (1) an Encoder, which takes a cover image and a data tensor, or message, and produces a steganographic image (Section 3.2.1); (2) a Decoder, which takes the steganographic image and attempts to recover the data tensor (Section 3.2.2); and (3) a Critic, which evaluates the quality of the cover and steganographic images (Section 3.2.3).

3.2.1. Encoder

The encoder network takes a cover image C and a message M ∈ {0, 1}^{D×W×H}. Hence M is a binary data tensor of shape D × W × H, where D is the number of bits that we will attempt to hide in each pixel of the cover image. We explore three variants of the encoder architecture with different connectivity patterns. All the variants start by applying the following two operations:

1. Processing the cover image C with a convolutional block to obtain the tensor a:

   a = Conv_{3→32}(C)   (1)

2. Concatenating the message M to a and then processing the result with a convolutional block to obtain the tensor b:

   b = Conv_{32+D→32}(Cat(a, M))   (2)

Basic: We sequentially apply two convolutional blocks to tensor b and generate the steganographic image as shown in Figure 2b. Formally:

   E_b(C, M) = Conv_{32→3}(Conv_{32→32}(b))   (3)

This approach is similar to that in (Baluja, 2017), as the steganographic image is simply the output of the last convolutional block.

Residual: The use of residual connections has been shown to improve model stability and convergence (He et al., 2016), so we hypothesize that their use will improve the quality of the steganographic image. To this end, we modify the basic encoder by adding the cover image C to its output so that the encoder learns to produce a residual image, as shown in Figure 2c. Formally,

   E_r(C, M) = C + E_b(C, M)   (4)

Dense: In the dense variant, we introduce additional connections between the convolutional blocks so that the feature maps generated by the earlier blocks are concatenated to the feature maps generated by later blocks, as shown in Figure 2d. This connectivity pattern is inspired by the DenseNet (Huang et al., 2017) architecture, which has been shown to encourage feature reuse and mitigate the vanishing gradient problem. Therefore, we hypothesize that the use of dense connections will improve the embedding rate. It can be formally expressed as follows:

   c = Conv_{64+D→32}(Cat(a, b, M))
   d = Conv_{96+D→3}(Cat(a, b, c, M))
   E_d(C, M) = C + d   (5)

Finally, the output of each variant is a steganographic image S = E_{b,r,d}(C, M) that has the same resolution and depth as the cover image C.
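To make the dense variant concrete, here is a minimal PyTorch sketch written directly from Equations (1), (2), and (5). The module and helper names are ours, and the released library may organize its code differently; treat this as an illustration of the connectivity pattern rather than the reference implementation.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, last=False):
    # Conv_{in->out}: kernel size 3, stride 1, 'same' padding; LeakyReLU
    # and BatchNorm are omitted on the last block, as described in 3.1.
    layers = [nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1)]
    if not last:
        layers += [nn.LeakyReLU(), nn.BatchNorm2d(out_ch)]
    return nn.Sequential(*layers)

class DenseEncoder(nn.Module):
    """Dense encoder variant (Eq. 5): earlier feature maps and the
    message are concatenated into the input of every later block."""
    def __init__(self, data_depth):
        super().__init__()
        D = data_depth
        self.conv1 = conv_block(3, 32)                 # a = Conv_{3->32}(C)
        self.conv2 = conv_block(32 + D, 32)            # b = Conv_{32+D->32}(Cat(a, M))
        self.conv3 = conv_block(64 + D, 32)            # c = Conv_{64+D->32}(Cat(a, b, M))
        self.conv4 = conv_block(96 + D, 3, last=True)  # d = Conv_{96+D->3}(Cat(a, b, c, M))

    def forward(self, cover, message):
        a = self.conv1(cover)
        b = self.conv2(torch.cat([a, message], dim=1))
        c = self.conv3(torch.cat([a, b, message], dim=1))
        d = self.conv4(torch.cat([a, b, c, message], dim=1))
        return cover + d  # E_d(C, M) = C + d
```

The Basic and Residual variants drop the extra concatenations: the Basic encoder returns the output of the last block directly, and the Residual encoder adds the cover image to that output.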
Figure 2. (a) The model architecture with the Encoder, Decoder, and Critic. The blank rectangle representing the Encoder can be any of the following: (b) Basic encoder, (c) Residual encoder, and (d) Dense encoder. The trapezoids represent convolutional blocks, two or more arrows merging represent concatenation operations, and the curly bracket represents a batching operation.

3.2.2. Decoder

The decoder network takes the steganographic image S produced by the encoder. Formally, it can be expressed as:

   a = Conv_{3→32}(S)
   b = Conv_{32→32}(a)
   c = Conv_{64→32}(Cat(a, b))
   D(S) = Conv_{96→D}(Cat(a, b, c))   (6)

The decoder produces M̂ = D(S); in other words, it attempts to recover the data tensor M.

3.2.3. Critic

To provide feedback on the performance of our encoder and generate more realistic images, we introduce an adversarial Critic. The critic network consists of three convolutional blocks followed by a convolutional layer with one output channel. To produce the scalar score, we apply adaptive mean pooling to the output of the convolutional layer:

   a = Conv_{32→32}(Conv_{32→32}(Conv_{3→32}(S)))
   C(S) = Mean(Conv_{32→1}(a))   (7)
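Continuing the sketch above (and reusing its conv_block helper), the decoder of Equation (6) and the critic of Equation (7) might be written as follows; again, the names and structure are illustrative assumptions rather than the library's exact code.

```python
import torch
import torch.nn as nn

class DenseDecoder(nn.Module):
    """Decoder per Eq. (6): dense connections over three conv blocks."""
    def __init__(self, data_depth):
        super().__init__()
        self.conv1 = conv_block(3, 32)        # a = Conv_{3->32}(S)
        self.conv2 = conv_block(32, 32)       # b = Conv_{32->32}(a)
        self.conv3 = conv_block(64, 32)       # c = Conv_{64->32}(Cat(a, b))
        self.conv4 = conv_block(96, data_depth, last=True)  # Conv_{96->D}(Cat(a, b, c))

    def forward(self, stego):
        a = self.conv1(stego)
        b = self.conv2(a)
        c = self.conv3(torch.cat([a, b], dim=1))
        return self.conv4(torch.cat([a, b, c], dim=1))  # logits for the recovered tensor

class Critic(nn.Module):
    """Critic per Eq. (7): three conv blocks, a 1-channel conv, mean pooling."""
    def __init__(self):
        super().__init__()
        self.blocks = nn.Sequential(
            conv_block(3, 32), conv_block(32, 32), conv_block(32, 32),
            nn.Conv2d(32, 1, kernel_size=3, stride=1, padding=1),
        )

    def forward(self, image):
        # Mean pooling over the remaining dimensions yields one score per image.
        return torch.mean(self.blocks(image), dim=(1, 2, 3))
```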
3.3. Training

We iteratively optimize the encoder-decoder network and the critic network. To optimize the encoder-decoder network, we jointly optimize three losses: (1) the decoding accuracy, using the cross-entropy loss,

   L_d = E_{X∼P_C} CrossEntropy(D(E(X, M)), M)   (8)

(2) the similarity between the steganographic image and the cover image, using the mean squared error,

   L_s = E_{X∼P_C} [ 1/(3 × W × H) · ||X − E(X, M)||²_2 ]   (9)

and (3) the realness of the steganographic image, using the critic network,

   L_r = E_{X∼P_C} C(E(X, M))   (10)

The training objective is to

   minimize L_d + L_s + L_r.   (11)

To train the critic network, we minimize the Wasserstein loss

   L_c = E_{X∼P_C} C(X) − E_{X∼P_C} C(E(X, M))   (12)

During every iteration, we match each cover image C with a data tensor M, which consists of a randomly generated sequence of D × W × H bits sampled from a Bernoulli distribution, M ∼ Ber(0.5). In addition, we apply standard data augmentation procedures, including horizontal flipping and random cropping, to the cover image C in our pre-processing pipeline. We use the Adam optimizer with learning rate 1e-4, clip our gradient norm to 0.25, clip the critic weights to [−0.1, 0.1], and train for 32 epochs.

4. Evaluation Metrics

Steganography algorithms are evaluated along three axes: the amount of data that can be hidden in an image, a.k.a. capacity; the similarity between the cover and steganographic image, a.k.a. distortion; and the ability to avoid detection by steganalysis tools, a.k.a. secrecy. This section describes some metrics for evaluating the performance of our model along these axes.

Reed-Solomon Bits Per Pixel: Measuring the effective number of bits that can be conveyed per pixel is non-trivial in our setup, since the ability to recover a hidden bit is heavily dependent on the model and the cover image, as well as the message itself.

To model this situation, suppose that a given model incorrectly decodes a bit with probability p. It is tempting to just multiply the number of bits in the data tensor by the accuracy 1 − p and report that value as the relative payload. Unfortunately, that value is actually meaningless – it allows you to estimate the number of bits that have been correctly decoded, but does not provide a mechanism for recovering from errors or even identifying which bits are correct.

Therefore, to get an accurate estimate of the relative payload of our technique, we turn to Reed-Solomon codes. Reed-Solomon error-correcting codes are a subset of linear block codes which offer the following guarantee: given a message of length k, the code can generate a message of length n, where n ≥ k, such that it can recover from (n − k)/2 errors (Reed & Solomon, 1960). This implies that, given a steganography algorithm which, on average, returns an incorrect bit with probability p, we would want the number of incorrect bits to be less than or equal to the number of bits we can correct:

   p · n ≤ (n − k) / 2   (13)

The ratio k/n represents the average number of bits of "real" data we can transmit for each bit of "message" data; then, from (13), it follows that the ratio is less than or equal to 1 − 2p. As a result, we can measure the relative payload of our steganographic technique by multiplying the number of bits we attempt to hide in each pixel by this ratio to obtain the "real" number of bits that is transmitted and recovered. We refer to this metric as Reed-Solomon bits-per-pixel (RS-BPP), and note that it can be directly compared against traditional steganographic techniques, since it represents the average number of bits that can be reliably transmitted in an image divided by the size of the image.

Peak Signal to Noise Ratio: In addition to measuring the relative payload, we also need to measure the quality of the steganographic image. One widely-used metric for measuring image quality is the peak signal-to-noise ratio (PSNR). This metric is designed to measure image distortions and has been shown to be correlated with mean opinion scores produced by human experts (Wang et al., 2004).

Given two images X and Y of size (W, H) and a scaling factor sc which represents the maximum possible difference in the numerical representation of each pixel², the PSNR is defined as a function of the mean squared error (MSE):

   MSE = (1 / (W·H)) Σ_{i=1}^{W} Σ_{j=1}^{H} (X_{i,j} − Y_{i,j})²   (14)

   PSNR = 20 · log₁₀(sc) − 10 · log₁₀(MSE)   (15)

² For example, if the images are represented as floating point numbers in [−1.0, 1.0], then sc = 2.0, since the maximum difference between two pixels is achieved when one is 1.0 and the other is −1.0.
Figure 3. Randomly selected pairs of cover (left) and steganographic (right) images from the COCO dataset, embedding random binary data at the maximum payload of 4.4 bits per pixel.

Although PSNR is widely used to evaluate the distortion produced by steganography algorithms, (Almohammad & Ghinea, 2010) suggests that it may not be ideal for comparisons across different types of steganography algorithms. Therefore, we introduce another metric to help us evaluate image quality: the structural similarity index.

Structural Similarity Index: In our experiments, we also report the structural similarity index (SSIM) between the cover image and the steganographic image. SSIM is widely used in the broadcast industry to measure image and video quality (Wang et al., 2004). Given two images X and Y, the SSIM can be computed using the means µ_X and µ_Y, the variances σ²_X and σ²_Y, and the covariance σ_XY of the images, as shown below:

   SSIM = ((2µ_X µ_Y + k₁R)(2σ_XY + k₂R)) / ((µ²_X + µ²_Y + k₁R)(σ²_X + σ²_Y + k₂R))   (16)

The default configuration for SSIM uses k₁ = 0.01 and k₂ = 0.03 and returns values in the range [−1.0, 1.0], where 1.0 indicates the images are identical.
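Equation (16) can be transcribed directly into NumPy as below. This is the global form; widely used SSIM implementations compute the statistics over local sliding windows and average the results. We assume R denotes the dynamic range of the pixel values, which is not spelled out above.

```python
import numpy as np

def ssim_global(x, y, R=2.0, k1=0.01, k2=0.03):
    """Global SSIM per Eq. (16). R is assumed to be the dynamic range of
    the pixel values (2.0 for images in [-1.0, 1.0])."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    c1, c2 = k1 * R, k2 * R
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
```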
5. Results and Analysis

We use the Div2K (Agustsson & Timofte, 2017) and COCO (Lin et al., 2014) datasets to train and evaluate our model. We experiment with each of the three model variants discussed in Section 3 and train them with 6 different data depths D ∈ {1, 2, ..., 6}. The data depth D represents the "target" bits per pixel, so the randomly generated data tensor has shape D × W × H.

We use the default train/test split proposed by the creators of the Div2K and COCO datasets in our experiments, and we report the average RS-BPP, PSNR, and SSIM on the test set in Table 1. Our models are trained on GeForce GTX 1080 GPUs. The wall clock time per epoch is approximately 10 minutes for Div2K and 2 hours for COCO.

After training our model, we compute the expected accuracy on a held-out test set and adjust it using the Reed-Solomon coding scheme discussed in Section 4 to produce our bits-per-pixel metric, shown in Table 1 under RS-BPP. We publicly released the pre-trained models for all the experiments shown in this table on AWS S3³.

The results from our experiments are shown in Table 1 – each of the metrics is computed on a held-out test set of images that is not shown to the model during training. Note that there is an unavoidable tradeoff between the relative payload and the image quality measures; assuming we are already on the Pareto frontier, an increased relative payload would inevitably result in a decreased similarity.

We immediately observe that all variants of our model perform better on the COCO dataset than the Div2K dataset. This can be attributed to differences in the type of content photographed in the two datasets. Images from the Div2K dataset tend to contain open scenery, while images from the COCO dataset tend to be more cluttered and contain multiple objects, providing more surfaces and textures for our model to successfully embed data.

In addition, we note that our dense variant shows the best performance on both relative payload and image quality, followed closely by the residual variant, which shows comparable image quality but a lower relative payload. The basic variant offers the worst performance across all metrics, achieving relative payloads and image quality scores that are 15-25% lower than the dense variant.

³ http://steganogan.s3.amazonaws.com/

                     Accuracy               RS-BPP                 PSNR                   SSIM
Dataset   D     Basic  Resid. Dense    Basic  Resid. Dense    Basic  Resid. Dense    Basic  Resid. Dense
Div2K     1     0.95   0.99   1.00     0.91   0.99   0.99     24.52  41.68  41.60    0.70   0.96   0.95
          2     0.91   0.98   0.99     1.65   1.92   1.96     24.62  38.25  39.62    0.67   0.90   0.92
          3     0.82   0.92   0.94     1.92   2.52   2.63     25.03  36.67  36.52    0.69   0.85   0.85
          4     0.75   0.82   0.82     1.98   2.52   2.53     24.45  37.86  37.49    0.69   0.88   0.88
          5     0.69   0.74   0.75     1.86   2.39   2.50     24.90  39.45  38.65    0.70   0.90   0.90
          6     0.67   0.69   0.70     2.04   2.32   2.44     24.72  39.53  38.94    0.70   0.91   0.90
COCO      1     0.98   0.99   0.99     0.96   0.99   0.99     31.21  41.71  42.09    0.87   0.98   0.98
          2     0.97   0.99   0.99     1.88   1.97   1.97     31.56  39.00  39.08    0.86   0.96   0.95
          3     0.94   0.97   0.98     2.67   2.85   2.87     30.16  37.38  36.93    0.83   0.93   0.92
          4     0.87   0.95   0.95     2.99   3.60   3.61     31.12  36.98  36.94    0.83   0.92   0.92
          5     0.84   0.90   0.92     3.43   3.99   4.24     29.73  36.69  36.61    0.80   0.90   0.91
          6     0.78   0.84   0.87     3.34   4.07   4.40     31.42  36.75  36.33    0.84   0.89   0.88

Table 1. The relative payload and image quality metrics for each dataset and model variant. The Dense model variant offers the best performance across all metrics in almost all experiments.

Finally, we remark that despite the increased relative payload, the image similarity, as measured by the average peak signal to noise ratio between the cover image and the steganographic images produced by the Dense models, is comparable to that presented in (Zhu et al., 2018).

6. Detecting Steganographic Images

Steganography techniques are also typically evaluated by their ability to evade detection by steganalysis tools. In this section, we experiment with two open source steganalysis algorithms and measure our model's ability to generate undetectable steganographic images.

6.1. Statistical Steganalysis

We use a popular open-source steganalysis tool called StegExpose (Boehm, 2014), which combines several existing steganalysis techniques including Sample Pairs (Dumitrescu et al., 2003), RS Analysis (Fridrich et al., 2001), Chi Squared Attack (Westfeld & Pfitzmann, 2000), and Primary Sets (Dumitrescu et al., 2002). To measure the effectiveness of our method at evading detection by these techniques, we randomly select 1,000 cover images from the test set, generate the corresponding steganographic images using our Dense architecture with data depth 6, and examine the results using StegExpose.

The receiver operating characteristic curve for our Dense model is shown in Figure 5, and we note that the StegExpose tool is only slightly more effective than random guessing, with an area under the ROC curve of 0.59, even for payloads of up to 4.4 bits per pixel. This shows that our model can successfully evade standard steganalysis tools, meeting the minimum requirement for being a viable steganography algorithm.

Figure 4. A randomly selected pair of cover (left) and steganographic (right) images and the differences between them. The top row shows the output from a simple least-significant-bit steganography algorithm (Johnson & Katzenbeisser, 1999) while the bottom row shows the output from STEGANOGAN with 4.4 bpp. Note that STEGANOGAN is able to adapt to the image content.
Figure 5. The receiver operating characteristic (ROC) curve produced by the StegExpose library for a set of 1,000 steganographic images generated using the Dense architecture with a data depth of 6. The StegExpose library includes multiple steganalysis tools, including Sample Pairs (Dumitrescu et al., 2003), RS Analysis (Fridrich et al., 2001), Chi Squared Attack (Westfeld & Pfitzmann, 2000), and Primary Sets (Dumitrescu et al., 2002). The tool achieves an auROC of 0.59.

Figure 6. The performance of the steganography detector on a held-out test set. The x-axis indicates the number of different STEGANOGAN instances that were used, while the y-axis indicates the area under the ROC curve. The curves correspond to D = 1 through D = 6, with RS-BPP values of 1.0, 2.0, 2.9, 3.6, 4.2, and 4.4, respectively.

6.2. Neural Steganalysis

Recent studies have shown promising results in detecting steganographic images using deep learning based approaches (Ye et al., 2017). Therefore, we proceed to examine whether our model can evade deep learning-based steganalysis tools. We use the model proposed by (Ye et al., 2017) for steganalysis, with a slight modification to enable support for color images, and train it to detect steganographic images generated by STEGANOGAN.

In a typical scenario, the party attempting to detect steganographic images has access to the algorithm used to create them – in our case, an instance of STEGANOGAN which is parameterized by the pre-trained model weights, but not the exact model. Using the Dense architecture from Section 3.2.1 and the COCO dataset, we attempt to replicate this scenario with the following experimental setup:

1. We train N instances of the Dense STEGANOGAN architecture with different random seeds.

2. For each of these trained models, we generate a set of 1,000 steganographic images.

3. Holding out the images generated by the Nth model as a test set, we train the steganalysis model proposed in (Ye et al., 2017) on increasing subsets of the remaining images: {1}, {1, 2}, {1, 2, 3}, ..., {1, 2, 3, ..., N − 1}.

4. We repeat each experiment 3 times and report the average area under the receiver operating curve in Figure 6.

This emulates a realistic setting – the party creating the automatic detection model will not have access to the specific STEGANOGAN model in use, but may have access to the software used to train the models. Therefore, we pose the following question: if the external party does not know the specific model weights but does know the algorithm for generating models, can they detect steganographic images generated by STEGANOGAN?

Figure 6 shows the performance of our detector for various relative payloads and training set sizes. First, we note that the detector performance, as measured by the area under the receiver operating characteristic (auROC), increases as we increase the number of bits-per-pixel encoded in the image. In addition, we highlight the fact that there is no clear trend in the area under the ROC curve as we increase the number of STEGANOGAN models used for training. This suggests that the external party will have a difficult time building a model which can detect steganographic images generated by STEGANOGAN without knowing the exact model parameters.

Finally, we compare the detection error for images generated by STEGANOGAN against those reported by (Ye et al., 2017) on images generated by three state-of-the-art steganography algorithms: WOW (Holub & Fridrich, 2012), S-UNIWARD (Holub et al., 2014), and HILL (Li et al., 2014). Note that these techniques are evaluated on different datasets and, as such, the results are only approximate estimates of the actual relative payload achievable on a particular dataset. For a fixed detection error rate of 20%, we find that WOW is able to encode up to 0.3 bpp, S-UNIWARD is able to encode up to 0.4 bpp, HILL is able to encode up to 0.5 bpp, and STEGANOGAN is able to encode up to 2.0 bpp.
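For readers reproducing this protocol, the auROC values reported here can be computed from detector outputs with scikit-learn; the label and score arrays below are hypothetical placeholders, not our experimental data.

```python
# Hypothetical sketch of the auROC computation used in this section.
# `scores` would come from the steganalysis model of (Ye et al., 2017);
# labels are 1 for steganographic images and 0 for cover images.
import numpy as np
from sklearn.metrics import roc_auc_score

labels = np.array([0, 0, 1, 1, 1, 0])                # ground truth (placeholder)
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2])   # detector outputs (placeholder)

auroc = roc_auc_score(labels, scores)
print(f"detection auROC: {auroc:.2f}")
```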
7. Related Work

In this section, we describe a few traditional approaches to image steganography and then discuss recent approaches developed using deep learning.

7.1. Traditional Approaches

A standard algorithm for image steganography is "Highly Undetectable steGO" (HUGO), a cost function-based algorithm which uses handcrafted features to measure the distortion introduced by modifying the pixel value at a particular location in the image. Given a set of N bits to be embedded, HUGO uses the distortion function to identify the top N pixels that can be modified while minimizing the total distortion across the image (Pevný et al., 2010).

Another approach is the JSteg algorithm, which is designed specifically for JPEG images. JPEG compression works by transforming the image into the frequency domain using the discrete cosine transform and removing high-frequency components, resulting in a smaller image file size. JSteg uses the same transformation into the frequency domain, but modifies the least significant bits of the frequency coefficients (Li et al., 2011).

7.2. Deep Learning for Steganography

Deep learning for image steganography has recently been explored in several studies, all showing promising results. These existing proposals range from training neural networks to integrate with and improve upon traditional steganography techniques (Tang et al., 2017) to complete end-to-end convolutional neural networks which use adversarial training to generate convincing steganographic images (Hayes & Danezis, 2017; Zhu et al., 2018).

Hiding images vs. arbitrary data: The first set of deep learning approaches to steganography were (Baluja, 2017; Wu et al., 2018). Both (Baluja, 2017) and (Wu et al., 2018) focus solely on taking a secret image and embedding it into a cover image. Because this task is fundamentally different from that of embedding arbitrary data, it is difficult to compare these results to those achieved by traditional steganography algorithms in terms of the relative payload. Natural images such as those used in (Baluja, 2017) and (Wu et al., 2018) exhibit strong spatial correlations, and convolutional neural networks trained to hide images in images would take advantage of this property. Therefore, a model that is trained in such a manner cannot be applied to arbitrary data.

Adversarial training: The next set of approaches for image steganography are (Hayes & Danezis, 2017; Zhu et al., 2018), which make use of adversarial training techniques. The key differences between these approaches and our approach are the loss functions used to train the model, the architecture of the model, and how data is presented to the network.

The method proposed by (Hayes & Danezis, 2017) can only operate on images of a fixed size. Their approach involves flattening the image into a vector, concatenating the data vector to the image vector, and applying feedforward, reshaping, and convolutional layers. They use the mean squared error for the encoder, the cross-entropy loss for the discriminator, and the mean squared error for the decoder. They report that image quality suffers greatly when attempting to increase the number of bits beyond 0.4 bits per pixel.

The method proposed by (Zhu et al., 2018) uses the same loss functions as (Hayes & Danezis, 2017) but makes changes to the model architecture. Specifically, they "replicate the message spatially, and concatenate this message volume to the encoder's intermediary representation." For example, in order to hide k bits in an N × N image, they would create a tensor of shape (k, N, N) where the data vector is replicated at each spatial location.

This design allows (Zhu et al., 2018) to handle arbitrarily sized images but cannot effectively scale to higher relative payloads. For example, to achieve a relative payload of 1 bit per pixel in a typical image of size 360 × 480, they would need to manipulate a data tensor of size (172800, 360, 480). Therefore, due to the excessive memory requirements, this model architecture cannot effectively scale to handle large relative payloads.

8. Conclusion

In this paper, we introduced a flexible new approach to image steganography which supports different-sized cover images and arbitrary binary data. Furthermore, we proposed a new metric for evaluating the performance of deep-learning based steganographic systems so that they can be directly compared against traditional steganography algorithms. We experiment with three variants of the STEGANOGAN architecture and demonstrate that our model achieves higher relative payloads than existing approaches while still evading detection.

Acknowledgements

The authors would like to thank Plamen Valentinov Kolev and Carles Sala for their help with software support and developer operations and for the helpful discussions and feedback. Finally, the authors would like to thank Accenture for their generous support and funding which made this research possible.

References

Agustsson, E. and Timofte, R. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In The IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) Workshops, July 2017.
Almohammad, A. and Ghinea, G. Stego image quality and the reliability of PSNR. In 2010 2nd International Conference on Image Processing Theory, Tools and Applications, pp. 215–220, July 2010. doi: 10.1109/IPTA.2010.5586786.

Baluja, S. Hiding images in plain sight: Deep steganography. In Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (eds.), Advances in Neural Information Processing Systems 30, pp. 2069–2079. Curran Associates, Inc., 2017.

Boehm, B. StegExpose - A tool for detecting LSB steganography. CoRR, abs/1410.6656, 2014.

Conway, M. Code wars: Steganography, signals intelligence, and terrorism. Knowledge, Technology & Policy, 16(2):45–62, Jun 2003. ISSN 1874-6314. doi: 10.1007/s12130-003-1026-4.

Douglas, M., Bailey, K., Leeney, M., and Curran, K. An overview of steganography techniques applied to the protection of biometric data. Multimedia Tools and Applications, 77(13):17333–17373, Jul 2018. ISSN 1573-7721. doi: 10.1007/s11042-017-5308-3.

Dumitrescu, S., Wu, X., and Memon, N. On steganalysis of random LSB embedding in continuous-tone images. In Proc. of the Int. Conf. on Image Processing, volume 3, pp. 641–644, 2002. doi: 10.1109/ICIP.2002.1039052.

Dumitrescu, S., Wu, X., and Wang, Z. Detection of LSB steganography via sample pair analysis. In Information Hiding, pp. 355–372, 2003. ISBN 978-3-540-36415-3.

Fridrich, J., Goljan, M., and Du, R. Reliable detection of LSB steganography in color and grayscale images. In Proc. of the 2001 Workshop on Multimedia and Security: New Challenges, MM&Sec '01, pp. 27–30. ACM, 2001. ISBN 1-58113-393-6. doi: 10.1145/1232454.1232466.

Hayes, J. and Danezis, G. Generating steganographic images via adversarial training. In NIPS, 2017.

He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, 2016.

Holub, V. and Fridrich, J. Designing steganographic distortion using directional filters. In IEEE Int. Workshop on Information Forensics and Security (WIFS), Dec 2012. doi: 10.1109/WIFS.2012.6412655.

Holub, V., Fridrich, J., and Denemark, T. Universal distortion function for steganography in an arbitrary domain. EURASIP Journal on Information Security, 2014(1):1, Jan 2014. ISSN 1687-417X. doi: 10.1186/1687-417X-2014-1.

Huang, G., Liu, Z., van der Maaten, L., and Weinberger, K. Q. Densely connected convolutional networks. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 2261–2269, 2017.

Johnson, N. and Katzenbeisser, S. A survey of steganographic techniques. 1999.

Kawaguchi, E., Maeta, M., Noda, H., and Nozaki, K. A model of digital contents access control system using steganographic information hiding scheme. In Proc. of the 18th Conf. on Information Modelling and Knowledge Bases, pp. 50–61, 2007. ISBN 978-1-58603-710-9.

Li, B., He, J., Huang, J., and Shi, Y. A survey on image steganography and steganalysis. Journal of Information Hiding and Multimedia Signal Processing, 2011.

Li, B., Wang, M., Huang, J., and Li, X. A new cost function for spatial image steganography. In 2014 IEEE Int. Conf. on Image Processing (ICIP), pp. 4206–4210, Oct 2014. doi: 10.1109/ICIP.2014.7025854.

Lin, T., Maire, M., Belongie, S. J., Bourdev, L. D., Girshick, R. B., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. L. Microsoft COCO: Common objects in context. CoRR, abs/1405.0312, 2014.

Maheswari, S. U. and Hemanth, D. J. Frequency domain QR code based image steganography using Fresnelet transform. AEU - International Journal of Electronics and Communications, 69(2):539–544, 2015. ISSN 1434-8411. doi: 10.1016/j.aeue.2014.11.004.

Pevný, T., Filler, T., and Bas, P. Using high-dimensional image models to perform highly undetectable steganography. In Information Hiding, 2010.

Reed, I. S. and Solomon, G. Polynomial codes over certain finite fields. Journal of the Society for Industrial and Applied Mathematics, 8(2):300–304, 1960.

Srinivasan, Y., Nutter, B., Mitra, S., Phillips, B., and Ferris, D. Secure transmission of medical records using high capacity steganography. In Proc. of the 17th IEEE Symposium on Computer-Based Medical Systems, pp. 122–127, June 2004. doi: 10.1109/CBMS.2004.1311702.

Tang, W., Tan, S., Li, B., and Huang, J. Automatic steganographic distortion learning using a generative adversarial network. IEEE Signal Processing Letters, 24(10):1547–1551, Oct 2017. ISSN 1070-9908. doi: 10.1109/LSP.2017.2745572.

Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. on Image Processing, 13(4):600–612, April 2004. ISSN 1057-7149. doi: 10.1109/TIP.2003.819861.
Westfeld, A. and Pfitzmann, A. Attacks on steganographic systems. In Information Hiding, pp. 61–76, 2000. ISBN 978-3-540-46514-0.

Wu, P., Yang, Y., and Li, X. StegNet: Mega image steganography capacity with deep convolutional network. Future Internet, 10:54, Jun 2018. doi: 10.3390/fi10060054.

Ye, J., Ni, J., and Yi, Y. Deep learning hierarchical representations for image steganalysis. IEEE Trans. on Information Forensics and Security, 12(11):2545–2557, Nov 2017. ISSN 1556-6013. doi: 10.1109/TIFS.2017.2710946.

Zhu, J., Kaplan, R., Johnson, J., and Fei-Fei, L. HiDDeN: Hiding data with deep networks. CoRR, abs/1807.09937, 2018.
