This paper considers the research goal of dependable steganalysis: detection in which false positives occur at a rate of one in a million or lower, and this rate is known with high precision. Despite its importance for real-world application, there has been almost no study of steganalysis that produces very low false positives. We test existing and novel classifiers for their low false-positive performance, using millions of images from Flickr. Experiments on such a scale require considerable engineering. Standard steganalysis classifiers do not perform well in a low false-positive regime, and we make new proposals that penalize false positives more than false negatives.
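A minimal sketch of the cost-weighting idea mentioned above (penalizing false positives more heavily than false negatives), assuming scikit-learn; the weights, feature dimensions, and data are illustrative placeholders, not values from the paper:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 100))      # placeholder steganalysis features
y = rng.integers(0, 2, size=2000)     # placeholder labels: 0 = cover, 1 = stego

# Weighting the cover class heavily makes each false positive (a cover image
# classified as stego) far more costly than a false negative, pushing the
# decision boundary towards the low false-positive regime.
clf = LinearSVC(class_weight={0: 100.0, 1: 1.0}, C=0.1, max_iter=20000)
clf.fit(X, y)
scores = clf.decision_function(X)     # thresholding these trades FPs for FNs
```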
We examine whether steganographic images can be detected more reliably when there exist other images, taken with the same camera under the same conditions, of the same scene. We argue that such a circumstance is realistic and likely in practice. In `laboratory conditions' mimicking circumstances favourable to the analyst, and with a custom set of digital images which capture the same scenes with controlled amounts of overlap, we use an overlapping reference image to calibrate steganographic features of the image under analysis. Experimental results show that the analysed image can be classified as cover or stego with much greater reliability than traditional steganalysis not exploiting overlapping content, and the improvement in reliability depends on the amount of overlap. These results are curious because two different photographs of exactly the same scene, taken only a few seconds apart with a fixed camera and settings, typically have steganographic features that differ by considerably more than a cover and stego image.
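One simple way to realize the calibration described above, sketched under the assumption that a feature extractor is available; the differencing scheme is illustrative rather than the paper's exact construction:

```python
import numpy as np

def calibrated_features(f_suspect: np.ndarray, f_reference: np.ndarray) -> np.ndarray:
    # Use the overlapping reference image as a baseline: classifying the
    # difference of the two feature vectors cancels scene content common to
    # both photographs, leaving a signal dominated by embedding changes.
    return f_suspect - f_reference
```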
The Projected Spatial Rich Model (PSRM) generates powerful steganalysis features, but requires the calculation of tens of thousands of convolutions with image noise residuals. This makes it very slow: the reference implementation takes an impractical 20–30 minutes per 1 megapixel (Mpix) image. We present a case study which first tweaks the definition of the PSRM features, to make them more efficient, and then optimizes an implementation on GPU hardware which exploits their parallelism (whilst avoiding the worst of their sequentiality). Some nonstandard CUDA techniques are used. Even with only a single GPU, the time for feature calculation is reduced by three orders of magnitude, and the detection power is reduced only marginally.
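A much simplified sketch of the computational core, in NumPy/SciPy rather than CUDA: many small convolutions of a noise residual, each summarized by a quantized histogram. Kernel counts, sizes, and bins are illustrative, not the PSRM definition; the point is that the convolutions are independent and therefore parallelize well on a GPU.

```python
import numpy as np
from scipy.signal import fftconvolve

def projection_histograms(image, n_kernels=55, max_size=8, seed=0):
    rng = np.random.default_rng(seed)
    bins = np.linspace(-3.0, 3.0, 7)
    # simple first-order noise residual (horizontal pixel differences)
    residual = np.diff(image.astype(np.float64), axis=1)
    feats = []
    for _ in range(n_kernels):
        h, w = rng.integers(1, max_size + 1, size=2)
        kernel = rng.normal(size=(h, w))
        kernel /= np.linalg.norm(kernel)               # random unit projection
        projected = fftconvolve(residual, kernel, mode='valid')
        hist, _ = np.histogram(projected, bins=bins)   # quantized summary
        feats.append(hist / projected.size)
    return np.concatenate(feats)
```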
This work proposes a natural language stegosystem for Twitter, modifying tweets as they are written to hide 4 bits of payload per tweet, which is a greater payload than previous systems have achieved. The system, CoverTweet, includes novel components, as well as some already developed in the literature. We believe that the task of transforming covers during embedding is equivalent to unilingual machine translation (paraphrasing), and we use this equivalence to define a distortion measure based on statistical machine translation methods. The system incorporates this measure of distortion to rank possible tweet paraphrases, using a hierarchical language model; we use human interaction as a second distortion measure to pick the best. The hierarchical language model is designed to model the specific language of the covers, which in this setting is the language of the Twitter user who is embedding. This is a change from previous work, where general-purpose language models have been used. We evaluate our system by testing the output against human judges, and show that humans are unable to distinguish stego tweets from cover tweets any better than random guessing.
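A sketch of the ranking step only, with `lm_logprob` standing in for a hypothetical user-specific language model score and the paraphrase generator assumed to exist elsewhere:

```python
from typing import Callable, List

def rank_paraphrases(candidates: List[str],
                     lm_logprob: Callable[[str], float],
                     keep: int = 5) -> List[str]:
    # Higher log-probability under the user's own language model is treated as
    # lower distortion; the top few candidates are then offered to the human,
    # who acts as a second distortion measure.
    return sorted(candidates, key=lm_logprob, reverse=True)[:keep]
```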
The model mismatch problem occurs in steganalysis when a binary classifier is trained on objects from one cover source and tested on another: an example of domain adaptation. It is highly realistic because a steganalyst would rarely have access to much or any training data from their opponent, and its consequences can be devastating to classifier accuracy. This paper presents an in-depth study of one particular instance of model mismatch, in a set of images from Flickr using one fixed steganography and steganalysis method, attempting to separate different effects of mismatch in feature space and find methods of mitigation where possible. We also propose new benchmarks for accuracy, which are more appropriate than mean error rates when there are multiple actors and multiple images, and consider the case of 3-valued detectors which also output `don't know'. This pilot study demonstrates that some simple feature-centering and ensemble methods can reduce the mismatch penalty considerably, but not completely remove it.
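A sketch of one of the simple mitigations mentioned above, feature centering; the assumption that per-source means can be estimated (for example from unlabelled traffic) is ours:

```python
import numpy as np

def center_by_source(train_X: np.ndarray, test_X: np.ndarray):
    # Centre each feature matrix on its own source's mean, so that a global
    # shift between the training cover source and the mismatched test source
    # is removed before the classifier is applied.
    return train_X - train_X.mean(axis=0), test_X - test_X.mean(axis=0)
```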
Contemporary steganalysis is driven by new steganographic rich feature sets, which consist of large numbers of weak features. Although extremely powerful when applied to supervised classification problems, they are not compatible with unsupervised universal steganalysis, because the unsupervised method cannot separate the signal (evidence of steganographic embedding) from the noise (cover content). This work tries to alleviate the problem, by means of feature extraction algorithms. We focus on linear projections informed by embedding methods, and propose a new method which we call calibrated least squares with the specific aim of making the projections sensitive to stego content yet insensitive to cover variation. Different projections are evaluated by their application to the anomaly detector from Ref. 1, and we are able to retain both the universality and the robustness of the method, while increasing its performance substantially.
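A heavily simplified stand-in for the stated aim (not the paper's calibrated least squares): a least-squares projection that responds to paired cover/stego feature differences while being penalized for responding to cover variation.

```python
import numpy as np

def sensitive_projection(cover_feats: np.ndarray, stego_feats: np.ndarray, lam: float = 1.0):
    D = stego_feats - cover_feats               # embedding directions (paired images)
    C = cover_feats - cover_feats.mean(axis=0)  # cover-content variation
    # minimize ||Dw - 1||^2 + lam * ||Cw||^2 in closed form
    A = D.T @ D + lam * (C.T @ C)
    w = np.linalg.solve(A, D.T @ np.ones(len(D)))
    return w / np.linalg.norm(w)
```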
This paper introduces a new technique for multi-actor steganalysis. In conventional settings, it is unusual for one actor to generate enough data to be able to train a personalized classifier. On the other hand, in a network there will be many actors, between them generating large amounts of data. Prior work has pooled the training data and then tried to deal with its heterogeneity. In this work, we use multitask learning to account for differences between actors' image sources, while still sharing domain (globally-applicable) information. We tackle the problem by learning separate feature weights for each actor, and sharing information between the actors through the regularization. This way, the domain information that is obtained by considering all actors at the same time is not disregarded, but the weights are nevertheless personalized. This paper explores whether multitask learning improves accuracy of detection, by benchmarking new multitask learners against previous work.
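One common way to write down such a multitask objective, given as a sketch of the idea rather than the paper's exact formulation: each actor a keeps its own weight vector w_a, and the regularizer ties all of them to their mean, which carries the shared domain information.

```latex
% per-actor losses plus a penalty pulling each actor's weights towards the mean
\min_{w_1,\dots,w_A}\;
  \sum_{a=1}^{A}\sum_{i=1}^{n_a} L\!\left(y_{a,i},\, w_a^{\top} x_{a,i}\right)
  \;+\; \lambda \sum_{a=1}^{A} \bigl\lVert w_a - \bar{w} \bigr\rVert_2^2,
\qquad \bar{w} = \frac{1}{A}\sum_{a=1}^{A} w_a .
```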
This work studies the fundamental building blocks for steganography in H.264 compressed video: the embedding operation and the choice of embedding locations. Our aim is to inform the design of better video steganography, a topic on which there has been relatively little publication so far. We determine the best embedding option, from a small menu of embedding operations and locations, as benchmarked by an empirical estimate of Maximum Mean Discrepancy (MMD) for first- and second-order features extracted from a video corpus. A highly-stable estimate of MMD can be formed because of the large sample size. The best embedding operation (so-called F5) is identical to that found by a recent study of still compressed image steganography, but in video the options for embedding location are richer: we show that the least detectable option, of those studied, is to spread payload unequally between the Luma and the two Chroma channels.
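The MMD benchmark can be computed with a standard kernel estimator; a minimal sketch with a Gaussian kernel (feature extraction from the video corpus is assumed to happen elsewhere, and the bandwidth is arbitrary here):

```python
import numpy as np

def mmd2_gaussian(X: np.ndarray, Y: np.ndarray, gamma: float = 1.0) -> float:
    # Biased (V-statistic) estimate of squared MMD between samples X and Y
    # with kernel k(x, y) = exp(-gamma * ||x - y||^2); with very large samples
    # this estimate becomes highly stable.
    def gram(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * sq)
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()
```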
With most image steganalysis traditionally based on supervised machine learning methods, the size of training
data has remained static at up to 20000 training examples. This potentially leads to the classifier being undertrained
for larger feature sets and it may be too narrowly focused on characteristics of a source of cover images,
resulting in degraded performance when the testing source is mismatched or heterogeneous. However, it is
not difficult to obtain larger training sets for steganalysis through simply taking more photos or downloading
additional images.
Here, we investigate possibilities for creating steganalysis classifiers trained on large data sets using large
feature vectors. With up to 1.6 million examples, naturally simpler classification engines must be used and
we examine the hypothesis that simpler classifiers avoid overtraining and so perform better on heterogeneous
data. We highlight the possibilities of online learners, showing that, when given sufficient training data, they
can match or exceed the performance of complex classifiers such as Support Vector Machines. This applies to
both their accuracy and training time. We include some experiments, not previously reported in the literature,
which provide benchmarks of some known feature sets and classifier combinations.
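A minimal sketch of such an online learner, assuming scikit-learn; the streaming generator and data shapes are placeholders rather than the paper's setup:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def batches(n_chunks=50, chunk=10_000, dim=686):
    # Placeholder for chunks of (features, labels) streamed from disk; in
    # practice each chunk would come from a feature extractor run over a
    # slice of the image corpus.
    for _ in range(n_chunks):
        yield rng.normal(size=(chunk, dim)), rng.integers(0, 2, size=chunk)

# Averaged stochastic gradient descent: a single pass over the stream in
# constant memory, with no kernel matrix and no quadratic training cost.
clf = SGDClassifier(loss="hinge", alpha=1e-6, average=True)
for X_chunk, y_chunk in batches():
    clf.partial_fit(X_chunk, y_chunk, classes=[0, 1])
```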
We consider the problem of universal pooled steganalysis, in which we aim to identify a steganographer who
sends many images (some of them innocent) in a network of many other innocent users. The detector must deal
with multiple users and multiple images per user, and particularly the differences between cover sources used by
different users. Despite having been posed five years ago, this problem has previously been addressed only by our 2011
paper.
We extend our prior work in two ways. First, we present experiments in a new, highly realistic, domain: up
to 4000 actors each transmitting up to 200 images, real-world data downloaded from a social networking site.
Second, we replace hierarchical clustering by the method called local outlier factor (LOF), giving greater accuracy
of detection, and allowing a guilty actor sending moderate payloads to be detected, even amongst thousands of
other actors sending hundreds of thousands of images.
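A sketch of the LOF step, assuming scikit-learn and one aggregated feature vector per actor (the aggregation and feature choice are placeholders):

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
actor_features = rng.normal(size=(4000, 274))    # placeholder: one vector per actor

lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(actor_features)         # -1 marks the most anomalous actors
# Continuous ranking: the most suspicious actor has the most negative factor.
suspect = int(np.argmin(lof.negative_outlier_factor_))
```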
Square root laws state that the capacity of an imperfect stegosystem, where the embedding does not preserve
the cover distribution exactly, grows with the square root of cover size. Such laws have been demonstrated
empirically and proved mathematically for a variety of situations, but not for nonstationary covers. Our aim
here is to examine a highly simplified nonstationary source, which can have pathological and unpredictable
behaviour. Intuition suggests that, when the cover source distribution is not perfectly known in advance, it should
be impossible to distinguish covers and stego objects because the detector can never learn enough information
about the varying cover source. However, we show a strange phenomenon, whereby it is possible to distinguish
stego and cover objects as long as the cover source is stationary for two pixels at a time, and then the capacity
follows neither a square root law nor a linear law.
Our work focuses on Feature Restoration (FR), a technique which may be used in conjunction with steganographic
schemes to reduce the likelihood of detection by a steganalyzer. This is done by selectively modifying the stego
image to reduce its distance, under a given distortion metric, to a chosen target feature vector. The technique is independent of the
exact steganographic algorithm used and can be applied with respect to any set of steganalytic features and any
distortion metric. The general FR problem is NP-complete and hence intractable, but randomized algorithms are
able to achieve good approximations. However, the choice of distortion metric is crucial: our results demonstrate
that, for a poorly chosen metric or target, reducing the distortion frequently leads to an increased likelihood of
detection. This has implications for other distortion-reduction schemes.
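A sketch of a randomized approximation of the kind referred to above (not the paper's algorithm): accept a random +/-1 pixel change only when it reduces the chosen distortion to the target feature vector. `extract` and `distance` stand for an arbitrary feature set and metric; a real implementation would also constrain moves so the embedded payload survives.

```python
import numpy as np

def restore_features(stego, target, extract, distance, steps=100_000, seed=0):
    rng = np.random.default_rng(seed)
    img = stego.astype(np.int16).copy()
    best = distance(extract(img), target)
    for _ in range(steps):
        r = rng.integers(img.shape[0])
        c = rng.integers(img.shape[1])
        old = img[r, c]
        img[r, c] = np.clip(old + rng.choice((-1, 1)), 0, 255)
        d = distance(extract(img), target)
        if d < best:
            best = d          # keep the beneficial change
        else:
            img[r, c] = old   # revert
    return img
```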
We propose a new paradigm for blind, universal, steganalysis in the case when multiple actors transmit multiple
objects, with guilty actors including some stego objects in their transmissions. The method is based on clustering
rather than classification, and it is the actors which are clustered rather than their individual transmitted objects.
This removes the need for training a classifier, and the danger of training model mismatch. It effectively judges
the behaviour of actors by assuming that most of them are innocent: after performing agglomerative hierarchical
clustering, the guilty actor(s) are clustered separately from the innocent majority. A case study shows that this
works in the case of JPEG images. Although it is less sensitive than steganalysis based on specifically-trained
classifiers, it requires no training, no knowledge of the embedding algorithm, and attacks the pooled steganalysis
problem.
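A minimal sketch of the clustering step, assuming SciPy and one summary feature vector per actor; the linkage choice and the two-cluster cut are illustrative:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
actor_features = rng.normal(size=(100, 274))   # placeholder: one vector per actor

# Agglomerative hierarchical clustering of actors; cut the dendrogram into two
# clusters and treat the smaller one as the suspected guilty actor(s), on the
# assumption that the innocent majority cluster together.
Z = linkage(actor_features, method="ward")
labels = fcluster(Z, t=2, criterion="maxclust")
sizes = np.bincount(labels)[1:]                # fcluster labels are 1-based
suspects = np.flatnonzero(labels == int(np.argmin(sizes)) + 1)
```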
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly
used in steganalysis. LR offers more information than traditional SVM methods, estimating class
probabilities as well as providing a simple classification, and can be adapted more easily and efficiently for
multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification
accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and
LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the
state-of-the-art 686-dimensional SPAM feature set, in three image sets.
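A minimal sketch of the comparison, assuming scikit-learn and placeholder SPAM-like features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 686))      # placeholder 686-dimensional features
y = rng.integers(0, 2, size=1000)     # 0 = cover, 1 = stego

# Logistic regression returns class probabilities directly...
lr = LogisticRegression(max_iter=5000).fit(X, y)
p_stego = lr.predict_proba(X)[:, 1]

# ...whereas an SVM needs an extra calibration step (probability=True fits a
# Platt-scaling model on top of the margin) to produce comparable output.
svm = SVC(kernel="rbf", probability=True).fit(X, y)
p_stego_svm = svm.predict_proba(X)[:, 1]
```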
Quantitative steganalyzers are important in forensic steganalysis
as they can estimate the payload, or, more precisely, the number of
embedding changes in the stego image. This paper proposes a general
method for constructing quantitative steganalyzers from features used
in blind detectors. The method is based on support vector regression,
which is used to learn the mapping between a feature vector extracted
from the image and the relative embedding change rate. The performance is evaluated by constructing quantitative steganalyzers for eight steganographic methods for JPEG files, using a 275-dimensional feature set. Error distributions of within- and between-image errors are empirically estimated for Jsteg and nsF5. For Jsteg, the accuracy is compared to state-of-the-art quantitative steganalyzers.
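The regression step can be sketched with a standard SVR, here with placeholder features, placeholder change rates, and arbitrary hyperparameters:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 275))                 # placeholder 275-dim features
change_rate = rng.uniform(0.0, 0.5, size=2000)   # relative embedding change rate

# Support vector regression learns the mapping from an image's feature vector
# to the relative number of embedding changes it carries.
quantitative = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
quantitative.fit(X, change_rate)
estimates = quantitative.predict(X[:5])
```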
WAM steganalysis is a feature-based classifier for detecting LSB matching steganography, presented in 2006 by
Goljan et al. and demonstrated to be sensitive even to small payloads. This paper makes three contributions
to the development of the WAM method. First, we benchmark some variants of WAM in a number of sets of
cover images, and we are able to quantify the significance of differences in results between different machine
learning algorithms based on WAM features. It turns out that, like many of its competitors, WAM is not
effective in certain types of cover, and furthermore it is hard to predict which types of cover are suitable for
WAM steganalysis. Second, we demonstrate that only a few of the features used in WAM steganalysis do almost
all of the work, so that a simplified WAM steganalyser can be constructed in exchange for a little less detection
power. Finally, we demonstrate how the WAM method can be extended to provide forensic tools to identify the
location (and potentially content) of LSB matching payload, given a number of stego images with payload placed
in the same locations. Although easily evaded, this is a plausible situation if the same stego key is mistakenly
re-used for embedding in multiple images.
It is a well-established result that steganographic capacity of perfectly secure stegosystems grows linearly with
the number of cover elements; that is, secure steganography has a positive rate. In practice, however, neither the
Warden nor the Steganographer has perfect knowledge of the cover source and thus it is unlikely that perfectly
secure stegosystems for complex covers, such as digital media, will ever be constructed. This justifies study of
secure capacity of imperfect stegosystems. Recent theoretical results from batch steganography, supported by
experiments with blind steganalyzers, point to an emerging paradigm: whether steganography is performed in a
large batch of cover objects or a single large object, there is a wide range of practical situations in which secure
capacity rate is vanishing. In particular, the absolute size of secure payload appears to only grow with the square
root of the cover size. In this paper, we study the square root law of steganographic capacity and give a formal
proof of this law for imperfect stegosystems, assuming that the cover source is a stationary Markov chain and
the embedding changes are mutually independent.
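The law's three regimes can be summarized as follows (a sketch of the statement, with n the cover size, m_n the payload, and constants depending on the cover source and embedding operation):

```latex
% square root law: behaviour of payloads m_n relative to sqrt(n)
\lim_{n\to\infty} \frac{m_n}{\sqrt{n}} =
\begin{cases}
\infty           & \Rightarrow \text{asymptotically detectable (optimal detector error} \to 0\text{)},\\[2pt]
0                & \Rightarrow \text{asymptotically undetectable},\\[2pt]
r \in (0,\infty) & \Rightarrow \text{bounded, nonzero statistical evidence.}
\end{cases}
```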
Recent results on the information theory of steganography suggest, and under some conditions prove, that the detectability of payload is proportional to the square of the number of changes caused by the embedding. Assuming that result in general, this paper examines the implications for an embedder when a payload is to be spread amongst multiple cover objects. A number of variants are considered: embedding with and without adaptive source coding, in uniform and nonuniform covers, and embedding in both a fixed number of covers (so-called batch steganography) as well as establishing a covert channel in an infinite stream (sequential steganography, studied here for the first time). The results show that steganographic capacity is sublinear, and strictly asymptotically greater in the case of a fixed batch than an infinite stream. In the former it is possible to describe optimal embedding strategies; in the latter the situation is much more complex, with a continuum of strategies which approach the unachievable asymptotic optimum.
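A one-line consequence of the quadratic assumption (our sketch, not the paper's full analysis): if an object carrying c embedding changes contributes statistical evidence proportional to c squared, then spreading a total of C changes evenly over k covers, and holding the aggregate evidence below a fixed budget epsilon, gives

```latex
% aggregate evidence from k covers, each carrying C/k changes
k \left(\frac{C}{k}\right)^{2} \;=\; \frac{C^{2}}{k} \;\le\; \epsilon
\quad\Longrightarrow\quad
C \;\le\; \sqrt{\epsilon\, k}.
```

That is, the total number of changes that can be made at a fixed risk level grows only like the square root of the number of covers, consistent with the sublinear capacity stated above.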
This paper revisits the steganalysis method involving a Weighted Stego-Image (WS) for estimating LSB replacement
payload sizes in digital images. It suggests new WS estimators, upgrading the method's three components:
cover pixel prediction, least-squares weighting, and bias correction. Wide-ranging experimental results (over two
million total attacks) based on images from multiple sources and pre-processing histories show that the new
methods produce greatly improved accuracy, to the extent that they outperform even the best of the structural
detectors, while avoiding their high complexity. Furthermore, specialised WS estimators can be derived
for detection of sequentially-placed payload: they offer levels of accuracy orders of magnitude better than their
competitors.
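A sketch of the basic WS estimator that the paper builds on, with a simple four-neighbour cover predictor and inverse-variance weights; the improved predictors, weighting, and bias correction proposed in the paper are not reproduced here:

```python
import numpy as np

def ws_estimate(stego: np.ndarray, var_floor: float = 5.0) -> float:
    s = stego.astype(np.float64)
    # predict each interior cover pixel as the mean of its four neighbours
    pred = (s[:-2, 1:-1] + s[2:, 1:-1] + s[1:-1, :-2] + s[1:-1, 2:]) / 4.0
    x = s[1:-1, 1:-1]
    flipped = x + 1.0 - 2.0 * (x % 2)          # the pixel with its LSB flipped
    # local variance of the neighbourhood drives the least-squares weights
    sq = (s[:-2, 1:-1]**2 + s[2:, 1:-1]**2 + s[1:-1, :-2]**2 + s[1:-1, 2:]**2) / 4.0
    w = 1.0 / (var_floor + np.maximum(sq - pred**2, 0.0))
    w /= w.sum()
    # estimated payload, as a fraction of capacity
    return 2.0 * float(np.sum(w * (x - pred) * (x - flipped)))
```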
In Batch Steganography we assume that a Steganographer has to choose how to allocate a fixed amount of data
between a large number of covers. Given the existence of a steganalysis method for individual objects (satisfying
certain assumptions) we assume that a Warden attempts to detect the payload by pooling the evidence from all
the objects. This paper works out the details of a particular method for the Warden, which counts the number
of objects whose detection statistic surpasses a certain threshold. This natural pooling method leads to a
game between the Warden and Steganographer, and there are different varieties depending on whether the moves
are sequential or simultaneous. The solutions are intriguing, suggesting that the Steganographer should always
concentrate the payload in as few covers as possible, or exactly the reverse, but never adopt an intermediate
strategy. Furthermore, the Warden's optimal strategies are instructive for the benchmarking of quantitative
steganalysis methods. Experimental results show that some steganography and steganalysis methods' empirical
performance accords with this theory.
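A sketch of the Warden's counting strategy described above; the per-cover false-positive rate of the single-object detector is assumed known (or estimated from innocent traffic):

```python
import numpy as np
from scipy.stats import binom

def pooled_count_detector(statistics, threshold, fp_rate, alpha=0.01):
    # Count how many of the actor's per-object statistics exceed the threshold
    # and flag the actor if that count is implausibly large under the
    # hypothesis that every transmitted object is an innocent cover.
    stats = np.asarray(statistics)
    exceed = int(np.sum(stats > threshold))
    p_value = binom.sf(exceed - 1, stats.size, fp_rate)   # P(count >= exceed)
    return p_value < alpha, exceed, p_value
```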
Quantitative steganalysis aims to estimate the amount of payload in a stego object, and such estimators seem
to arise naturally in steganalysis of Least Significant Bit (LSB) replacement in digital images. However, as with
all steganalysis, the estimators are subject to errors, and their magnitude seems heavily dependent on properties
of the cover. In very recent work we have given the first derivation of estimation error, for a certain method of
steganalysis (the Least-Squares variant of Sample Pairs Analysis) of LSB replacement steganography in digital
images. In this paper we make use of our theoretical results to find an improved estimator and detector. We also
extend the theoretical analysis to another (more accurate) steganalysis estimator (Triples Analysis) and hence
derive an improved version of that estimator too. Experimental results show that the new steganalyzers have
improved accuracy, particularly in the difficult case of never-compressed covers.
We extend our previous work on structural steganalysis of LSB replacement in digital images, building detectors which analyse the effect of LSB operations on pixel groups as large as four. Some of the method previously applied to triplets of pixels carries over straightforwardly. However we discover new complexities in the specification of a cover image model, a key component of the detector. There are many reasonable symmetry assumptions which we can make about parity and structure in natural images, only some of which provide detection of steganography, and the challenge is to identify the symmetries a) completely, and b) concisely. We give a list of possible symmetries and then reduce them to a complete, non-redundant, and approximately independent set. Some experimental results suggest that all useful symmetries are thus described. A weighting is proposed and its approximate variance stabilisation verified empirically. Finally, we apply symmetries to create a novel quadruples detector for LSB replacement steganography. Experimental results show some improvement, in most cases, over other detectors. However the gain in performance is moderate compared with the increased complexity in the detection algorithm, and we suggest that, without new insight, further extension of structural steganalysis may provide diminishing returns.
Quantitative steganalysis refers to the exercise not only of detecting the presence of hidden stego messages in carrier objects, but also of estimating the secret message length. This problem is well studied, with many detectors proposed but only a sparse analysis of errors in the estimators. A deep understanding of the error model, however, is a fundamental requirement for the assessment and comparison of different detection methods. This paper presents a rationale for a two-factor model for sources of error in quantitative steganalysis, and shows evidence from a dedicated large-scale nested experimental set-up with a total of more than 200 million attacks. Apart from general findings about the distribution functions found in both classes of errors, their respective weight is determined, and implications for statistical hypothesis tests in benchmarking scenarios or regression analyses are demonstrated. The results are based on a rigorous comparison of five different detection methods under many different external conditions, such as size of the carrier, previous JPEG compression, and colour channel selection. We include analyses demonstrating the effects of local variance and cover saturation on the different sources of error, as well as presenting the case for a relative bias model for between-image error.
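The two-factor decomposition can be written compactly (a sketch of the model's form, not its fitted parameters): the error of the estimate for attack j on image i splits into a between-image bias and a within-image error.

```latex
% two-factor error model for a quantitative estimate \hat{p}_{ij} of payload p
\hat{p}_{ij} - p
  \;=\; \underbrace{b_i}_{\text{between-image}}
  \;+\; \underbrace{\varepsilon_{ij}}_{\text{within-image}} .
```

On one reading of the relative-bias model mentioned above, the between-image component b_i would scale with the true payload rather than being additive, but that interpretation is ours.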
We consider the problem of detecting the presence of hidden data in colour bitmap images. Like straightforward LSB Replacement, LSB Matching (which randomly increments or decrements cover pixels to embed the hidden data in the least significant bits) is attractive because it is extremely simple to implement. It has proved much harder to detect than LSB Replacement because it does not introduce the same asymmetries into the stego image. We expand our recently-developed techniques for the detection of LSB Matching in grayscale images into the full-colour case. Not everything goes through smoothly but the end result is much improved detection, especially for cover images which have been stored as JPEG files, even if subsequently resampled. Evaluation of steganalysis statistics is performed using a distributed steganalysis project. Because evaluation of reliability of detectors for LSB Matching is limited, we begin with a review of the previously-known detectors.
We give initial results from a new project which performs statistically accurate evaluation of the reliability
of image steganalysis algorithms. The focus here is on the Pairs and RS methods, for detection of
simple LSB steganography in grayscale bitmaps, due to Fridrich et al. Using libraries totalling around
30,000 images we have measured the performance of these methods and suggest changes which lead to significant
improvements.
Particular results from the project presented here include notes on the distribution of the RS statistic,
the relative merits of different "masks" used in the RS algorithm, the effect on reliability when previously
compressed cover images are used, and the effect of repeating steganalysis on the transposed image. We also discuss
improvements to the Pairs algorithm, restricting it to spatially close pairs of pixels, which leads to a
substantial performance improvement, even to the extent of surpassing the RS statistic which was previously
thought superior for grayscale images.
We also describe some of the questions for a general methodology of evaluation of steganalysis, and potential
pitfalls caused by the differences between uncompressed, compressed, and resampled cover images.