Sensors 24 01446
Article
Entropy Sharing in Ransomware: Bypassing Entropy-Based
Detection of Cryptographic Operations
Jiseok Bang 1 , Jeong Nyeo Kim 2, * and Seungkwang Lee 1, *
Abstract: This study presents a groundbreaking approach to the ever-evolving challenge of
ransomware detection. Many detection methods rely predominantly on pinpointing high-entropy
blocks, a hallmark of the encryption techniques commonly employed in ransomware. These
blocks, typically difficult to recover, serve as key indicators of malicious activity. Many
neutralization techniques have been introduced so that ransomware using standard encryption can
bypass these entropy-based detection systems. However, they have limited capabilities or
incur relatively high computational costs. To address these problems, we introduce a new concept,
entropy sharing. This method can be integrated with any type of cryptographic algorithm
and consists of lightweight operations that make the high-entropy blocks undetectable. In
addition, unlike simple encoding methods, the proposed method cannot be easily nullified
without knowing the order of shares. Our findings demonstrate that entropy sharing can effectively
bypass entropy-based detection systems. Ransomware using such attack methods can cause
significant damage, as it is difficult to detect through conventional detection methods.
collect information about victims, and store critical data related to the ransomware attack.
However, some ransomware attacks do not rely on C&C infrastructure and instead limit
themselves to host detection capabilities. (3) The encryption phase consists of several stages,
including key generation, searching for target files with specific extensions, encryption, and
potential deletion or overwriting of backups. Ransomware employs different encryption
methods, such as symmetric and asymmetric ciphers. Symmetric encryption is favored
for its speed in encrypting large data volumes, while asymmetric encryption protects
the symmetric key. Ransomware employs tactics like overwriting or renaming original
files, saving encrypted files in new locations, or temporarily moving and restoring files
during encryption. (4) Once the files are fully or partially encrypted, ransomware enters
the extortion phase. During this phase, ransomware creates a ransom note, typically in
the form of a text or HTML file, providing instructions to the victim on how to retrieve
their data.
1.3. Contributions
The next generation of ransomware could potentially evade current detection methods
by using encryption techniques that produce moderate-level entropy. However, selecting an
algorithm that reduces the ciphertext’s entropy might not align with ransomware business
objectives, as it could increase the chances of successful decryption of the plaintext. In this
paper, we introduce an efficient and effective method for bypassing ransomware detection.
Sensors 2024, 24, 1446 3 of 31
Our approach presents a new threat model for ransomware, which leverages standard
encryption algorithms. This model is designed to maintain the balance between evading
detection and preserving the robustness of the encryption, thereby adhering to the core
goals of ransomware operations. The main contributions of this paper are as follows:
• We propose an entropy reduction technique, aptly named entropy sharing, that can
be applied to the output of both symmetric and asymmetric encryption algorithms
commonly utilized in ransomware. Before introducing entropy sharing, we also outline
a basic concept of simple bit decomposition aimed at achieving minimal entropy levels.
• Through the frequency test defined in the NIST randomness test suite, we demonstrate that
the proposed method can effectively bypass existing entropy-based detection techniques.
• We present a decoding approach named entropy recomposition, which is designed
to be applied to the output of entropy sharing. This process is followed by decryp-
tion, facilitating the restoration of the victim’s files. Unlike other encoding methods,
a distinctive feature is that decoding is impossible if the order of entropy shares
is unknown.
• We evaluate the overhead of the proposed method when combined with encryption
algorithms and assess their impact on the total computation time. The results show
that there is minimal change in the efficiency of ransomware attacks, allowing for the
rapid corruption of a large number of files.
The rest of this paper is structured as follows: Section 2 offers a comprehensive
overview of current ransomware detection and neutralization strategies. Section 3 details
our innovative method designed to obscure cryptographic operations in ransomware. This
involves a novel encoding technique that transforms high-entropy blocks into blocks with
lower or medium entropy. In Section 4, we present our experimental findings. These
experiments demonstrate the effectiveness of entropy sharing in ransomware encryption and
evaluate the additional overhead incurred. Our primary approach for assessing entropy
randomness involves the use of NIST frequency tests, which are specifically applied to
the data written on the file system. Section 5 focuses on analyzing the results of entropy
recomposition at different ratios, which are aimed at countering ransomware that employs
entropy sharing. This section also explores the entropy characteristics of write blocks
following simple bit decomposition and evaluates the accuracy in distinguishing between
encrypted and non-encrypted files using Shannon entropy values. The paper concludes
with Section 6, summarizing our findings and contributions.
the ransomware process upon its interaction with these files. Mehnaz and Mudgerikar [34]
also used the decoy approach for early ransomware detection and prevention. Moreover,
relying solely on decoy-based detection does not ensure that ransomware will target the
decoy files first, thereby placing the victim’s data at considerable risk [35,36].
Entropy has been widely used as a metric in data-centric approaches since it tends to
rise when a file is encrypted. Numerous studies have utilized entropy calculations, such as
Shannon entropy, which quantifies data uncertainty, to identify ransomware threats [35].
For example, Nolen Scaife’s team [37] used Shannon entropy to examine modifications
in files when accessed. The Shannon entropy of a byte array can be computed using
the formula:
\[ e = \sum_{i=0}^{255} P_i \log_2 \frac{1}{P_i} \]
Here, $P_i$ is the relative frequency of byte value $i$ in the array, given by $F_i/n$,
where $n$ is the total number of bytes to be analyzed and $F_i$ is the number of occurrences of $i$, such that
$n = \sum_{i=0}^{255} F_i$. The computed result ranges from 0 to 8, with 8 denoting a perfectly balanced
distribution of byte values in the array. Due to the uniform probability distribution in
encrypted files, they often approach the maximum entropy value of 8. The method uses
statistical analysis to identify changes in a user’s file structure before and after access and
also employs a similarity metric based on the concept that successful encryption results in
a distinctly different version of the file.
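The entropy formula above can be illustrated with a minimal Python sketch. The function name `shannon_entropy` is our own choice for illustration and is not taken from any of the cited tools; the term $P_i \log_2 (1/P_i)$ is computed as $(F_i/n)\log_2(n/F_i)$, and byte values that never occur contribute nothing to the sum.

```python
import math
from collections import Counter


def shannon_entropy(buf: bytes) -> float:
    """Shannon entropy of a byte array, in bits per byte (0.0 to 8.0)."""
    n = len(buf)
    if n == 0:
        return 0.0  # convention chosen here for an empty buffer
    counts = Counter(buf)  # F_i for every byte value i that occurs
    # P_i * log2(1 / P_i) with P_i = F_i / n
    return sum((f / n) * math.log2(n / f) for f in counts.values())


if __name__ == "__main__":
    print(shannon_entropy(bytes(range(256))))  # perfectly balanced byte values
    print(shannon_entropy(b"\x00" * 100))      # a single repeated byte value
```

A buffer in which all 256 byte values occur equally often reaches the maximum of 8, whereas a buffer holding one repeated value yields 0.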
Kharraz et al. proposed a comparable detection method, called UNVEIL [6], where
they examined the dynamic I/O buffer content and measured the difference in Shannon
entropy between read and write operations. In addition to analyzing the generic I/O
access patterns of ransomware, they identified two indicators of ransomware detection: a
significant increase in entropy between read and write data buffers at a specific file offset
or the creation of new high-entropy files. This observation is crucial because, even when
ransomware overwrites original files with low entropy blocks to securely delete them, it
must generate an encrypted version of the original files. This process inevitably leads to
the generation of high-entropy data during ransomware attacks.
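The first indicator, an entropy increase between the read and write buffers at the same file offset, can be sketched as follows. This is only an illustration under our own assumptions and not the UNVEIL implementation: the function names are hypothetical, and the threshold `min_increase` is a caller-supplied parameter because the source does not state a concrete value.

```python
import math
from collections import Counter


def shannon_entropy(buf: bytes) -> float:
    """Shannon entropy of a byte array, in bits per byte."""
    n = len(buf)
    if n == 0:
        return 0.0
    return sum((f / n) * math.log2(n / f) for f in Counter(buf).values())


def entropy_increase(read_buf: bytes, write_buf: bytes) -> float:
    """Entropy of the written data minus entropy of the data read earlier."""
    return shannon_entropy(write_buf) - shannon_entropy(read_buf)


def is_suspicious_write(read_buf: bytes, write_buf: bytes,
                        min_increase: float) -> bool:
    """Flag a write whose entropy rises by at least min_increase bits per byte.

    min_increase is a hypothetical tuning parameter, not a value from UNVEIL.
    """
    return entropy_increase(read_buf, write_buf) >= min_increase
```

A detector built this way depends entirely on the written buffer looking more random than the buffer that was read, which is the property that entropy sharing is designed to remove.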
Similarly, REDEMPTION [38], like UNVEIL, calculates the Shannon entropy of the
data buffers associated with each read and write request to a file. Comparing the
entropy values of read and write requests at the same file offset provides a powerful
indicator of ransomware activity. REDEMPTION calculates a malicious score for each
process that requests privileged operations, using factors such as the ratio of modified
blocks in a file and an increase in entropy as signals of ransomware activity.
Therefore, an increase in entropy can be considered an important metric for ransomware
detection. However, we note that relying solely on the calculated Shannon entropy
value to distinguish between encrypted and non-encrypted files is difficult and
generates many false positives and false negatives [39]. In Section 5, we present our
experimental results on this issue.
The NIST randomness test suite can also be used for a similar purpose to identify
suspicious cryptographic operations that result in the writing of high-entropy blocks in the
file system, as in the case of Rcryptect [1]. This test suite includes various tests developed
to assess the randomness of binary sequences. The entropy of binary sequences is tested
based on the assumptions of uniformity and scalability. The test suite compares the test
statistic value computed on the target binary sequence to a critical value determined from
a reference distribution of the statistic under the assumption of randomness. If the test
statistic value surpasses the critical value, the null hypothesis (H0) that the sequence is
random is rejected. Otherwise, H0 is accepted. For instance, the frequency test provides
the most basic evidence of non-randomness and is used to assess entropy levels in this
case. Algorithm 1 explains the frequency test taking a byte sequence buf with size bytes;
Table 1 summarizes the notation used in the algorithm. Contrary to Shannon entropy,
the frequency test can distinguish between non-encrypted and encrypted blocks with an
overwhelming probability [1].
| Notation | Description |
|---|---|
| α | Significance level set to 0.01, indicating a 1% probability threshold for the test |
| buf | Input buffer containing the binary sequence to test |
| size | The number of elements in buf |
| mask | Binary mask initialized to 0x80 to isolate bits in a byte |
| num_0s | Counter for the number of 0s in the sequence |
| num_1s | Counter for the number of 1s in the sequence |
| S_obs | Computed statistic for the observed discrepancy between the number of 0s and 1s |
| p-value | The probability that the observed balance of 0s and 1s could occur by chance |
| γ | Frequency test result; 0 for imbalance, 1 for balance |
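The frequency test can be sketched in Python using the notation above. This follows the standard frequency (monobit) test of the NIST suite, with S_obs = |num_1s − num_0s| / √n over the n bits of the buffer and p-value = erfc(S_obs / √2); it is a sketch rather than a transcription of Algorithm 1. In particular, the function name `frequency_test` is ours, and bits are counted per byte with a population count instead of shifting a 0x80 mask.

```python
import math

ALPHA = 0.01  # significance level α from the notation table


def frequency_test(buf: bytes, alpha: float = ALPHA) -> tuple[float, int]:
    """Return (p_value, gamma) for the frequency (monobit) test on buf.

    gamma is 1 when the 0s and 1s are balanced (the sequence passes as
    random) and 0 when they are imbalanced.
    """
    if not buf:
        raise ValueError("buf must contain at least one byte")
    n_bits = 8 * len(buf)
    num_1s = sum(bin(b).count("1") for b in buf)  # set bits in every byte
    num_0s = n_bits - num_1s
    s_obs = abs(num_1s - num_0s) / math.sqrt(n_bits)
    p_value = math.erfc(s_obs / math.sqrt(2))
    gamma = 1 if p_value >= alpha else 0
    return p_value, gamma
```

For example, a buffer of 0xAA bytes has exactly as many 1s as 0s and passes with a p-value of 1, which also shows that the test measures only bit balance and not any other structure in the data.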
As an alternative, format-preserving encryption (FPE) can be utilized. FPE is an encryption method that maintains the
same format for plaintext and ciphertext, thus keeping the entropy after encryption similar
to that of the plaintext. In [40], the FF1 algorithm was used to circumvent entropy-based
ransomware detection using this characteristic of FPE. However, this can reduce the speed of
ransomware’s encryption attack due to its high computational complexity. More specifically,
FF1 and FPEs based on prefix ciphers, cycle-walking ciphers, and generalized-Feistel
ciphers all involve repeated execution of block ciphers like AES in their internal operations to
preserve the format of the plaintext. Therefore, compared to the encoding-based neutralization
methods, FPE significantly reduces the efficiency of ransomware attacks.
Another neutralization method is the use of intermittent encryption. This method
encrypts only parts of a file, reducing the increase in entropy of the file after encryption.
However, this creates a trade-off: the more the encrypted area is reduced to improve
attack efficiency, the greater the possibility of file restoration becomes. In other words,
when a large portion of a file is left unencrypted, for some file formats data can be
extracted and partially recovered from the non-encrypted parts [41]. On the other hand,
as the proportion of the encrypted area within a file increases, the overall entropy also
rises, thereby heightening the likelihood of detection.
\[ C_i \mapsto b_0 \, b_1 \, b_2 \, b_3 \, b_4 \, b_5 \, b_6 \, b_7, \]
Here, the crucial observation is that, as shown in Section 4, this simple encoding
can be easily defeated by detecting frequent appearances of lowest-entropy blocks in
the file systems. In the following, our proposed encoding method solves this problem
by converting high-entropy blocks of cryptographic outcomes into mid-level blocks of
benign files.
Figure 1. Overview of ransomware attack (a) and restore (b) using entropy sharing and recomposition,
respectively. (a) Entropy sharing following encryption; (b) entropy recomposition followed
by decryption.
| Notation | Description |
|---|---|
| P | Plaintext |
| C | Ciphertext |
| K | Secret key |
| E(K, P) | Encryption taking K and P |
| D(K, C) | Decryption taking K and C |
| B_f | Benign file, where f ∈ F |
| G | Byte-sequence generator of benign-level entropy taking a subbyte of C and B_f |
| m | Order of entropy shares |
| V | Victim’s original file |
| V | Victim’s encrypted file |
For an n-byte ciphertext C computed by E(K, P), entropy sharing takes each subbyte
$C_i$, $i \in \{1, \dots, n\}$, and divides it into m + 1 shares. Then, we have an n × m byte stream consisting of
n sequences of $S_i[1] \| S_i[2] \| \dots \| S_i[m]$, which exhibits a non-random distribution, thereby
providing a benign level of entropy. To achieve this purpose, let us assume that there exists
a generator $G(C_i, B_f)$, which splits $C_i$ into m + 1 shares $(S_i[1], S_i[2], \dots, S_i[m], a_i)$ such that
\[ C_i = \bigoplus_{j=1}^{m} S_i[j] \oplus a_i, \]
where $B_f$ is a reference file packaged within the ransomware. In other words, G extracts
a byte stream with a benign level of entropy from $B_f$ and splits $C_i$ into the m + 1 shares.
Finally, entropy sharing replaces $C_i$ with the output of G and writes it to the victim’s file.
Figure 2 describes this overall process of entropy sharing.
Ransomware today often encrypts only a specific set of file types that are commonly
used and vital in both personal and business settings. Attackers use encryption to take
crucial data hostage, demanding ransom from victims for decryption keys. By focusing on
these particular file formats, ransomware aims to impact many users, thereby increasing the
probability of receiving ransom payments. Let F represent this set of file types, including
{.jpg, .pdf, .pptx, .docx, .mp3, .mp4, .txt, .zip, etc.}. When a file of type f ∈ F is targeted, G
uses a reference file B f to produce an entropy level similar to what is typically seen in files
of type f .
In simpler terms, G sequentially reads m bytes from B f for each Ci . Considering that
the size of ransomware-targeted files might be larger than that of the benign files B f , these
are handled as if they were in a circular queue-like structure. Since ai is not predominant
in terms of entropy within the entire m + 1 bytes, the resulting encoded output exhibits
the entropy levels of benign files. This similarity poses a significant challenge for current
detection methods to distinguish between files held hostage by ransomware and original
files (detailed discussion in Section 4).
Next, the n-byte ciphertext C can be obtained by repeating this n times. Lastly, D(K, C)
gives us P.
The proposed scheme adds only simple XOR operations to the existing standard
cryptographic functions and thus has little impact on the computational cost of ransomware
operation. In the following section, we provide a more detailed explanation based on
various experiments.
4. Evaluation
In this section, we investigate the impact of entropy sharing on encrypted samples
by using the AES-128 algorithm. We omit experiments involving other cryptographic
algorithms for entropy sharing and recomposition, as various standard ciphers, including
asymmetric key algorithms, used in ransomware, are known to produce similar entropy
patterns in their blocks [1]. Our analysis focuses on assessing the pass rate and p-values
of the frequency test for the original files, the resulting ciphertexts, and their encoded
outputs obtained through entropy sharing. Furthermore, we provide an evaluation of the
computational costs involved. The impact of simple bit decomposition is analyzed in
Section 5.2. In summary, simple bit decomposition yields negligible p-values because
each byte is encoded as eight byte values whose combined Hamming weight (HW) is at most 8.
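The negligible p-values can be checked with a short worked bound. We assume here that simple bit decomposition stores each bit of a ciphertext byte in its own output byte, so that every 64 output bits contain at most 8 ones, and we use the standard NIST frequency test statistic. With $n$ the number of tested bits, $n_1$ the number of ones, and $n_0$ the number of zeros:

\[
n_1 \le \frac{n}{8}, \qquad n_0 = n - n_1 \ge \frac{7n}{8}
\quad\Longrightarrow\quad
S_{obs} = \frac{|n_1 - n_0|}{\sqrt{n}} \ge \frac{3}{4}\sqrt{n}.
\]

For a block of $n = 4096$ bits (512 bytes), this gives $S_{obs} \ge 48$, so that

\[
p\text{-value} = \operatorname{erfc}\!\left(\frac{S_{obs}}{\sqrt{2}}\right) \le \operatorname{erfc}\!\left(\frac{48}{\sqrt{2}}\right) \approx \operatorname{erfc}(33.9),
\]

which is many orders of magnitude below the significance level α = 0.01. Such blocks therefore fail the frequency test in the opposite direction from ciphertext: they are far too unbalanced rather than too balanced.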
| Key # | K |
|---|---|
| 1 | 0x67C6697351FF4AEC29CDBAABF2FBE346 |
| 2 | 0x7CC254F81BE8E78D765A2E63339FC99A |
| 3 | 0x66320DB73158A35A255D051758E95ED4 |
| 4 | 0xABB2CDC69BB454110E827441213DDC87 |
| 5 | 0x70E93EA141E1FC673E017E97EADC6B96 |
| 6 | 0x8F385C2AECB03BFB32AF3C54EC18DB5C |
| 7 | 0x021AFE43FBFAAA3AFB29D1E6053C7C94 |
| 8 | 0x75D8BE6189F95CBBA8990F95B1EBF1B3 |
The original files within set F are determined to exhibit non-random results in the
frequency test. However, when encrypted with the AES-128 algorithm, they
transform into random binary sequences regardless of the file type. Entropy
sharing on the encrypted samples, in turn, yields a significant reduction in entropy. Figures 4 and 5
depict the pass rates and average p-values of the frequency test, respectively, for each type of
original sample file, the files encrypted under K#1, and the encoded outputs across different orders.
A comprehensive collection of pass rates, obtained by applying
eight distinct secret keys, is provided in Appendix A. Notably, as the order m of entropy
shares increases, the entropy diminishes visibly. This observation underscores that entropy
sharing can effectively circumvent existing entropy-based detection of cryptographic
operations in ransomware. Figure 6 presents the average pass rates on the outcomes of
applying entropy sharing to encrypted files when the different secret keys shown in Table 3
are used with AES. Figure 6a illustrates this in a three-dimensional representation, while
Figure 6b projects the results onto a two-dimensional plane by overlaying the eight graphs.
Notably, changing the secret key causes only a slight variation in the pass rates.
Figure 4. Pass rates of the frequency test on 100 binary sequences across four sample types using K#1.
ORG: original sample files. Here, m = 0 represents encrypted files. (a) mp3; (b) jpg; (c) pdf; (d) zip.
Figure 5. Average p-values for the frequency test on 100 binary sequences across four sample types
using K#1. (a) mp3; (b) jpg; (c) pdf; (d) zip.
Figure 6. Average pass rates and p-values for frequency test on 100 binary sequences for each JPG
sample file using each of 8 different keys. (a) Overlapping average pass rates; (b) average pass rates;
(c) overlapping average p-values; (d) average p-values.
5. Discussion
In this section, we explore the effectiveness and limitations of entropy recomposition as
a countermeasure against entropy sharing in ransomware. We note the pitfalls of simple
bit decomposition in reducing file entropy. Additionally, we discuss the difficulty in
distinguishing encrypted from non-encrypted files using Shannon entropy. Finally, we
consider the memory requirements and detection issues related to ransomware attacks that
utilize entropy sharing.
Figure 7. Illustration of potential detection scenario for ransomware cryptographic operations with
entropy sharing.
However, there are two key considerations to address in this approach. First, because the
order of entropy shares selected by the ransomware is unknown and difficult to deduce
accurately, recomposition must use an arbitrary compression ratio r:1. While
a ratio of r = m + 1 has a high likelihood of detecting cryptographic operations by decoding
input blocks back to their original ciphertext, scenarios where r ≠ m + 1 require
empirical investigation to understand their implications. Second, although encoded blocks
from entropy sharing tend to exhibit higher entropy when compressed correctly, the effect
of arbitrary compression ratios r:1 on the entropy of non-encrypted files also becomes crucial.
If such compression generates blocks with increased entropy, it can raise the number of
false positives, thereby affecting the overall accuracy of the detection system.
To address these concerns, a series of experiments were conducted. For each original
file among the set of 100 sample jpg files, we performed the following procedures for
various r ∈ {2, 4, 6, 8, 10}:
• XOR compression was applied with an r:1 ratio.
• AES encryption using K#1 was performed, followed by r:1 XOR compression.
• Entropy sharing was applied with m = 3 after the AES encryption using K#1, followed
by r:1 XOR compression.
The average of the pass rates for the frequency test was computed for each case.
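The r:1 XOR compression used in these procedures can be sketched as follows. This is our reading rather than the authors' code: we assume that r:1 compression XORs each group of r consecutive bytes into one output byte, which matches the recomposition relation when r = m + 1. The function name `xor_compress` is hypothetical, and dropping an incomplete trailing group is a choice made here for simplicity.

```python
from functools import reduce
from operator import xor


def xor_compress(buf: bytes, r: int) -> bytes:
    """XOR every r consecutive bytes of buf into a single output byte.

    Assumption: a trailing group shorter than r bytes is discarded.
    """
    if r < 1:
        raise ValueError("r must be a positive integer")
    usable = len(buf) - (len(buf) % r)  # length covered by complete groups
    return bytes(reduce(xor, buf[i:i + r]) for i in range(0, usable, r))
```

The compressed output can then be passed to the frequency test. A detector has to guess r, and the experiments below examine what happens to non-encrypted files under each guess.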
The experimental results, depicted in Figure 8, yield several key insights: Encrypted
files (where m = 0) display randomness independent of the compression ratio r. For
encoded outcomes (where m = 3), randomness is observed when r ≥ m + 1. Non-encrypted
original files (ORG) start exhibiting a significantly higher level of randomness beginning
at a compression ratio r = 4. Notably, even at r = 2, some blocks are already identified
as random. This suggests that during recomposition at a given ratio r, if r = m + 1, the
encoded blocks are accurately decoded back to blocks of the original ciphertext. However,
as r increases, the non-encrypted blocks exhibit increasingly higher entropy, leading to
a significant false positive rate. For this reason, the inability to ascertain the exact order
of entropy sharing renders precise decoding unfeasible and results in a high rate of false
positives, thereby hindering the effective operation of ransomware detection systems.
Figure 8. Average pass rates of frequency test vs. compression ratios r on original, encrypted, and
encoded files. (a) ORG; (b) m = 0; (c) m = 3.
Table 6. Average and standard deviation (S.D.) of Shannon entropy values for original, encrypted,
and encoded files.

| | ORG | m = 0 | m = 1 | m = 3 | m = 5 | m = 7 | m = 9 |
|---|---|---|---|---|---|---|---|
| Shannon entropy avg. | 7.935655 | 7.998874 | 7.995119 | 7.98972 | 7.987368 | 7.985997 | 7.985153 |
| S.D. | 0.127836 | 0.005177 | 0.000277 | 0.00038 | 0.000335 | 0.000238 | 0.000202 |
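The overlap in Table 6 can be quantified with a few lines of Python. The sketch below only performs arithmetic on the table values: it expresses how far each encrypted or encoded average lies from the ORG average, in units of the ORG standard deviation. The variable names are ours.

```python
# Values copied from Table 6 (Shannon entropy of original, encrypted, encoded files).
ORG_AVG, ORG_SD = 7.935655, 0.127836
AVG_BY_ORDER = {  # key: order m of entropy shares (m = 0 means encrypted only)
    0: 7.998874,
    1: 7.995119,
    3: 7.98972,
    5: 7.987368,
    7: 7.985997,
    9: 7.985153,
}

# Distance from the ORG average, measured in ORG standard deviations.
distance_in_sd = {m: (avg - ORG_AVG) / ORG_SD for m, avg in AVG_BY_ORDER.items()}

if __name__ == "__main__":
    for m, d in distance_in_sd.items():
        print(f"m = {m}: {d:.3f} ORG standard deviations above the ORG average")
```

Every average, including that of plainly encrypted files, lies within half an ORG standard deviation of the ORG average, which illustrates why a threshold on the Shannon entropy value alone separates these classes poorly.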
not easy to be distinguishable from legitimate file operations (due to the benign level of
entropy), checking I/O access patterns and entropy leads to a decrease in the true positive
rate for detecting ransomware.
Author Contributions: Conceptualization, S.L.; methodology, S.L. and J.B.; software, J.B.; valida-
tion, S.L. and J.B.; formal analysis, S.L.; investigation, J.B.; resources, J.B.; writing—original draft
preparation, J.B.; writing—review and editing, J.N.K.; visualization, J.B.; supervision, J.N.K.; project
administration, J.N.K.; funding acquisition, J.N.K. All authors have read and agreed to the published
version of the manuscript.
Funding: This work was supported by the Korea Research Institute for Defense Technology Planning and
Advancement (KRIT) grant funded by the Korea government (DAPA (Defense Acquisition Program
Administration)) (No. 20-107-A00-005-001, Cyber threat context awareness-based active response
technology for defense against the spread of cyber battlefield attacks, 2023).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data are contained within the article.
Conflicts of Interest: The authors declare no conflicts of interest.
consistent with the analysis provided in Section 4. Lastly, Figures A9 and A10 tell us that
the key has nearly no effect on the pass rates and p-values of the frequency test.
Figure A1. Pass rates of frequency test with 8 different keys on MP3 samples. (a) K#1; (b) K#2;
(c) K#3; (d) K#4; (e) K#5; (f) K#6; (g) K#7; (h) K#8.
Figure A2. Pass rates of frequency test with 8 different keys on JPG samples. (a) K#1; (b) K#2; (c) K#3;
(d) K#4; (e) K#5; (f) K#6; (g) K#7; (h) K#8.
Figure A3. Pass rates of frequency test with 8 different keys on PDF samples. (a) K#1; (b) K#2;
(c) K#3; (d) K#4; (e) K#5; (f) K#6; (g) K#7; (h) K#8.
Figure A4. Pass rates of frequency test with 8 different keys on ZIP samples. (a) K#1; (b) K#2; (c) K#3;
(d) K#4; (e) K#5; (f) K#6; (g) K#7; (h) K#8.
Figure A5. Average p-values of frequency test with 8 different keys on MP3 samples. (a) K#1; (b) K#2;
(c) K#3; (d) K#4; (e) K#5; (f) K#6; (g) K#7; (h) K#8.
Figure A6. Average p-values of frequency test with 8 different keys on JPG samples. (a) K#1; (b) K#2;
(c) K#3; (d) K#4; (e) K#5; (f) K#6; (g) K#7; (h) K#8.
Figure A7. Average p-values of frequency test with 8 different keys on PDF samples. (a) K#1; (b) K#2;
(c) K#3; (d) K#4; (e) K#5; (f) K#6; (g) K#7; (h) K#8.
Figure A8. Average p-values of frequency test with 8 different keys on ZIP samples. (a) K#1; (b) K#2;
(c) K#3; (d) K#4; (e) K#5; (f) K#6; (g) K#7; (h) K#8.
Figure A9. Two-dimensional (2D) and 3D visualization of pass rate averages across 8 different keys
for frequency test on each sample type. (a) Overlapping average MP3 pass rates; (b) average MP3
pass rates; (c) overlapping average JPG pass rates; (d) average JPG pass rates; (e) overlapping average
PDF pass rates; (f) average PDF pass rates; (g) overlapping average ZIP pass rates; (h) average ZIP
pass rates.
Figure A10. Two-dimensional (2D) and 3D visualization of p-value averages across 8 different keys
for frequency test on each sample type. (a) Overlapping average MP3 p-values; (b) average MP3
p-values; (c) overlapping average JPG p-values; (d) average JPG p-values; (e) overlapping average
PDF p-values; (f) average PDF p-values; (g) overlapping average ZIP p-values; (h) average ZIP
p-values.
References
1. Lee, S.; Jho, N.-s.; Chung, D.; Kang, Y.; Kim, M. Rcryptect: Real-time detection of cryptographic function in the user-space
filesystem. Comput. Secur. 2022, 112, 102512. [CrossRef]
2. Oz, H.; Aris, A.; Levi, A.; Uluagac, A.S. A Survey on Ransomware: Evolution, Taxonomy, and Defense Solutions. ACM Comput.
Surv. 2022, 54, 238. [CrossRef]
3. Ahmed, J.; Gharakheili, H.H.; Russell, C.; Sivaraman, V. Automatic detection of DGA-enabled malware using SDN and traffic
behavioral modeling. IEEE Trans. Netw. Sci. Eng. 2022, 9, 2922–2939. [CrossRef]
4. Andronio, N.; Zanero, S.; Maggi, F. HelDroid: Dissecting and Detecting Mobile Ransomware. In Proceedings of the 18th
International Symposium, RAID 2015, Kyoto, Japan, 2–4 November 2015; Bos, H., Monrose, F., Blanc, G., Eds.; Springer: Cham,
Switzerland, 2015; pp. 382–404. [CrossRef]
5. Palisse, A.; Le Bouder, H.; Lanet, J.L.; Le Guernic, C.; Legay, A. Ransomware and the Legacy Crypto API. In Proceedings of the
11th International Conference, CRiSIS 2016, Roscoff, France, 5–7 September 2016; Cuppens, F., Cuppens, N., Lanet, J.L., Legay, A.,
Eds.; Springer: Cham, Switzerland, 2017; pp. 11–28. [CrossRef]
6. Kharaz, A.; Arshad, S.; Mulliner, C.; Robertson, W.; Kirda, E. UNVEIL: A Large-Scale, Automated Approach to Detecting
Ransomware. In Proceedings of the 25th USENIX Security Symposium (USENIX Security 16), Austin, TX, USA, 10–12 August 2016;
pp. 757–772. Available online: https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/kharaz
(accessed on 11 November 2023).
7. Ahmadian, M.M.; Shahriari, H.R. 2entFOX: A framework for high survivable ransomwares detection. In Proceedings of the 2016
13th International Iranian Society of Cryptology Conference on Information Security and Cryptology (ISCISC), Tehran, Iran, 7–8
September 2016; pp. 79–84. [CrossRef]
8. Alhawi, O.; Baldwin, J.; Dehghantanha, A. Leveraging Machine Learning Techniques for Windows Ransomware Network Traffic
Detection. In Cyber Threat Intelligence; Springer: Berlin/Heidelberg, Germany, 2018; pp. 93–106. [CrossRef]
9. Cohen, A.; Nissim, N. Trusted detection of ransomware in a private cloud using machine learning methods leveraging meta-
features from volatile memory. Expert Syst. Appl. 2018, 102, 158–178. [CrossRef]
10. Maniath, S.; Ashok, A.; Poornachandran, P.; Sujadevi, V.; Sankar A.U., P.; Jan, S. Deep learning LSTM based ransomware detection.
In Proceedings of the 2017 Recent Developments in Control, Automation & Power Engineering (RDCAPE), Noida, India, 26–27
October 2017; pp. 442–446. [CrossRef]
11. Cusack, G.; Michel, O.; Keller, E. Machine Learning-Based Detection of Ransomware Using SDN. In Proceedings of the
CODASPY’18: Eighth ACM Conference on Data and Application Security and Privacy, Tempe, AZ, USA, 21 March 2018;
SDN-NFV Sec’18. pp. 1–6. [CrossRef]
12. Homayoun, S.; Dehghantanha, A.; Ahmadzadeh, M.; Hashemi, S.; Khayami, R.; Choo, K.K.R.; Newton, D.E. DRTHIS: Deep
ransomware threat hunting and intelligence system at the fog layer. Future Gener. Comput. Syst. 2019, 90, 94–104. [CrossRef]
13. Homayoun, S.; Dehghantanha, A.; Ahmadzadeh, M.; Hashemi, S.; Khayami, R. Know Abnormal, Find Evil: Frequent Pattern
Mining for Ransomware Threat Hunting and Intelligence. IEEE Trans. Emerg. Top. Comput. 2020, 8, 341–351. [CrossRef]
14. Nissim, N.; Lapidot, Y.; Cohen, A.; Elovici, Y. Trusted system-calls analysis methodology aimed at detection of compromised
virtual machines using sequential mining. Knowl.-Based Syst. 2018, 153, 147–175. [CrossRef]
15. Rhode, M.; Burnap, P.; Jones, K. Early-stage malware prediction using recurrent neural networks. Comput. Secur. 2018, 77, 578–594.
[CrossRef]
16. Vinayakumar, R.; Soman, K.; Senthil Velan, K.; Ganorkar, S. Evaluating shallow and deep networks for ransomware detection and
classification. In Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics
(ICACCI), Udupi, India, 13–16 September 2017; pp. 259–265. [CrossRef]
17. Wan, Y.L.; Chang, J.C.; Chen, R.J.; Wang, S.J. Feature-Selection-Based Ransomware Detection with Machine Learning of Data
Analysis. In Proceedings of the 2018 3rd International Conference on Computer and Communication Systems (ICCCS), Nagoya,
Japan, 27–30 April 2018; pp. 85–88. [CrossRef]
18. Zhang, H.; Xiao, X.; Mercaldo, F.; Ni, S.; Martinelli, F.; Sangaiah, A.K. Classification of ransomware families with machine
learning based on N-gram of opcodes. Future Gener. Comput. Syst. 2019, 90, 211–221. [CrossRef]
19. Daku, H.; Zavarsky, P.; Malik, Y. Behavioral-Based Classification and Identification of Ransomware Variants Using Machine
Learning. In Proceedings of the 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing and
Communications/12th IEEE International Conference on Big Data Science and Engineering (TrustCom/BigDataSE), New York,
NY, USA, 1–3 August 2018; pp. 1560–1564. [CrossRef]
20. Bai, J.; Wang, J. Improving malware detection using multi-view ensemble learning. Secur. Commun. Netw. 2016, 9, 4227–4241.
[CrossRef]
21. Krawczyk, B.; Minku, L.L.; Gama, J.; Stefanowski, J.; Woźniak, M. Ensemble learning for data stream analysis: A survey. Inf.
Fusion 2017, 37, 132–156. [CrossRef]
22. Jabbar, M.A.; Aluvalu, R.; Reddy, S.S.S. Cluster Based Ensemble Classification for Intrusion Detection System. In Proceedings
of the 9th International Conference on Machine Learning and Computing, Singapore, 24–26 February 2017; ICMLC 2017.
pp. 253–257. [CrossRef]
23. Parikh, D.; Polikar, R. An Ensemble-Based Incremental Learning Approach to Data Fusion. IEEE Trans. Syst. Man Cybern. Part B 2007,
37, 437–450. [CrossRef]
24. Rhee, J.; Riley, R.; Lin, Z.; Jiang, X.; Xu, D. Data-Centric OS Kernel Malware Characterization. IEEE Trans. Inf. Forensics Secur.
2014, 9, 72–87. [CrossRef]
25. Alqahtani, A.; Sheldon, F.T. A Survey of Crypto Ransomware Attack Detection Methodologies: An Evolving Outlook. Sensors
2022, 22, 1837. [CrossRef] [PubMed]
26. Al-Rimy, B.A.S.; Maarof, M.A.; Shaid, S.Z.M. Redundancy Coefficient Gradual Up-weighting-based Mutual Information Feature
Selection Technique for Crypto-ransomware Early Detection. Future Gener. Comput. Syst. 2021, 115, 641–658. [CrossRef]
27. Abukar, Y.; Koçer, B.; Huda, S.; Al-Rimy, B.; Hassan, M. A system call refinement-based enhanced Minimum Redundancy
Maximum Relevance method for ransomware early detection. J. Netw. Comput. Appl. 2020, 167, 102753. [CrossRef]
28. Al-Rimy, B.A.S.; Maarof, M.A.; Alazab, M.; Alsolami, F.; Shaid, S.Z.M.; Ghaleb, F.A.; Al-Hadhrami, T.; Ali, A.M. A Pseudo
Feedback-Based Annotated TF-IDF Technique for Dynamic Crypto-Ransomware Pre-Encryption Boundary Delineation and
Features Extraction. IEEE Access 2020, 8, 140586–140598. [CrossRef]
29. Urooj, U.; Maarof, M.; Al-rimy, B. A proposed Adaptive Pre-Encryption Crypto-Ransomware Early Detection Model. In
Proceedings of the 2021 3rd International Cyber Resilience Conference (CRC), Langkawi Island, Malaysia, 29–31 January 2021;
pp. 1–6. [CrossRef]
30. Olaimat, M.N.; Aizaini Maarof, M.; Al-rimy, B.A.S. Ransomware Anti-Analysis and Evasion Techniques: A Survey and Research
Directions. In Proceedings of the 2021 3rd International Cyber Resilience Conference (CRC), Langkawi Island, Malaysia, 29–31
January 2021; pp. 1–6. [CrossRef]
31. Moore, C. Detecting ransomware with honeypot techniques. In Proceedings of the 2016 Cybersecurity and Cyberforensics
Conference (CCC), Amman, Jordan, 2–4 August 2016; pp. 77–81. [CrossRef]
32. Song, S.; Kim, B.; Lee, S. The Effective Ransomware Prevention Technique Using Process Monitoring on Android Platform. Mob.
Inf. Syst. 2016, 2016, 1–9. [CrossRef]
33. Gomez-Hernandez, J.; Álvarez González, L.; García-Teodoro, P. R-Locker: Thwarting ransomware action through a honeyfile-
based approach. Comput. Secur. 2017, 73, 389–398. [CrossRef]
34. Mehnaz, S.; Mudgerikar, A.; Bertino, E. RWGuard: A Real-Time Detection System Against Cryptographic Ransomware. In
Lecture Notes in Computer Science; Bailey, M., Holz, T., Stamatogiannakis, M., Ioannidis, S., Eds.; Springer: Berlin/Heidelberg,
Germany, 2018; Volume 11050, pp. 114–136. [CrossRef]
35. Morato, D.; Berrueta, E.; Magaña, E.; Izal, M. Ransomware early detection by the analysis of file sharing traffic. J. Netw. Comput.
Appl. 2018, 124, 14–32. [CrossRef]
36. Monge, M.A.S.; Vidal, J.M.; Villalba, L.J.G. A Novel Self-Organizing Network Solution towards Crypto-Ransomware Mitigation.
In Proceedings of the 13th International Conference on Availability, Reliability and Security, Hamburg, Germany, 27–30 August
2018; ARES 2018. [CrossRef]
37. Scaife, N.; Carter, H.; Traynor, P.; Butler, K.R.B. CryptoLock (and Drop It): Stopping Ransomware Attacks on User Data. In
Proceedings of the 36th IEEE International Conference on Distributed Computing Systems, ICDCS 2016, Nara, Japan, 27–30 June
2016; pp. 303–312. [CrossRef]
38. Kharraz, A.; Kirda, E. Redemption: Real-Time Protection Against Ransomware at End-Hosts. In Research in Attacks, Intrusions,
and Defenses; Dacier, M., Bailey, M., Polychronakis, M., Antonakakis, M., Eds.; Springer: Cham, Switzerland, 2017; pp. 98–119.
[CrossRef]
39. Davies, S.R.; Macfarlane, R.; Buchanan, W.J. Comparison of Entropy Calculation Methods for Ransomware Encrypted File
Identification. Entropy 2022, 24, 1503. [CrossRef] [PubMed]
40. Lee, J.; Lee, S.Y.; Yim, K.; Lee, K. Neutralization Method of Ransomware Detection Technology Using Format Preserving
Encryption. Sensors 2023, 23, 4728. [CrossRef] [PubMed]
41. Novick, A. White Phoenix: Beating Intermittent Encryption. CYBERARK. Available online:
https://www.cyberark.com/resources/threat-research-blog/white-phoenix-beating-intermittent-encryption (accessed on 9 December 2023).
42. Rukhin, A.; Soto, J.; Nechvatal, J.; Smid, M.; Barker, E. A Statistical Test Suite for Random and Pseudorandom Number Generators for
Cryptographic Applications; Special Publication (NIST SP); National Institute of Standards and Technology: Gaithersburg, MD,
USA, 2001; Volume 800, p. 163. Available online: https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=906762 (accessed on
9 December 2023).