International Journal of Information Security

https://doi.org/10.1007/s10207-024-00819-x

REGULAR CONTRIBUTION

Real-time system call-based ransomware detection


Christopher Jun Wen Chew1 · Vimal Kumar1 · Panos Patros2 · Robi Malik2

© The Author(s) 2024

Abstract
Ransomware, particularly crypto ransomware, has emerged as the go-to malware for threat actors aiming to compromise data
on Android devices as well as in general. In this paper, we present a ransomware detection technique based on behaviours
observed in the system calls performed by the malware. We first describe our repeatable and extensible methodology for
extracting the system call log and patterns. We then identify and present some common high-level system call behavioural
patterns exhibited by crypto ransomware, and evaluate these patterns. We further describe a streaming
implementation that utilises regular expressions for modelling malware behaviours and finite state machines for detecting
crypto ransomware behaviours in real time. The success of our proof of concept evaluation allows us to envision our proposed
technique applied as part of a self-protection system on Android phones against malware.

Keywords Crypto ransomware · System calls · Behaviour · Patterns · Android · Real-time

✉ Christopher Jun Wen Chew
cc246@students.waikato.ac.nz

Vimal Kumar
vimal.kumar@waikato.ac.nz

Robi Malik
robi@waikato.ac.nz

1 Department of Computer Science, University of Waikato, Hamilton, New Zealand
2 Department of Software Engineering, University of Waikato, Hamilton, New Zealand

1 Introduction

Ransomware attacks, one of the more frequent types of attacks, pose a threat to both consumers and organisations. According to Sophos The State of Ransomware 2023 report [61], 84% of the organisations surveyed in Singapore were affected by ransomware in 2022. Ransomware heavily impacts businesses monetarily. The report stated that the average ransom payment was $1,542,333 in 2023, almost doubled from the previous year of $812,380.

Due to the severity of ransomware threats, there has been extensive research in ransomware detection and mitigation [3, 51]. However, there is a notable gap in addressing crypto ransomware detection with a focus on overcoming resource constraints in mobile devices. This is particularly concerning because of the steady increase in the global usage of mobile phones, reaching 6.95 billion devices surveyed in 2020 and expected to grow to 7.49 billion by 2025 [63]. The prevalence of mobile phones can often be attributed to the conveniences they offer, such as communication applications, digital wallets, and entertainment applications. As our mobile devices continue to be closely intertwined in our daily lives, they not only store our personal data such as photographs, credit card information, and contacts but also sensitive business and organisational information through apps such as mail, cloud storage and business applications. Consequently, mobile phones have evolved into a high value and portable data storage, making them valuable targets to ransomware and malware attacks. The combination of the current state of ransomware [19], the trend of mobile device usage and the value of data on mobile devices will see mobile devices inevitably become one of the main targets of ransomware attacks. The mobile device market is dominated by two operating systems, Android and iOS. Of the two, Android holds about 70% of the mobile operating system market share [64]. In addition to the official Google Play Store, Android has various third-party app stores available to download apps, in contrast to iOS. The third-party app stores are much less strictly controlled than the official app store making users more susceptible to downloading potentially malicious applications and therefore being introduced to a larger threat surface. Hence, there is a need for more Android-based anti-ransomware solutions.


Techniques with varying levels of sophistication and robustness, such as the ones based on machine learning [22, 26], and those based on behaviour [33, 56], have recently been used for identification of malware. Additionally, there have been approaches looking at system calls for dynamically analysing malware. System call-based approaches offer a balance between user-level and kernel-level analysis. User-level analysis is often unable to capture the behaviour of more sophisticated malware variants. Kernel-level analysis offers more depth and resilience; however, the approaches can often lead to a complex design, thus leading to an over-fitted solution, which may not provide any significant benefits in the detection model.

Due to the dynamically changing and ever-evolving threat landscape, malware has become more sophisticated and cunning. Hence, to counteract this rapidly changing landscape, malware detection systems have trended towards real-time malware analysis [1, 46] and self-protection systems [32, 57], which provide benefits, such as early to immediate detection and consistent active monitoring. In this work, we apply the advantages of real-time malware analysis and the aforementioned benefits of system call-level analysis to dynamically identify behavioural patterns of crypto ransomware. By leveraging these two techniques, we aim to address the following research objectives (RO):

– RO1: Identify system call-level behavioural patterns for crypto ransomware: while there have been recent works on pattern detection on system call logs [33, 40], none has focused on patterns produced by specific malware types. We aim to discover a set of common behavioural patterns for crypto ransomware, such as file encryption and tampering with user files through the use of system call logs.
– RO2: Evaluate the effectiveness of the behavioural patterns: we also evaluate the feasibility and efficacy of these patterns in detecting crypto ransomware behaviours from different families, to discover the shared common behaviour among crypto ransomware.
– RO3: Generate and make available a dataset of system call logs of malware activity: we believe behaviour detection using system calls can be a useful technique for malware detection and analysis, therefore we have made our dataset available for researchers to utilise in malware research.
– RO4: Implement and evaluate a real-time streaming implementation for crypto ransomware detection: this research objective aims to introduce a streaming implementation for detecting crypto ransomware in real-time, through the utilisation of token Finite State Machines (FSMs). In addition, we evaluate the efficacy and feasibility of this new proposed approach for detecting crypto ransomware.

This paper is an extended version of our previous work ESCAPADE [13] in which we addressed Research Objectives 1 to 3. Additionally, in this paper we address Research Objective 4, where we apply the methodology of ESCAPADE in a streaming implementation for detecting crypto ransomware. This work also discusses real-time detection as opposed to offline detection as discussed in [13]. The paper is organised as follows. Section 2 details the background and relevant research in our work with Sects. 2.3 and 2.3.4 being extended to include relevant work in our new proposed approach. Section 2.4 describes the methodology utilised in our previous work to acquire our behavioural patterns. Section 3 presents the new proposed approach, which describes the design and architecture of our approach for real-time crypto ransomware detection. In Sect. 4, we detail our extended evaluation of the behavioural patterns and the proposed streaming approach along with some potential threats to validity with our work. In Sect. 4.5 we address the limitations in our work and propose some future improvements. Finally, in Sect. 5, we give an overview of the initial research objectives and how they were achieved.

2 Background and related work

In this section, we detail the evolution and improvements of Android security and its current state, followed by an overview of different types of ransomware, and conclude with the different types of malware analysis techniques used throughout the years, and how our proposed approach can contribute to the existing area.

2.1 Android security history

Since the introduction of Android (a mobile operating system) in 2008, there have been many updates and improvements to its security. In 2012, Bouncer was released in an effort to deter the upsurge of Android malware in the preceding year [47]. Bouncer targeted pre-existing applications as well as new applications. The approach that Bouncer took was sandboxing [41], where applications were executed and scanned for malware in an isolated environment on a cloud infrastructure, which was devoid of any access to the users’ real data.

However, researchers quickly detected the vulnerabilities of Bouncer. Oliva Hou from Trend Micro [30] noted that researchers were able to acquire specific details of the run-time environment, such as the duration of Bouncer’s testing phase (which was five minutes), and the phone contents used in the simulated environment (two photos, one contact and the Google account). These details could easily be exploited by attackers through the use of simple obfuscation techniques to avoid detection by Bouncer.


Bouncer was, therefore, not a sustainable security mechanism. A few years later in May 2017, a more robust approach known as Play Protect was introduced. In addition to the introduction of Play Protect, a security Application Programming Interface (API) called SafetyNet Verify Apps was introduced in September of the same year. This API aimed to address three key ideas: to help further protect users from malicious applications, determine if a user’s device is protected by Play Protect, and prompt users to enable Play Protect if it is disabled.

2.2 Ransomware

Ransomware, a type of malware that holds the users’ data for ransom (often requesting monetary payment), generally consists of two types: locker ransomware and crypto ransomware [48]. Locker-type ransomware traditionally displays a persistent screen that prevents the user from interacting with the rest of the system. This screen will often display the ransom note demanding monetary payment. On mobile devices, specifically Android, locker-type ransomware makes the application persistent by displaying a perpetual alert dialog or activity, or disabling interactions with the navigation bar [7]. Another technique used is altering users’ lock screens, thus preventing access to their devices [4, 34].

Crypto ransomware is more destructive: the user’s files are encrypted to prevent them from accessing any of their data [4, 35]. Similar to locker-type ransomware, a ransom note is often displayed after the encryption phase has been completed. Typically, for crypto ransomware, the process begins by scanning the user’s personal directories, such as Documents and Pictures, for files. Once the scanning phase has completed, the ransomware often identifies files containing specific extensions, such as .docx, .png, and .jpg, to encrypt. This method is normally used to speed up the encryption process, and efficiently determine the important user files to encrypt (i.e. the files most important to a user) [21]. For the encryption process, the data of the identified files are read, and written to a new encrypted file with an unknown file extension. The original file is then removed or overwritten [12].

In recent years, the trend of ransomware attacks has shifted, with crypto ransomware being the more common attacks as compared to locker-type ransomware [5, 45, 53]. One of the more recent works by Bansal [15] further highlighted this trend by reviewing the most common variants of ransomware attacks, with Cryptolocker and WannaCry showing the highest percentage of attacks, both of which were crypto ransomware.

The aforementioned issue was exemplified by the low mitigation rates for crypto ransomware. For example, in a 2021 State of Ransomware report by Sophos [60], only 34% of cross-sectors and 39% of retail sectors (from surveyed retail IT managers) were successful in preventing ransomware attacks from encrypting their data. Such values implied a higher demand for preventative measures specifically targeting crypto ransomware, reinforcing our decision to develop an approach focused on crypto ransomware.

2.3 Static and dynamic analysis

In static analysis, a malware analyst or computer program observes the code of the given application and tries to determine if it is malicious or benign, and gains insight on its functionality without the necessity of executing the application. Static analysis, however, has limited effectiveness when more sophisticated malware utilises advanced techniques, such as binary/code/control flow obfuscation, and polymorphic coding [17, 20, 49] to avoid detection.

In dynamic analysis, rather than observing the code, malicious applications are directly executed in an isolated environment and observed over time for malicious behaviour. As such, dynamic analysis is more resilient to obfuscation techniques, which is a common limitation of static analysis [28, 29]. Generally, obfuscation in dynamic analysis is not an issue as dynamic analysis only observes the behaviour of the application at run-time. However, obfuscation techniques to circumvent existing dynamic analysis approaches have also been explored in past literature. A dynamic analysis obfuscation technique of particular relevance is system call obfuscation. Srivastava et al. [62] proposed a system call obfuscation technique by simulating an Illusion attack that utilises an Alternative System Call Execution Path (ASEP) and the ioctl system call to obfuscate malicious behaviour. The proposed method showed that it was possible to masquerade the behaviours performed by malicious applications as the system calls invoked through the use of ioctl. This is difficult to discern from benign applications due to the marshalling process.

Over the years, researchers have developed unique and robust techniques through the use of static and dynamic analysis aimed at detecting malware or intrusion detection. The following subsections highlight the core areas surrounding our work and how we differentiate our approach from other related work in this area.

2.3.1 Signature and code analysis

One of the more traditional types of static analysis focuses on developing signatures or observing source code. AndroSimilar [18] and DroidMoss [70] adopted the idea of fuzzy hashing which compared similarities between the signatures generated. This produced a percentage of similarity with 100% being an exact match. This approach aimed to counteract the issue of code obfuscation and application


repackaging. However, AndroSimilar [18] produced high false negative rates (28%) when detecting unknown malware and considerably higher false negatives for the various methods of code obfuscation, which consisted of method renaming (45%), junk method insertion (44%), goto obfuscation (43%), and string encryption (24%). DroidMoss’s false negative rates were lower (10.7%). All the tested applications, however, came from third-party app stores, whereas AndroSimilar focused on both the official Play store and third-party app stores.

Compton et al. [14] have also attempted to mitigate the issues of obfuscation through the use of code2vec [2], a method used for extracting information from an Abstract Syntax Tree (AST) derived from a piece of source code, to train models on obfuscated Java source code to reduce code2vec’s reliance on variable names. In their evaluation, they utilised 7 datasets to determine if obfuscated variable names provided an improved model for identifying code semantics. One of those datasets was based on Android malware APKs, which showed minimal improvements with the newly trained models, as malware is known to utilise sophisticated obfuscation techniques to avoid detection.

Traditional code and signature analysis techniques are known to be effective against known malware. There are, however, evident limitations of these techniques as mentioned in Sect. 2.3. The aforementioned works emphasise some of the techniques adopted by researchers to counteract the limitations. However, one of the core limitations of utilising signature and code analysis stems from the inability to detect newer and unknown variants of malware, which becomes a major issue as the malware landscape continues to evolve.

Many have attempted to observe Android malware or ransomware. Maiorca et al. [44] discuss an Android ransomware approach, which observed an Android application’s bytecode to determine if the application was ransomware. This work was further extended by incorporating system API-related information to improve the efficacy of the proposed approach [55]. MaMaDroid [50] utilises machine learning and generates Markov Chain models on the application’s call graph from bytecode to detect Android malware. Amer and El-Sappagh [6], in turn, incorporate deep learning and the abstraction of API or system calls to detect Android malware and ransomware. Their proposed approach further extends to detect unknown malware.

The work proposed in this paper primarily employs dynamic analysis. Our methodology adopted aspects of signature detection, such as the comparison of behavioural patterns, while also being resilient to the aforementioned limitation as we capture and detect the high-level behavioural patterns in real-time. While dynamic analysis has demonstrated increased resilience against common obfuscation techniques, it still remains susceptible to obfuscation. As mentioned previously, system call obfuscation has been shown to be possible by the Illusion attack described in [62], but the implementation of such an attack is demonstrably complex compared to static signature and code obfuscation, which can often be more easily achieved [52, 69]. As a consequence, to the best of our knowledge, system call obfuscation, while possible, has not yet been observed in the wild in malware. Furthermore, traditional static signature and code analysis are now understood to be insufficient for detecting newer and more sophisticated malware variants [23]. State-of-the-art tools have transitioned more towards the use of a combination of static and dynamic analysis, such as Android’s official anti-malware system, Play Protect, which statically analyses the application upon installation as well as observing the application’s behaviour using machine learning algorithms.

2.3.2 Taint analysis

Taint analysis is a method of observing data flow and tainting sensitive data paths that could potentially be used maliciously. One of the earlier works of taint analysis was TaintDroid [16], which utilised variable-level tracking of native methods within the Dalvik VM interpreter, which contained taint markings in a taint map. These taint markings were propagated through the Android Inter-Process Communication Binder, based on the defined data flow rules on how the application used the tainted data, to the untrusted application’s taint map. If the untrusted application made a library call deemed as a taint sink (e.g. network send), then the application was marked as malicious.

In contrast, under our method of detection, we observed high-level behavioural patterns at a system call-level with each pattern classified in different levels of severity. This allowed for more precise details regarding an application’s behaviour and more flexibility.

Similarly, FlowDroid [9] also adopts the idea of taint analysis. They proposed a static analysis approach, which utilised flow-sensitive taint analysis through the use of Control-Flow Graphs (CFGs) that modelled the life-cycle of Android and call-back methods. FlowDroid’s approach offers a unique and precise detection rate; however, due to the fully static approach, it shares similar limitations to other static analysis approaches. For example, FlowDroid was only able to capture reflective calls if the arguments were defined as string constants, which was not always the case, as noted in their limitations. Conversely, we adopted a dynamic approach by observing the behaviour of crypto ransomware in real-time, which alleviated the aforementioned limitation.
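As a minimal illustration of the source, propagation and sink idea that underlies taint analysis in general, the following Python sketch tracks taint labels over named variables. It is not TaintDroid's or FlowDroid's implementation: the source and sink names, the `TaintTracker` class and its methods are all hypothetical and chosen only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical names, for illustration only.
SOURCES = {"read_contacts", "read_location"}  # calls that return sensitive data
SINKS = {"network_send"}                      # calls that may leak data


@dataclass
class TaintTracker:
    """Toy variable-level taint tracker (an assumption, not a real tool's API)."""

    tainted: set[str] = field(default_factory=set)
    alerts: list[str] = field(default_factory=list)

    def assign_from_call(self, target: str, call: str) -> None:
        # A value returned by a source call taints the target variable.
        if call in SOURCES:
            self.tainted.add(target)
        else:
            self.tainted.discard(target)

    def assign(self, target: str, *operands: str) -> None:
        # Propagation rule: the target is tainted if any operand is tainted.
        if any(op in self.tainted for op in operands):
            self.tainted.add(target)
        else:
            self.tainted.discard(target)

    def call(self, name: str, *args: str) -> None:
        # A tainted argument reaching a sink raises an alert.
        if name in SINKS:
            leaked = [a for a in args if a in self.tainted]
            if leaked:
                self.alerts.append(f"{name} received tainted data: {leaked}")


tracker = TaintTracker()
tracker.assign_from_call("contacts", "read_contacts")  # source
tracker.assign("message", "contacts", "prefix")        # propagation
tracker.call("network_send", "message")                # sink: alert recorded
tracker.assign("message", "prefix")                    # overwritten with clean data
tracker.call("network_send", "message")                # no new alert
print(tracker.alerts)
```

The sketch only records a single alert, because the second send happens after the tainted value has been overwritten; real systems track taint at a much finer granularity and across process boundaries.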


2.3.3 System call analysis

System calls have often been used for kernel-level malware analysis. Works in [33, 40, 66] apply system call analysis on mobile operating systems, such as Android. This approach is useful because system calls are able to determine the precise operations that occurred during the execution of an application or program, which can help identify malicious activities or behaviours. We further contribute to this area by capturing high-level behavioural patterns exhibited by crypto ransomware.

One drawback, however, with system call monitoring is the large quantity of information generated. Because background operations, such as clock_gettime() calls that periodically record the system clock time, occur in parallel with the core operations, the information generated from monitoring an application is large.

Isohara et al. [33] addressed this issue by filtering out unnecessary system calls. They achieved this by grouping system calls into specific categories and filtering out processes unrelated to the application through the use of a process tree. For their detection phase, Isohara et al. [33] created 16 different patterns represented as regular expressions. These regular expressions utilised assistant keywords, which relate to specific strings such as file paths or commands such as su.

The work of Isohara provides a good insight into pattern detection in system call logs using regular expressions. Our proposed approach improves on this notion by introducing a formalised methodology, which converts relevant system calls into tokens and utilises behavioural patterns, represented as 2-layer token FSMs, for real-time detection of crypto ransomware.

SCSDroid [40] is a thread-grained behavioural pattern detection method on the system call-level leveraging the Longest Common Subsequence (LCS) algorithm to extract potentially malicious patterns from system calls. Bayes' theorem is then utilised with these patterns to determine if an application was a Maliciously Repackaged Application (MRA) or a benign application.

The proposed approach of SCSDroid gives a good perspective of the feasibility of pattern detection used in malware detection. However, as noted in their conclusion, one of the limitations is its inability to detect unknown families that have not been acquired (i.e. trained). In comparison, in our approach, we utilise behavioural patterns represented as Finite State Machines (FSMs) to match common behaviour and behavioural sequences based on a range of ransomware families in a stream of system calls. This allows us to capture a broad range of behavioural patterns in real-time as opposed to family-specific patterns.

One of the more prominent works of system call analysis was CopperDroid [66], which utilised value-based data flow analysis on system call sequences and IPC unmarshalling to reconstruct the high-level behaviour of Android malware. In contrast, our approach showed a higher level of explainability for malware behaviour through our two-layered FSMs, which captured both individual behaviours and behaviours occurring in specific sequences, thus enabling us to observe a more general overview of ransomware behaviour and understand why an application would be marked as malicious.

2.3.4 Real-time malware analysis

A work that focuses on utilising real-time malware detection is DNADroid [22], which adopted a hybrid approach by utilising static and dynamic modules to detect Android ransomware. For DNADroid’s static module, features are extracted from the Android Application Package (APK), such as permission requests, words, terms, and images commonly used in ransomware screens. These features are then processed by machine learning models and given a malware score (between 0 and 1).

For DNADroid’s dynamic module, they utilised a sandbox environment to capture the API call sequences, which were pre-processed by removing common API call sequences utilised in both benign and malicious applications. After pre-processing, DNADroid utilised Multiple Sequence Alignment (MSA) for aligning multiple extracted strands of API call sequences to acquire the common malicious DNA subsequences. These modules are utilised by the real-time detection module, which determines if an application is malicious or benign. To achieve this, the static classifier scores the application between 0 and 1 (benign or malicious) based on the trained model. If an application has a score higher than the threshold (1 - confidence score of application), then the dynamic component is utilised to extract the common DNA subsequences using MSA. These extracted DNA subsequences are compared against other previously extracted DNA subsequences using Binary Subsequence Alignment (similar to MSA except the comparison is only between two sequences). If the sequence matches, then the application is deemed malicious; otherwise, the application continues to execute within the dynamic environment in 5-min intervals until a malicious sequence match is detected.

DNADroid provides a detection system through the utilisation of Machine Learning and Sequence Alignment techniques. In contrast, our proposed prototype produces similarly effective detection rates through the utilisation of 2-layer token FSMs without the reliance on ML models and a sandbox environment.

Semantic aWare andrOid malwaRe Detector (SWORD) [11] creates sequential System Call Graphs using Markov Chains to acquire the typical paths exhibited by malware. Once the typical paths are obtained, statistical analysis is applied using Average Logarithmic Branching Factor (ALBF) to acquire numerical representations of the typical


paths. After applying statistical analysis, supervised machine The methodology of S 2 A2 D E clustered same system calls
learning (Random Forest) is used on the training dataset to based on the arguments to identify the different ways the
classify applications as malware or benign. same system calls can be used (i.e. an open system call can
The methodology proposed by SWORD applied tech- be used to read a file with the read-only flag (O_RDONLY) or
niques, such as Machine Learning and information theory, read and write to a file with the read-write flag (O_RDWR). To
to create a runtime detection system. However, one of the model the program flow, S 2 A2 D E utilises Markov Chains to
main limitations mentioned by SWORD is the cumulative observe sequences of clustered system calls enabling them
overhead produced by different components, with an aver- to identify and characterise the program’s behaviour.
age of 13,802.57 s to process all components, and 2401.82 s S 2 A2 D E applied the aforementioned methodology in a
as the fastest completion time. With our proposed architec- prototype implementation to show the feasibility of the pro-
ture, the processing times are significantly less through the posed approach as an Intrusion Detection System (IDS).
lightweight design of utilising regular expressions and token This prototype was further improved on in later works,
conversion. which focused on reducing the false positive rates [43].
Sun et al. [65] adopts a similar approach, utilising systems However, the clustering of system calls generates noticeable
calls for real-time malware detection. The first process is performance issues with 700MB of memory usage on the
initialisation, which generated resources files upon the first worst-case scenario. Additionally, the clustering and detec-
execution of the Android application. The second process, tion times are slower compared to our proposed approach
dynamic behaviour detection, adds a hook to the kernel to with a worst-case scenario of 12 s for clustering and 12.9 s for
acquire system calls. detection as noted in their evaluation. Figure 1 summarises
For their Static Application Analyse process [65], the the notable features and dataset used in each related work dis-
application’s permissions and APIs were extracted with the cussed within this section with the addition of our proposed
decompiler tool known as ApkTool [68]. These were utilised approach (Table 1).
in a preparation phase, where all applications statistics were
acquired, such as number of malware applications using
permissions, and number of benign applications not using 2.4 Behavioural pattern methodology
permissions.
The Malware Application Identification process, utilised Figure 1 provides an overview of our behavioural pattern col-
naive Bayes to identify if an application was benign or mali- lection. The sandbox environment component is our run-time
cious by sending the application to a server; this extracted environment where applications are examined; this environ-
the log file and acquired static information (permissions and APIs). Once extracted, the probabilities were calculated using chi-square to determine if an application’s requested permissions were related to the application’s behaviour. The calculated probabilities were used to determine if an application was benign or malicious. Additionally, their system traversed through the created behavioural graph to identify and reconstruct malicious file operation, network operation, and IPC call behaviours.
As stated in their discussion, one of the limitations of the standard dynamic analysis approaches is the potentially extensive analysis time. This issue is more evident in larger applications with multiple traversable branches. By comparison, our proposed prototype, which observes the application in real-time as it executes, alleviates the lengthy analysis time required.
S2A2DE [42] proposed a host-based intrusion detection system (HIDS) using system call sequence clustering and Markov Chains for modelling system call sequences to detect anomalous activity, specifically focusing on buffer overflow attacks. Their work expanded and improved on a preexisting IDS known as SyscallAnomaly, which generated profiles of system calls based on the arguments [37] to identify the normal behaviour of a program.

ment is described in more detail in Sect. 2.5. The first phase is the Observation phase, in which applications are observed for their behaviour during runtime. We then manually derived behavioural patterns using regular expressions based on the benign and malicious behaviours observed during that phase. Section 2.6 details the process of acquiring these patterns. These patterns are then converted into our token representation for pattern matching.
The tokens are used in our second phase, labelled as Evaluation. This phase starts with the extraction of the raw system call logs collected from our sandbox environment, which are then passed through multiple layers of filtering to abstract and remove repetitive or unrelated system calls. The filtered log is then formatted for pattern matching using our created tokens. This process is repeated for all unique variants containing a unique hash (also known as a sample), resulting in the final dataset, which contains the formatted system call logs and detected patterns.
The following subsections describe our methodology for collecting and formatting system call logs for malware detection in more detail. The proposed methodology gives researchers a streamlined and reproducible approach to safely extract system call logs for effective pattern-based malware detection.
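The layered filtering step of the Evaluation phase can be pictured with a short Python sketch. The two layers shown here are assumptions made for illustration only: the authors' actual filter layers are not listed in this part of the paper, the two call names are the examples given later in Sect. 3.1, and the helper names and the `keep=3` cap are hypothetical.

```python
from itertools import groupby, islice

# Calls treated as unrelated to application behaviour. These two names are the
# examples mentioned in Sect. 3.1; the authors' full list is not shown.
UNRELATED = {"clock_gettime", "gettimeofday"}


def drop_unrelated(calls):
    """Layer 1 (assumed): remove calls that say nothing about behaviour."""
    return [c for c in calls if c not in UNRELATED]


def cap_repeats(calls, keep=3):
    """Layer 2 (assumed): keep at most `keep` calls from each run of repeats."""
    capped = []
    for _, run in groupby(calls):
        capped.extend(islice(run, keep))
    return capped


def apply_layers(calls, layers):
    """Pass the log through each filtering layer in order."""
    for layer in layers:
        calls = layer(calls)
    return calls


raw = ["openat", "read", "read", "clock_gettime", "read", "read", "close"]
filtered = apply_layers(raw, [drop_unrelated, cap_repeats])
```

The cap of three repeats is chosen so that a pattern relying on three consecutive reads would still be visible after filtering; a real filter would need to be tuned against the patterns it feeds.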


Table 1 Table summarising related work key features and datasets used

Name | Approach | Notable features | Dataset | True positive (%) | F1 score (%)
AndroSimilar [18] | Static | Efficient processing | 5993 Google Play samples; 3309 malicious samples; 5139 third-party app store samples | 80.7 | 41.7
DroidMoss [70] | Static | Efficient processing | 1200 third-party apps (200 from each app store) | 89.3 | –
Compton et al. [14] | Static | Efficient processing | 24,000 malicious samples | 44.9 | –
TaintDroid [16] | Dynamic | Behavioural analysis; Real-time detection | 1100 Android Market samples | – | –
FlowDroid [9] | Static | Efficient processing | SecuriBench micro | 96.7 | 90
Isohara et al. [33] | Dynamic | System call sequences | 230 applications | – | –
SCSDroid [40] | Dynamic | System call sequences; Behavioural analysis; Efficient processing | 49 malicious samples; 100 benign samples | 96 | 94.1
CopperDroid [66] | Dynamic | System call sequences; Behavioural analysis | Android malware genome dataset; Contagio mobile dataset; McAfee dataset; total 2900 samples | 73 | –
DNADroid [22] | Static/Dynamic | Behavioural analysis; Real-time detection; Crypto ransomware | 1928 malicious samples; 2500 benign samples | 98.1 | 92.1
SWORD [11] | Dynamic | System call sequences | 1000 malicious samples; 1000 benign samples | 95.8 | 89.2
Sun et al. [65] | Dynamic | Behavioural analysis | 122 malicious samples; 166 benign samples | 85.2 | 85.9
S2A2DE [42] | Dynamic | System call sequences; Behavioural analysis; Real-time detection; Finite state machines | IDEVAL dataset | 100 | –
This work | Dynamic | System call sequences; Behavioural analysis; Real-time detection; Efficient processing; Finite state machines; Crypto ransomware | 213 malicious samples; 502 benign samples | 100 | 99.2

Fig. 1 Methodology process overview
[Flow diagram. Phase 1, Observation: malicious and non-malicious applications, sandbox environment, raw system call log, manual observation by an analyst, regular expressions. Phase 2, Evaluation: filtering layers L1, L2, ..., LN, formatting with tokens, formatted system call logs, pattern matching, dataset of formatted system call logs and detected patterns.]


2.5 System call log collection

The first part of our approach is the collection of system call logs. To achieve this, we devised an automatic process of installing applications and tracing system call logs. The environment we used was a Google Pixel 2 emulator running API level 24, created on Android Studio. To automate the process of installing and starting applications, we used Android Debug Bridge [25] (ADB) and Android Monkey [27], a program used for generating events on an application. To acquire the system call logs, we ran strace [39], a command line tool originally utilised on Linux, to extract and capture the system calls from each application during runtime. The parent process (Zygote) was traced to ensure we capture all behaviours produced by the applications. Figure 2 provides an abstract overview of this process.

Fig. 2 Overview of system call log collection process

During the observation phase, we noticed that Android ransomware often prompts for admin privileges. Hence, we automatically accepted the requested permissions for each application. Additionally, to simulate a real-user experience, we used Android Monkey to insert events periodically during the application’s runtime. This is described in more detail in Sect. 2.6.

2.6 Acquisition of behavioural patterns

To acquire a set of high-level common behavioural patterns for crypto ransomware, a pilot test was conducted by evaluating 10 crypto ransomware samples from five families obtained from CICAndMal2017 [38] and Koodous [36]. Each application was executed 10 times and manually observed during runtime to comprehensively acquire their malicious behaviour. Additionally, 10 benign samples were also analysed to observe the differences in behaviour.
The five ransomware families used for our pattern observation phase consisted of: WannaLocker, DoubleLocker, SimpleLocker, Filecoder, and Wipelocker. All samples were evaluated from each of these families to acquire our common high-level behaviours. The common high-level behavioural patterns were derived from manual observation of the system call logs. Furthermore, the samples used within our pattern observation phase are excluded from our dataset of malicious applications to avoid any potential bias within our evaluation phase in Sect. 4. During the observation phase, we were able to discover 12 behavioural patterns. We classified the behavioural patterns into three categories: five of these patterns are classified as Malicious, four are classified as Suspicious, and three are General behavioural patterns.

2.7 Pattern acquisition and classification

Our method of acquiring the patterns was based on our deduction in the observation phase. This was achieved by going through each application and identifying malicious (or potentially malicious) behaviour and its respective high-level system call counterpart via the captured log. Our aim is to observe common high-level behavioural patterns specifically focusing on crypto ransomware. However, not all captured behavioural patterns correlate to malicious behaviour. For example, consider the creation of a socket to connect to an external URL to transfer specific resources. This type of behaviour occurs in both benign and malicious applications. However, the usage will differ. A malicious application often uses that connection to contact a Command and Control (C&C) server [54] to download the payload, whereas a benign application would use the connection to download resources; this often occurs in applications requiring frequent updates, such as online mobile games, or linking accounts, such as social media accounts. Therefore, to aid in distinguishing the behaviour of patterns, we created a classification to better represent the patterns detected.
Patterns in the Malicious category are explicitly classified as malicious behaviours. Applications that contain Malicious patterns contain malicious segments that resemble behaviour of crypto ransomware. Behavioural patterns classified in the Suspicious category are deemed as potentially malicious. These types of patterns can lead to malicious behaviour. However, the behaviour by itself does not indicate any malice. Patterns in the General category are common benign behaviours that exist in malicious and benign applications with low indication of malicious behaviour.
Note: Suspicious and General patterns are not used in our evaluations. These patterns were primarily identified and created to aid future detection systems that utilise common high-level behaviour. Furthermore, crypto ransomware exhibits distinct malicious behavioural patterns unlike other types of malware, such as Adware and Trojans, where the malicious behaviours are not always immediately evident. The inclusion of these two pattern categories will be more beneficial in those types of malware.
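The three-way classification can be written down as a small lookup table. The Python sketch below is illustrative only: pattern names follow Sects. 2.7.1 to 2.7.3 where the paper names them, the three Suspicious labels marked as hypothetical are invented here because the paper describes those patterns without naming them, and the verdict rule is the offline rule stated in Sect. 4.2 (one Malicious match is enough).

```python
from collections import Counter
from enum import Enum


class Category(Enum):
    MALICIOUS = "Malicious"
    SUSPICIOUS = "Suspicious"
    GENERAL = "General"


PATTERN_CATEGORY = {
    "Rename & Unlink File": Category.MALICIOUS,
    "Unlinking User Files": Category.MALICIOUS,
    "Unknown File Ext Created": Category.MALICIOUS,
    "Read User File": Category.MALICIOUS,
    "Write File Unknown Extension": Category.MALICIOUS,
    "External IPv4 Connection": Category.SUSPICIOUS,   # hypothetical label
    "Directory Search": Category.SUSPICIOUS,
    "Obfuscated File Created": Category.SUSPICIOUS,    # hypothetical label
    "Network Info Acquisition": Category.SUSPICIOUS,   # hypothetical label
    "File Read": Category.GENERAL,
    "File Write": Category.GENERAL,
    "Generic File Unlink": Category.GENERAL,
}


def summarise(detected):
    """Count detected pattern names per category."""
    return Counter(PATTERN_CATEGORY[name] for name in detected)


def is_flagged(detected):
    """Offline rule from Sect. 4.2: one Malicious match flags the application."""
    return summarise(detected)[Category.MALICIOUS] > 0
```

Keeping Suspicious and General matches in the summary, even though they do not affect the verdict, mirrors the stated intent that these categories serve future detection systems rather than the present evaluation.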

123
Real-time system call-based ransomware detection

2.7.1 Malicious patterns

Our first malicious pattern observed from the logs was related to file renaming and unlinking within the user’s main directory (Rename & Unlink File). This behaviour was observed in the WannaLocker sample, which renamed the initial encrypted file using an unknown file extension. Once the file extension had changed, the ransomware proceeded to unlink the user’s original file that was related to the encrypted file. We only looked for this pattern in files within the user directory or external directory (SDcard) as these directories are the points of interest for crypto ransomware due to the importance of the files residing within them (often important to the users, such as photos, notes, and other important documents, but not required for the system to work) [58]. The main system call sequence observed began with renameat, followed by an fstat, which always occurred before an unlinkat operation.
The next malicious pattern from our observations was unlinking of users’ files. This behaviour is normally exhibited by crypto ransomware after the file encryption process has occurred [24, 31]. From our analysis, we found consistent occurrences of this pattern in both benign and ransomware samples during our observation phase. However, in the benign samples, the unlinked files were application specific (i.e. within the application’s directory) and were unrelated to the user specific directories. There are, however, specific benign applications, such as cache-cleaning applications, which can unlink files within the user directories and cause potential false positives. This issue is further discussed in Sect. 4.3.3. The sequence for this pattern began with an unlinkat system call followed by the location of the user directory, and the type of file removed.
Another malicious behavioural pattern discovered was the creation of files with unknown file extensions within the user’s main directory (Unknown File Ext Created). From the different samples observed, this was a prevalent behaviour for crypto ransomware where a new file was created to hold the encrypted data of the original user’s file. This encrypted file had a nonstandard file extension and the file name consisted of the original file’s name including its original file extension. The main sequence of tokens for this pattern started with an openat system call followed by the user directory token, then searched for any files created not matching a regular file extension type.
It is worth noting that it is entirely possible that apps such as games produce temporary file types with arbitrary extensions, leading to potential false positives. However, it would be difficult to ascertain the extent of this as the numbers would be dependent on the apps chosen to perform the analysis. Furthermore, fairly and accurately evaluating the occurrence of temporary files would be challenging given our current methodology of automating the applications with randomly simulated interactions.
The last two common malicious patterns discovered were reading of user files and writing to a file with an unknown file extension. These two behavioural patterns represented the encryption segment of a crypto ransomware. This was a common behaviour that occurred in all of our ransomware logs.
The first pattern that represents the encryption component is Read User File. This pattern focuses on capturing the behaviour of applications reading three times in succession from a file within the user directory. From our observation phase, some of the malicious variants observed read the contents of files within the user directory over multiple read operations in a specific block size, unlike the benign samples, which read the file contents in one single block. Hence, the pattern requires three read operations, which filters out apparent benign applications. The sequence of this pattern begins with an openat system call followed by the location of the user directory, then three read operations.
The second pattern of the encryption component is Write File Unknown Extension. This pattern observed the behaviour of applications writing data to a newly created file with an unknown file extension. This pattern, together with Read User File, represented the encryption behaviour seen from the various crypto ransomware in our observation phase. The sequence of tokens for this pattern starts with an openat system call with the user directory specified, followed by a file created with an unknown file extension and a write operation. Figure 3 provides an abstracted example of our process for modelling the aforementioned malicious behavioural patterns using regular expressions. We utilised a similar process for Suspicious and General patterns.

Fig. 3 Abstract view of representing ‘Unlinking of user files’ malicious pattern using regular expressions
[Unlinking User File, built from: (\d+) Process ID; (\d+\:\d+\:\d+) Timestamp; \bUnlinkat\b System call operation; /Storage/emulated/0 User directory location; (.(\w+)\.(\w+).*) File type]
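The five components listed in Fig. 3 can be assembled into a single expression. The Python sketch below is one possible reading, not the authors' expression: the layout of an strace line (PID, then timestamp, then call), the lower-case spelling of the call and path as strace prints them, and the names `UNLINK_USER_FILE` and `match_unlink_user_file` are all assumptions. Case-insensitive matching is used so that the capitalised spellings shown in the figure would also match.

```python
import re

USER_DIR = "/storage/emulated/0"  # user directory location from Fig. 3

# Assumed line layout: "<pid> <hh:mm:ss> unlinkat(..., "<path>", ...) = <ret>"
UNLINK_USER_FILE = re.compile(
    r"(?P<pid>\d+)\s+"              # process ID
    r"(?P<time>\d+:\d+:\d+)\s+"     # timestamp
    r"\bunlinkat\b[^\n]*?"          # system call operation
    + re.escape(USER_DIR) +         # user directory location
    r'/(?P<path>[^"\n]*\.(?P<ext>\w+))"',  # file name and file type
    re.IGNORECASE,
)


def match_unlink_user_file(line):
    """Return the captured fields if the line unlinks a file in the user directory."""
    m = UNLINK_USER_FILE.search(line)
    return m.groupdict() if m else None


hit = match_unlink_user_file(
    '4321 10:15:02 unlinkat(AT_FDCWD, "/storage/emulated/0/DCIM/photo.jpg", 0) = 0'
)
miss = match_unlink_user_file(
    '4321 10:15:03 unlinkat(AT_FDCWD, "/data/data/com.example/cache/a.tmp", 0) = 0'
)
```

An unlink inside an application's private directory does not match, which reflects the observation that benign samples mostly unlink application-specific files.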


2.7.2 Suspicious patterns

The first suspicious pattern we noted was applications making connections to an external IPv4 address. This could mean a malicious app making a connection to a C&C server; however, it can also be a non-malicious app connecting to the outside internet. We therefore classified this pattern as suspicious but not malicious. The sequence of this pattern observes any connect system call followed by an IPv4 address.
Another suspicious behavioural pattern was directory searching. This behaviour is traditionally exhibited by crypto ransomware, which searches for user files within the device to encrypt. However, this behaviour does not inherently signify malicious behaviour as there are benign applications that can exhibit the same behaviour, such as cache-cleaning applications. The sequence consists of an openat system call and a directory name, then a sequence of getdents64 (system call for getting directory entries), ending with a close.
The next notable suspicious pattern discovered in some ransomware samples was the creation of an obfuscated file. This file had no file extension and the content contained an external URL. Similar to the first suspicious pattern, we were unable to validate the legitimacy of the URL address. However, many of the ransomware logs observed contained URL addresses that were related to C&C servers. The sequence of tokens for this pattern comprised an openat system call, then any obfuscated file name with no file extension, followed by a pwrite64 operation with the contents matching any URL address.
The last suspicious pattern was the acquisition of network information via getaddrinfo. From our observations, the majority of ransomware applications attempted to acquire network information, such as socket addresses and socket types, from unknown domains via getaddrinfo. However, this does not necessarily indicate malice as we discovered legitimate trusted domains in benign applications, such as googleadservices. This pattern began by matching a socket system call followed by the subsequent sequence of system calls: setsockopt, connect, fcntl64, fstat64, and concluding with a match for a URL address.

2.7.3 General patterns

There are three patterns in the General category. These patterns consist of simple file I/O operations, read and write file behaviour, and generic file unlinking (targets known file extensions in any directory location), such as temporary files (.tmp, _tmp), backup files (.bak), or file locks (.flock).
The patterns in the General category aim to provide more detailed information regarding an application’s behaviour regardless of whether the application is malicious or benign. For File Read and File Write, the sequence started with an openat system call, then a read or write operation. The last pattern, Generic File Unlink, matches any unlinkat system call. During our observation phase, benign applications normally unlinked files, such as .flock, .xml, .bak, or .db-wal, which were files unrelated to the user. Hence, Generic File Unlink focuses on these specific file extensions.

3 Implementation with streaming system calls

The previous section described an offline methodology for detecting crypto ransomware utilising system call data. The main limitation of the approach is the offline data collection process, which is not scalable and not indicative of a real-world scenario where data and information are constantly generated in real-time. We improved this through a new streaming architecture, where each line of system call output generated by strace is processed in real-time. This approach consists of two primary modules, the Process Token Module and the Detection Module. Figure 4 provides an abstract overview of our proposed approach, with the following subsections further elaborating on each module.

3.1 Process token module

To stream the system call data (i.e. capture the system call data in real-time), we used Android Debug Bridge (ADB) and strace on an Android emulator running Android 7.0 Nougat (API level 24). The process observed using strace was the parent process (Zygote), which allows us to capture a broad range of behaviours, such as the application’s behaviour and application to Operating System (OS) interactions occurring within the device. System calls produced by strace are sent to the Process Token Module, which checks if it is a white-listed system call, then formats the system call with a separation character (;) and converts it into a unique token for the Detection Module. By adopting a streaming approach, we were able to provide a more realistic, real-world evaluation of our offline approach of using system call behavioural patterns to detect crypto ransomware in real-time.
Not all system calls recorded by strace are relevant to the behaviour of an application of interest. Examples are clock_gettime(), which periodically records the system clock time, and gettimeofday(), which can acquire the current time and the timezone, irrespective of application behaviour. For the streaming process, we filtered out such system calls using a method similar to the one used in our offline approach, which was mentioned in Sect. 2.4. We improved this process by white-listing a smaller subset of system calls used for crypto ransomware

(e.g. open, write, read), which provides a further reduction in the processing and detection time.
It should be noted that in this work only a small subset of system calls is observed as it enables us to utilise them more efficiently in Finite State Machines (FSM). Through the analysis of system calls, we have observed that the incorporation of additional system calls such as fstat and network related calls does not significantly contribute to detection of crypto ransomware at a system call-level. Our emphasis is on a core set of system calls that have been shown to be sufficient for identifying malicious behaviour based on initial observations described in Sect. 2.6.
After the initial filtering process, each system call was formatted using the separation character ; for easier token conversion (e.g. <pid>; <timestamp>; <system call>; <arguments>), then converted into unique tokens to be utilised by the FSMs (i.e. token FSMs) in the Detection Module. This was done to reduce the number of state transitions required. The conversion process condensed each system call into a unique token. To convert system calls into tokens, we developed a set of unique tokens (provided in Table 2), derived from regular expressions, that matched each system call based on the operation and system call arguments.

Fig. 4 Block diagram for streaming approach
[Block diagram. Process Token Module: strace, system call, check white-list, reformat system call, convert to token. Detection Module: layer 1 token, advance layer 1 FSMs, check final state, layer 2 token, advance layer 2 FSMs, check final state, malicious behaviour detected.]

Table 2 Token representations of system calls

Tokens | Text representation
O_U_CREATE | Open create unknown file extension
O_UDIR_FILE | Open user file
O_UD | Open user directory
RD | Generic read
O_OBF | Open obfuscated filename
S | Generic socket
W | Generic write
O | Generic open
C_DQ | Connect to dotted quad address
SS | Generic setsockopt
FC_64 | Generic fcntl64
W_GA | Write getaddrinfo
FS_64 | Generic fstat64
GET_ENT64 | Get entries in directory
U_UDIR | Unlinking file in user directory
G_U | Generic unlink
PW_64 | Generic pwrite64
PW_64_AD | Pwrite64 URL
RN_UDIR | Rename file in user directory
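The three steps of the Process Token Module (white-list check, reformatting with the `;` separator, conversion to a token) might look as follows in Python. This is a minimal sketch under several assumptions: the strace line layout, the four-call white-list, the user directory path, and the matching rules are all chosen for illustration, since the authors' regular expressions are not given. Only a handful of the tokens in Table 2 are covered.

```python
import re

# Assumed strace line layout: "<pid> <hh:mm:ss> <call>(<args>) = <ret>"
LINE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<ts>\d+:\d+:\d+)\s+(?P<call>\w+)\((?P<args>.*)\)\s+="
)
WHITELIST = {"openat", "read", "write", "unlinkat"}  # illustrative subset
USER_DIR = "/storage/emulated/0"                      # assumed user directory


def reformat(line):
    """Return '<pid>;<timestamp>;<call>;<args>' or None if not white-listed."""
    m = LINE.match(line)
    if m is None or m["call"] not in WHITELIST:
        return None
    return ";".join((m["pid"], m["ts"], m["call"], m["args"]))


def to_token(record):
    """Map a reformatted record to one of the Table 2 tokens (subset only)."""
    _pid, _ts, call, args = record.split(";", 3)
    in_user_dir = USER_DIR in args
    if call == "openat":
        return "O_UDIR_FILE" if in_user_dir else "O"
    if call == "unlinkat":
        return "U_UDIR" if in_user_dir else "G_U"
    return {"read": "RD", "write": "W"}[call]


record = reformat(
    '1234 10:15:02 openat(AT_FDCWD, "/storage/emulated/0/DCIM/a.jpg", O_RDONLY) = 57'
)
token = to_token(record)
```

Splitting with a maximum of three separators keeps any `;` that happens to occur inside the arguments within the last field.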

3.2 Detection module
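As a concrete picture of the two-layer matching described in this section, a minimal Python sketch follows. It is not the authors' implementation. The token sequences assigned to each layer 1 pattern are guesses based on the prose of Sect. 2.7 and the tokens of Table 2, the layer 2 sequences are those listed in Table 3, and two behaviours are assumed: a layer 1 FSM restarts when it sees a token it does not expect, while a layer 2 FSM ignores behaviours it does not expect. Per-process tracking is omitted.

```python
class SeqFSM:
    """Linear FSM that reaches its final state once `symbols` are seen in order."""

    def __init__(self, name, symbols, strict):
        self.name, self.symbols, self.strict = name, symbols, strict
        self.state = 0

    def advance(self, symbol):
        """Consume one symbol; return True when the final state is reached."""
        if symbol == self.symbols[self.state]:
            self.state += 1
        elif self.strict:
            # Assumed layer 1 behaviour: an unexpected token restarts the pattern.
            self.state = 1 if symbol == self.symbols[0] else 0
        if self.state == len(self.symbols):
            self.state = 0
            return True
        return False


# Layer 1: token sequences per behavioural pattern (assumed).
LAYER1 = {
    "Directory search": ("O_UD", "GET_ENT64"),
    "Read user file": ("O_UDIR_FILE", "RD", "RD", "RD"),
    "Unknown file creation": ("O_U_CREATE",),
    "Write unknown file extension": ("O_U_CREATE", "W"),
    "Unlink user file": ("U_UDIR",),
}
# Layer 2: behaviour sequences from Table 3.
LAYER2 = {
    "Search read unknown create write": (
        "Directory search", "Read user file",
        "Unknown file creation", "Write unknown file extension"),
    "Search read unlink": ("Directory search", "Read user file", "Unlink user file"),
    "Search unknown create write": (
        "Directory search", "Unknown file creation", "Write unknown file extension"),
    "Search unlink": ("Directory search", "Unlink user file"),
}


class Detector:
    def __init__(self):
        self.layer1 = [SeqFSM(n, s, strict=True) for n, s in LAYER1.items()]
        self.layer2 = [SeqFSM(n, s, strict=False) for n, s in LAYER2.items()]

    def feed(self, token):
        """Return the names of layer 2 sequences completed by this token."""
        hits = []
        for fsm in self.layer1:
            if fsm.advance(token):
                # Layer 2 is advanced only when a layer 1 FSM reaches a final state.
                hits += [f.name for f in self.layer2 if f.advance(fsm.name)]
        return hits


detector = Detector()
stream = ["O_UD", "GET_ENT64", "O_UDIR_FILE", "RD", "RD", "RD", "U_UDIR"]
alerts = [name for tok in stream for name in detector.feed(tok)]
```

In this sketch any non-empty result from `feed` would mark the application as malicious, matching the rule that one layer 2 match is sufficient in the streaming approach.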


The Detection Module utilises the behavioural patterns previously discussed in Sect. 2.7. These behavioural patterns are converted into token FSMs, which are used in our detection phase. As each token is streamed from the Process Token Module, the Detection Module validates the current token against a set of FSMs. In this module, the proposed method includes two layers of finite state machines to acquire a more precise detection model for crypto ransomware. Suspicious and General patterns were not used in the Detection Module except for Directory Search, as those patterns did not provide additional benefits in the process of detecting malicious activity with this proposed implementation.
The first layer of FSMs consists of the individual behavioural patterns previously mentioned in Sect. 2.7. These behavioural patterns were converted into a more compact and generalised FSM to reduce the time taken to detect behaviour. Note that generalisations like this can increase the likelihood of false positives.
Crypto ransomware follows a distinct and common sequence of behaviours. Hence, to further distinguish the differences between malicious and benign behaviours, we have devised a second layer of FSMs, which determines if the sequence of matched patterns corresponds to the sequence of behavioural patterns exhibited by crypto ransomware. The second layer of FSMs represents the sequential occurrence of behaviours observed in crypto ransomware (i.e. combination of layer 1 FSMs). The second layer FSM will only be checked if the first layer FSM matches a pattern (i.e. a layer 1 FSM has reached a final state). The state transition of a


layer 2 FSM is the layer 1 FSM behavioural pattern name (e.g. Unlink user file, General unlink).

3.2.1 Creation of layer 1 FSMs

Layer 1 FSMs are based on previously discovered crypto ransomware behavioural patterns. However, as mentioned in Sect. 3.2, they were generalised and compacted through the utilisation of tokens. To acquire the token FSMs, we simplified the expanded regular expressions by removing fine-grain details, such as timestamps, newline matches (\n), and multi-line matches (((.|\n)*?)), as these matches were no longer needed due to the real-time streaming approach, which processes one token at a time rather than iterating over multiple lines of system calls. After this simplification, the system calls and their respective arguments used in the regular expression were converted into a unique token as previously explored in Sect. 3.1. Through this process of generalisation and compaction, we acquired tokenised FSMs. Figure 5 shows an example of this process, which takes the offline tokenised regular expression and expands it to the full regular expression. This is done to remove the fine-grain details, thus resulting in a more compact regular expression. After removing the fine-grain details, the regular expression is converted into a unique token, which is then created into a layer 1 token FSM.

Fig. 5 Transformation of behavioural patterns to layer 1 token FSM

3.2.2 Creation of layer 2 FSMs

Layer 2 FSMs focus on behaviour sequences (i.e. sequence of behavioural patterns from layer 1 FSMs). As previously mentioned, crypto ransomware exhibited distinct sequences of behaviours. To acquire the specific sequences of behaviours, we randomly selected six samples from six different ransomware families (one sample from each family) and manually observed the sequence of layer 1 FSMs detected. From this observation, we acquired four distinct sequences of behaviours commonly exhibited by crypto ransomware, as shown in Table 3. The symbol > is used to show the concatenation of individual behaviours (e.g. Directory Search > Unlink User File means a directory search behaviour followed by another behaviour, which unlinks user files). If one of these sequences is discovered in the second layer of FSMs, the application is considered malicious. Figure 6 shows an example of a layer 2 FSM.

Table 3 Sequence of common behaviours exhibited by crypto ransomware

Pattern name | Behaviour sequence
Search read unknown create write | Directory search > read user file > unknown file creation > write unknown file extension
Search read unlink | Directory search > read user file > unlink user file
Search unknown create write | Directory search > unknown file creation > write unknown file extension
Search unlink | Directory search > unlink user file

Fig. 6 Layer 2 FSM example for Search unlink

The streaming approach described in this section addresses the limitations of the previous offline approach by establishing an improved processing and detection system. This approach adopted the previously defined behavioural patterns, and created a real-time detection system utilising a two-layer FSM, which observed individual behavioural patterns and sequences of behavioural patterns, thus further validating the first half of our fourth research objective. In the following section, we evaluate the improvements of this streaming implementation compared to the previously established offline approach.

4 Evaluation

In this section we present the results of our comparison between the streaming implementation, which observed system calls in real-time and utilised a two-layer FSM approach to detect behavioural patterns, and the offline approach, which observed system call logs to detect behavioural patterns. We describe our process of acquiring the ransomware dataset, the methods used to evaluate our approaches, and the results of our experimentation, which consisted of detected malicious patterns, false positives within benign applications, and the overhead incurred by the streaming approach.
The host used in our evaluation ran macOS on a 2.3 GHz quad-core Intel Core i5 with 8 GB RAM. The Android emulator was created using Android Studio, and the emulator environment was a Pixel 2 running API level 24, Android 7.0 (Google APIs), with 2048 MB internal storage, 512 MB SDCard storage, and 1536 MB of RAM.

4.1 Dataset acquisition

To acquire the dataset of crypto ransomware samples, we retrieved the hash or package name publicised from established anti-virus vendors, such as Avast [10] and ESET [67], and relevant search tags, such as family name from Koodous [36]; then we manually verified each malicious application against VirusTotal [59] before downloading the APK from Koodous [36]. As our focus was crypto Android ransomware, it was difficult to acquire a large sample size due to the distinctive category. Nonetheless, we managed to acquire 500 distinct samples. Out of that set, 213 applications exhibited crypto ransomware behaviours. Applications that did not encrypt our files were manually re-evaluated to examine the potential cause of failure. From the re-evaluation, we discovered 18 samples required manual interaction to enable the encryption component. These 18 samples are included in the 213 samples.
From our observations via manual re-evaluation, we noticed several factors that caused the failure of encryption. Some of the samples required a connection to a C&C server that was no longer active. Additionally, some of the applications crashed upon start-up, thus preventing the malicious code from executing. Furthermore, there were applications that failed to install on the emulator due to issues such as a missing manifest file.
As part of our contribution, we produced a dataset of system call logs collected from our evaluation of 213 crypto ransomware¹. We hope this will enable others working on system call-based pattern detection to evaluate their own approaches, or expand and develop new behavioural patterns from their own observations.
Alongside our malicious dataset of crypto ransomware, we acquired 502 benign applications from APKPure [8] to evaluate the efficacy of our approach. Two of these samples were cache cleaning applications. These two special samples were included as these types of applications closely resembled the high-level behaviours of crypto ransomware, specifically the behaviour of removing user files. These two applications were tested separately with manual interaction to ensure we captured the cleaning process.

¹ As this dataset consists of active ransomware samples, access can be granted upon request through vimal.kumar@waikato.ac.nz.

4.2 Evaluation method

To evaluate the offline approach, we ran each application for two minutes using our automation script. This automation script installs and starts the applications and utilises Android Monkey [27] to inject random events to simulate real user interaction. Once all the system calls were extracted, we put them through our detection program, and calculated the number of all detected patterns for the different severity levels. A similar method was utilised for our streaming approach. However, rather than collecting system call logs, we piped the output of strace into our implementation and measured the number of layer 2 FSM matches (i.e. sequential behavioural patterns). We identified various malicious patterns for all six ransomware families. Any application containing a match for at least one malicious pattern (offline approach) or one layer 2 FSM match (streaming approach) was classified as malicious. Any falsely identified malicious patterns were noted within this evaluation.
This section details our evaluation of the six different crypto ransomware families. Figure 7a shows the individual malicious patterns detected in the offline approach and Fig. 7b shows the sequence of malicious patterns detected using the streaming approach. Although different patterns were utilised in the detection process (offline uses individual behavioural patterns, whereas streaming uses sequences of malicious behavioural patterns), the two figures indicate a similar outcome in detected behavioural patterns for crypto ransomware. This similarity shows that the streaming approach with an altered detection method, using sequences of behavioural patterns, is capable of successfully identifying shared common behavioural patterns in crypto ransomware and is comparable to our offline approach.
One of our research objectives was to evaluate the feasibility of the devised patterns for behavioural pattern detection against a set of crypto ransomware. The overall results of our evaluation in Fig. 7a, b provide visible indication of shared common behaviour among crypto ransomware regardless of the family. The only exception is WipeLocker, which demonstrates a singular behavioural pattern. WipeLocker is

Fig. 7 (a) Offline: malicious behaviour results. (b) Streaming: sequence of malicious behaviour results
[Two bar charts, percentage (0% to 100%) per ransomware family: Filecoder, Pletor, WannaLocker, SimpleLocker, WipeLocker, Black Rose Lucy. Panel (a) series: Writing to unknown file extensions, Reading of user files, Files created with unknown extensions, Unlinking user files. Panel (b) series: Search Read Unknown Create Write, Search Unlink, Search Read Unlink, Search Unknown Create Write.]

known to only remove user files, without encrypting them. Although there have been different classifications for WipeLocker [12], we chose to classify this specific family as a crypto ransomware based on the observed system behaviour (unlinking files) rather than the user perceived behaviour, such as ransom notes or displaying a perpetual window, which may result in a different classification. Further, in our evaluation, we were unable to find any match for the Rename & Unlink File pattern as this behaviour was likely tied to a specific variant of WannaLocker.
The results shown in this evaluation have validated the feasibility of our discovered malicious behavioural patterns for detection of crypto ransomware. Additionally, we have shown the feasibility of our streaming approach for detecting malicious patterns by achieving similar successful results to our offline approach.

4.3 Benign applications test

We tested both approaches on a dataset consisting of 502 benign applications. Two of the benign applications were cache-cleaning applications, which are discussed in a sep-

system calls with the flag O_CREAT could be excluded. This would ensure that only user created files were captured within this pattern.
– The third benign application that was falsely classified incorrectly matched the patterns Unlinking User Files and Read User File, due to the application creating and utilising temporary files within the user directory. This is one of the drawbacks of capturing high-level behaviour. In most cases, these patterns would capture unlinking of user created files and existing user file access and reads, which is a behaviour, often exhibited by crypto ransomware as part of the file encryption process. However, in the case of an application creating and utilising a file within the user directory, it would be classified as a false positive. A potential solution is to exclude files created by the application within the user directory, as previously suggested, or reduce and combine the behavioural patterns related to file encryption.
– The last three benign applications falsely classified were incorrectly matching two behavioural patterns: Unknown File Ext Created and Write File Unknown Extension. These patterns were falsely classified due to the appli-
arate section. In the following subsections, we explain the cations creating an application folder within the user
results of our experiments. directory and a file with an unknown file extension within
the application folder. Similar to the proposed solution
4.3.1 Offline method for the third application, combining behavioural patterns
related to file encryption could provide a more accurate
Out of the 500 benign applications (excluding the 2 cache representation. Alternatively, the pattern could be altered
cleaning apps), we encountered six falsely classified appli- to only check for primary directories (i.e. directories not
cations. This was due to a mismatch of four different patterns, created by the application), such as photographs, docu-
specifically, Unlinking User Files, Read User File, Unknown ments, and downloads.
File Ext Created, and Write File Unknown Extension.
We further extended this evaluation on our streaming
– Two applications incorrectly matched Read User File; approach by utilising the same dataset. However, we applied
this was due to the applications creating and reading incremental changes to refine the patterns. This is further
application related files within the user directory, such elaborated in the next section.
as dslv_state.txt. To mitigate this issue, openat
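The first mitigation above (excluding files that the application itself created through openat with O_CREAT) could be sketched as follows. This is only an illustrative sketch in Python: it assumes strace-style log lines and a user directory of /sdcard/, and the names OPENAT_RE, pre_existing_user_files and sample_log are hypothetical rather than part of the authors' implementation.

```python
import re

# Assumed strace-style record, e.g.:
#   openat(AT_FDCWD, "/sdcard/dslv_state.txt", O_WRONLY|O_CREAT, 0600) = 45
# The authors' formatted log may differ; this pattern is illustrative only.
OPENAT_RE = re.compile(
    r'openat\([^,]+,\s*"(?P<path>[^"]+)",\s*(?P<flags>[A-Z_|]+)'
)


def pre_existing_user_files(log_lines, user_dir="/sdcard/"):
    """Return user-directory paths opened without O_CREAT that the
    application did not open with O_CREAT earlier in the same log."""
    created = set()    # paths the application (probably) created itself
    pre_existing = []  # candidate user files, in order of first access
    for line in log_lines:
        match = OPENAT_RE.search(line)
        if match is None or not match["path"].startswith(user_dir):
            continue
        path = match["path"]
        if "O_CREAT" in match["flags"].split("|"):
            created.add(path)
        elif path not in created and path not in pre_existing:
            pre_existing.append(path)
    return pre_existing


# Hypothetical log excerpt used only to exercise the filter.
sample_log = [
    'openat(AT_FDCWD, "/sdcard/dslv_state.txt", O_WRONLY|O_CREAT|O_TRUNC, 0600) = 45',
    'openat(AT_FDCWD, "/sdcard/dslv_state.txt", O_RDONLY) = 46',
    'openat(AT_FDCWD, "/sdcard/DCIM/photo.jpg", O_RDONLY) = 47',
    'openat(AT_FDCWD, "/data/app/base.apk", O_RDONLY) = 48',
]

print(pre_existing_user_files(sample_log))
```

Only the photograph remains a candidate for the Read User File pattern here, because dslv_state.txt was first opened with O_CREAT. A file opened with O_CREAT may already exist, so this filter over-approximates what the application created.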

4.3.2 Streaming method

Our initial streaming approach contained one layer of FSMs where each pattern represented a behaviour, similar to the offline approach. When we evaluated this initial design on our benign dataset, we encountered 2.2% (11 out of 500) false positives and 100% true positives. To help alleviate the false positives, we applied a second layer of FSM as mentioned in Sect. 3.2.2, which captured the sequence of behaviours. After re-evaluating with the inclusion of the layer 2 FSM, we encountered a much higher false positive rate of 4.2% (21 out of 500) with unchanged true positive rates. The increase in false positive rate was caused by the combination of the suspicious patterns directory search and unlinking user file, which was present in 17 out of the 21 falsely classified benign applications. This issue occurred because the initial directory search pattern matched all folders within the user directory. This included the Android folder, where application-specific files were stored. The unlinking user file pattern had the same issue, where any file within the user directory was considered a match. To alleviate this issue, we restricted the Directory Search pattern to exclude the Android folder. This alteration significantly reduced the false positive rate to 1% (5 out of 500) whilst retaining the 100% true positive rate.

This method, however, can potentially produce false negatives, as applications may store valuable data for the user within the application-specific folders, or users may store their own files within the folder. To observe this, we tested the new pattern on 6 different crypto ransomware samples (from different families). Each sample was observed for 5 min in an emulated environment with trap files stored within the Android directory. In this test, 5 of the 6 ransomware samples encrypted the files within the Android folder; the exception was WannaLocker, which did not. These results posed an issue, as the exclusion of the Android folder limited the scope of our detection process.

To mitigate this issue without compromising on the detection rate, we observed the differences in behaviour between benign applications and crypto ransomware, specifically the behaviour of directory search. We noticed that with crypto ransomware, a directory search occurred for multiple folders within the user directory to ensure a widespread effect. However, for benign applications this search was less frequent, except for specific applications, such as cache-cleaning applications. To evaluate this theory, the directory search pattern was altered to detect directory searches that occurred two or more times in separate directories. With this alteration, the false positive rate was reduced to 0.4% (2 out of 500) with 100% true positives. This cut the number of false positives by a factor of 2.5 (from 5 to 2) compared to the methodology of excluding the Android directory, without compromising on the scope and accuracy of our detection. Hence, we utilised this methodology in our detection system.

Utilising the Altered Directory Search method, two false positives were detected. These two false positives consisted of search_unlink sequences. This was likely caused by the applications accessing the same user directory multiple times (i.e. the Android directory) and unlinking application-related files. As the system calls were abstracted into tokens, the detection system was unable to identify fine-grained details, such as different user directories being accessed (i.e. if the same user folder was accessed twice, it would be considered a directory search pattern). This is one of the known limitations of our proposed streaming approach.

4.3.3 Cache-cleaning applications

As previously detailed in Sect. 2.7.1, specific benign applications, such as cache-cleaning applications, could produce behaviours which can potentially be deemed malicious if the context is not known (e.g. unlinking junk files within the user directory). Hence, we separately evaluated two cache-cleaning applications to evaluate the efficacy of our approaches. Using the offline methodology mentioned in Sect. 2.4, one of the cache-cleaning applications resulted in a false positive. There were four malicious patterns matched in total, and all four were linked to Read User File. From the examination of the patterns file and system call log file, these four patterns were reading the contents of user-created files (i.e. pre-existing files, not created by the application), which would be deemed malicious behaviour, as it is unusual for most benign applications to read the contents of user-created files.

Table 4 contains a summary of our results for the offline approach. The Percentage column shows the percentages of true negatives and false positives detected for all benign samples evaluated. The Absolute Number column denotes the number of true negative and false positive samples detected. Table 5 provides an overview of the true negatives and false positives of the 502 benign applications for the streaming approach with the 4 aforementioned alterations. Additionally, the evaluation results for the cache-cleaning applications have also been included.

Table 4 Summary of all benign applications evaluated using offline approach

Benign samples   Percentage   Absolute number   Sample size
True negative    98.6%        495               502
False positive   1.4%         7

We can see that the false positive rates of our streaming approach have noticeably improved (using the Altered Directory Search method) compared to the offline approach.
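As a quick arithmetic check, the false positive rates quoted in Sect. 4.3 can be recomputed from the counts in Tables 4 and 5. The short Python sketch below does only that; the dictionary labels and the helper name rate are illustrative choices, not the authors' terminology.

```python
def rate(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# (false positives, sample size) as reported in Tables 4 and 5
results = {
    "offline, incl. cache cleaners": (7, 502),
    "streaming, layer 1": (11, 500),
    "streaming, layer 2": (21, 500),
    "restricting user directory": (5, 500),
    "altered directory search": (2, 500),
    "altered search, incl. cache cleaners": (4, 502),
}

fp_rates = {name: rate(fp, n) for name, (fp, n) in results.items()}

for name, value in fp_rates.items():
    print(f"{name}: {value}% false positives")
```

On these counts, the streaming approach with the Altered Directory Search ends at 0.8% over all 502 samples, against 1.4% for the offline approach.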

This was due to the introduction of a layer 2 FSM, which observed sequences of behaviours and thus further distinguished between benign and malicious application behaviour. Additionally, based on our observations, we made incremental alterations to the patterns based on the behaviours exhibited by benign and malicious applications to identify the best-fit method for our approach. The false positive rates show that detecting ransomware, and malware in general, through behaviours exhibited in system calls is feasible.

Table 5 Summary of benign evaluation with the streaming approach using aforementioned methods

Methodology                        True negative   False positive   Sample size
Layer 1 evaluation                 489             11               500
Layer 2 evaluation                 479             21               500
Restricting user directory         495             5                500
Altered directory search           498             2                500
Incl. cache-cleaning application   498             4                502

Bold is to highlight the most significant result (in terms of highest accuracy) achieved in that specific evaluation.

4.4 Performance evaluation

A critical aspect of such a detection system is the time it takes to detect malicious activity, which affects its feasibility in a real-world environment. We tested both our offline and streaming approaches on this aspect.

To evaluate the pattern matching time, we executed a malicious ransomware variant 10 times on each approach for 120 s. For the offline approach, the log file was recorded once. However, the detection component was executed 10 times on the same log file. This was done to ensure consistent results. Table 6 shows a summary of our results. Offline indicates the offline approach, Single Match represents individual behaviours matched (i.e. layer 1 FSM), and Sequential Match is the combination of individual behaviours matched in sequential order (i.e. layer 2 FSM) in the streaming approach. To calculate the Offline time, we measured the average time taken to match a pattern using the regular expression. For Single Match and Sequential Match, we measured the average time from the first transition to the last transition of the FSMs (layer 1 and layer 2, respectively). It should be noted that the time to label an application as ransomware is the average time defined in sequential matches. For example, it will take approximately 0.335 s to determine whether a running application exhibited a malicious Unlink User File pattern, and therefore to label the application as ransomware. As can be seen, the pattern matching times in the streaming approach are significantly lower than in the offline approach. This was due to the change in the design of the architecture by introducing a tokenised FSM approach, which retained the current state without the intricacies of regular expression matching.

Table 6 Average detection time for individual patterns in seconds

Pattern name                Offline (s)       Single Match (s)   Sequential Match (s)
Unlink user file            0.623 ± 0.0081    0.026 ± 0.0171     0.335 ± 0.0908
Unknown file ext            0.670 ± 0.0076    0.175 ± 0.3068     0.406 ± 0.1437
Read user file              0.738 ± 0.0079    0.021 ± 0.0112     0.384 ± 0.0897
Write to unknown file ext   0.661 ± 0.0027    0.024 ± 0.0120     0.454 ± 0.0982

We conducted another evaluation to assess the efficacy of our streaming approach by measuring the number of system calls that can be processed per second (i.e. throughput). In order to do this, we observed 10 random benign samples for 120 s and measured the average CPU time (user time + system time) of all samples. We then acquired the average number of system calls generated from all samples and computed the number of system calls that can be processed by our streaming approach per second (i.e. Throughput = Number of system calls / CPU time). The throughput produced from our streaming approach can be compared to the number of system calls that can be produced by the application over 120 s (i.e. Application run-time throughput = Number of system calls / 120 s) to determine the feasibility of our approach. In our experiment, we found that the average number of system calls generated from our applications over 120 s was 134,020 ± 96,078, and the average CPU time for our streaming approach was 17.57 s ± 12.975 s. From these two values, the calculated throughput of our streaming approach was 7628 system calls/s. In comparison to the number of system calls produced by the application over 120 s, which is 1117 system calls/s, the results indicate that our proposed streaming approach is feasible, as it is capable of processing more system calls than an application can generate.

4.5 Discussion

In this section we discuss some of our observations as well as experiences.

As established by now, we were observing the behaviour of crypto ransomware on the Android operating system. In order to do this, we needed to acquire and then execute the ransomware samples on our VM. The process of acquiring and validating these samples was very time-consuming, as each downloaded sample had to be manually checked against VirusTotal [59] to ensure that the malware was of a crypto ransomware family. Crypto ransomware that executes on Android is a subset of all crypto ransomware, which limited the number of samples we could collect. Since we needed the ransomware to actually execute, this further limited the number of samples that we could use, because a large number of the samples we collected did not execute. Of the 500 samples
we collected, 213 exhibited crypto ransomware behaviour. The remaining 287 samples could not be utilised due to one of the following reasons:

– The application not executing due to missing manifest files
– The application not executing due to incompatible Android versions
– The application not exhibiting crypto ransomware behaviour
– The application requiring a connection to a C2 server

Static and code analysis techniques that only consider the executable file(s) and do not need to execute the ransomware do not generally face these issues. As a result of this limitation, we acknowledge that our models could potentially lead to the issue of an overfitted solution due to the low malicious sample size. However, the samples that we did collect covered the vast majority of crypto ransomware samples on Android devices; although limited, we believe this is close to the extent of the current Android crypto ransomware that we can obtain through publicly accessible and legal means.

As mentioned in Sect. 2.2, we focus specifically on crypto ransomware as it is more prevalent and destructive compared to locker-type ransomware. The system call-level behaviour of locker-type ransomware is different from that of the crypto type. We therefore do not believe it would be feasible to accurately detect locker-type ransomware using the current behavioural implementation without further significant adjustments. While the issue of the limited number of samples in the dataset can be addressed by observing more malware types, as this work focuses on crypto ransomware, the behavioural patterns were specifically designed to only capture crypto ransomware. Different malware types are likely to exhibit stark differences in behavioural patterns at a system call level. Hence, it would not be feasible to achieve a fair comparison in the classification process for discriminating malware and crypto ransomware, as potential matches would be coincidental. This issue can be alleviated by further extensive evaluation to understand the underlying behavioural patterns for each malware type. As part of our future work, we aim to explore the adjustments required and broaden our approach to include other types of malware, such as trojans and spyware, or introduce different variants of our dataset to counteract the aforementioned issues and concerns.

It needs to be noted that the intention of this work was the creation of FSM models and behavioural patterns, which currently require manual observation and human interaction. This often makes the process time-consuming and difficult. For our future work, we intend to further develop our approach by automating the process of identifying behavioural patterns and FSM creation, thus alleviating the requirement of human interaction and enabling us to create a fully automated self-protecting system. Additionally, as all experiments were conducted in an emulated environment, the performance evaluation results, while indicative of acceptable performance, do not truly reflect a real-world implementation. In the current state of the art, the implementation of such a system is a challenging problem due to the requirement of root privileges and the structure of the Android system. However, in future, if the acquisition of system calls were more easily accessible, we intend to implement the streaming approach on a real user device.

An astute reader would also make the observation that the events in the layer 2 FSM are allowed to occur in any order except for the last detected behaviour, thus resulting in a partial shuffling of events. This provides flexibility in the detection process. However, a potential limitation of this partial shuffle is the last event in a layer 2 FSM, which always occurs in the same position (e.g. Search Read Unlink = Directory Search OR Read User File > Read User File OR Directory Search > Unlink User File). Even though our evaluation for detecting crypto ransomware was successful, there is potential for false negatives if a malicious application exhibits a malicious sequence of behaviour which does not match the last occurring behaviour. In future, we would like to expand this work by utilising a full shuffle approach or a fixed sequence of occurring events and compare the differences in detection rates.

While our proposed approach is capable of achieving good detection rates, there are potential improvements that can be implemented to develop a more robust detection system. As previously mentioned in Sects. 2 and 2.3.1, the use of static analysis is also valuable, and modern anti-malware systems use a hybrid approach. Our dynamic analysis-based approach can determine whether an application is malicious or benign; however, it has a small but nonzero detection time which
would mean a small amount of data would be encrypted even in the case of a successful detection. Therefore, while we believe that our approach is successful, a complete and practical anti-ransomware system will additionally include a static analysis-based approach to identify known ransomware. The inclusion of static analysis provides reliability. Hence, in future, an interesting avenue to explore is to employ the use of static analysis in our proposed method to develop a more robust and reliable detection approach.

5 Conclusion

In this work, we have described and evaluated a behaviour-based ransomware detection method. We first identified system call-level behavioural patterns for crypto ransomware. We presented our methodology for collecting and identifying behavioural patterns at a system call level. Using this methodology, we were able to discover 12 common high-level behavioural patterns at a system call level. We then evaluated the effectiveness of the behavioural patterns we had identified. This was achieved by evaluating them against a set of crypto ransomware to identify shared commonalities between different families using pattern matching. We have also made our dataset of formatted system calls publicly available. We then improved upon our initial approach to detect crypto ransomware in real-time using a 2-layer token-based finite state machine streaming approach. Finally, we analysed the performance of our approach to demonstrate that our ransomware detection system can run on an Android operating system with acceptable overhead.

Funding Open Access funding enabled and organized by CAUL and its Member Institutions.

Research Data Policy and Data Availability Statements The dataset used in this article is available on request by contacting vimal.kumar@waikato.ac.nz.

Declarations

Conflict of interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this article.

Ethical approval The authors declare that this article does not contain any studies involving human participants or animals.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Alam, S., Horspool, R., Traore, I., Sogukpinar, I.: A framework for metamorphic malware analysis and real-time detection. Comput. Secur. 48(C), 212–233 (2015). https://doi.org/10.1016/j.cose.2014.10.011
2. Alon, U., Zilberstein, M., Levy, O., Yahav, E.: code2vec: learning distributed representations of code. Proc. ACM Program. Lang. 3(POPL), 1–29 (2019)
3. Al-Rimy, B.A.S., Maarof, M.A., Shaid, S.Z.M.: Ransomware threat success factors, taxonomy, and countermeasures: a survey and research directions. Comput. Secur. 74, 144–166 (2018). https://doi.org/10.1016/j.cose.2018.01.001. https://www.sciencedirect.com/science/article/pii/S016740481830004X
4. Al-rimy, B.A.S., Maarof, M.A., Shaid, S.Z.M.: Ransomware threat success factors, taxonomy, and countermeasures: a survey and research directions. Comput. Secur. 74, 144–166 (2018)
5. Alzahrani, N., Alghazzawi, D.: A review on android ransomware detection using deep learning techniques. In: Proceedings of the 11th International Conference on Management of Digital EcoSystems, pp. 330–335. Association for Computing Machinery, New York (2019)
6. Amer, E., El-Sappagh, S.: Robust deep learning early alarm prediction model based on the behavioural smell for android malware. Comput. Secur. 116, 102670 (2022). https://doi.org/10.1016/j.cose.2022.102670
7. Andronio, N., Zanero, S., Maggi, F.: Heldroid: dissecting and detecting mobile ransomware. In: Proceedings of the 18th International Symposium on Research in Attacks, Intrusions, and Defenses, RAID 2015, vol. 9404, pp. 382–404. Springer, Berlin, Heidelberg (2015). https://doi.org/10.1007/978-3-319-26362-5_18
8. APKPure. Download APK on Android with Free Online APK Downloader - APKPure. https://apkpure.net/. Accessed 21 Feb 2024
9. Arzt, S., Rasthofer, S., Fritz, C., Bodden, E., Bartel, A., Klein, J., Le Traon, Y., Octeau, D., McDaniel, P.: Flowdroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps. In: ACM Sigplan Notices, vol. 49, pp. 259–269. ACM, Association for Computing Machinery, Edinburgh (2014)
10. Avast Blog. https://blog.avast.com/. Accessed 21 Feb 2024
11. Bhandari, S., Panihar, R., Naval, S., Laxmi, V., Zemmari, A., Gaur, M.S.: Sword: semantic aware android malware detector. J. Inf. Secur. Appl. 42, 46–56 (2018)
12. Chen, J., Wang, C., Zhao, Z., Chen, K., Du, R., Ahn, G.J.: Uncovering the face of Android ransomware: characterization and real-time detection. IEEE Trans. Inf. Forens. Secur. 13(5), 1286–1300 (2017)
13. Chew, C.J.W., Kumar, V., Patros, P., Malik, R.: Escapade: encryption-type-ransomware: system call based pattern detection. In: Kutyłowski, M., Zhang, J., Chen, C. (eds.) Network and System Security, pp. 388–407. Springer, Cham (2020)
14. Compton, R., Frank, E., Patros, P., Koay, A.: Embedding java classes with code2vec: improvements from variable obfuscation. In: Proceedings of the 17th International Conference on Mining Software Repositories, MSR '20, pp. 243–253. Association for Computing Machinery, New York (2020). https://doi.org/10.1145/3379597.3387445

15. Bansal, U.: A review on ransomware attack. In: 2021 2nd International Conference on Secure Cyber Computing and Communications (ICSCCC), pp. 221–226. IEEE Computer Society, Jalandhar (2021). https://doi.org/10.1109/ICSCCC51823.2021.9478148
16. Enck, W., Gilbert, P., Han, S., Tendulkar, V., Chun, B.G., Cox, L.P., Jung, J., McDaniel, P., Sheth, A.N.: TaintDroid: an information-flow tracking system for realtime privacy monitoring on smartphones. ACM Trans. Comput. Syst. (TOCS) 32(2), 5 (2014)
17. Faruki, P., Bharmal, A., Laxmi, V., Ganmoor, V., Gaur, M.S., Conti, M., Rajarajan, M.: Android security: a survey of issues, malware penetration, and defenses. IEEE Commun. Surv. Tutor. 17(2), 998–1022 (2014)
18. Faruki, P., Laxmi, V., Bharmal, A., Gaur, M.S., Ganmoor, V.: AndroSimilar: robust signature for detecting variants of Android malware. J. Inf. Secur. Appl. 22, 66–80 (2015)
19. Ferdous, J., Mahboubi, A., Islam, Md.: A review of state-of-the-art malware attack trends and defense mechanisms. IEEE Access 11, 121118–121141 (2023). https://doi.org/10.1109/ACCESS.2023.3328351
20. Gandotra, E., Bansal, D., Sofat, S.: Malware analysis and classification: a survey. J. Inf. Secur. 05, 56–64 (2014). https://doi.org/10.4236/jis.2014.52006
21. Gazet, A.: Comparative analysis of various ransomware virii. J. Comput. Virol. 6(1), 77–90 (2010)
22. Gharib, A., Ghorbani, A.: Dna-droid: a real-time android ransomware detection framework. In: Yan, Z., Molva, R., Mazurczyk, W., Kantola, R. (eds.) Network and System Security, pp. 184–198. Springer, Cham (2017)
23. Ghillani, D., Gillani, D.H.: A perspective study on malware detection and protection, a review. (2023). https://doi.org/10.22541/au.166308976.63086986/v1. https://www.authorea.com/users/506161/articles/585873-a-perspective-study-on-malware-detection-and-protection-a-review. Accessed 21 Feb 2024
24. Gonzalez, D., Hayajneh, T.: Detection and prevention of crypto-ransomware. In: 2017 IEEE 8th Annual Ubiquitous Computing, Electronics and Mobile Communication Conference (UEMCON), pp. 472–478. IEEE Computer Society, New York (2017)
25. Google: Android Debug Bridge (ADB) (2020). https://developer.android.com/studio/command-line/adb
26. Google: help protect against harmful apps with google play protect (2019). https://support.google.com/googleplay/answer/2812853?hl=en
27. Google: UI/application exerciser monkey (2020). https://developer.android.com/studio/test/monkey
28. Guerra-Manzanares, A., Luckner, M., Bahsi, H.: Android malware concept drift using system calls: detection, characterization and challenges. Expert Syst. Appl. 206, 117200 (2022). https://doi.org/10.1016/j.eswa.2022.117200
29. Hou, S., Saas, A., Chen, L., Ye, Y.: Deep4MalDroid: a deep learning framework for Android malware detection based on Linux kernel system call graphs. In: 2016 IEEE/WIC/ACM International Conference on Web Intelligence Workshops (WIW), pp. 104–111 (2016). https://doi.org/10.1109/WIW.2016.040
30. Hou, O.: A Look at Google Bouncer (2012). https://blog.trendmicro.com/trendlabs-security-intelligence/a-look-at-google-bouncer/
31. Hull, G., John, H., Arief, B.: Ransomware deployment methods and analysis: views from a predictive model and human responses. Crime Sci. 8(1), 1–22 (2019)
32. Iannucci, S., Abdelwahed, S., Montemaggio, A., Hannis, M., Leonard, L., King, J.S., Hamilton, J.A.: A model-integrated approach to designing self-protecting systems. IEEE Trans. Softw. Eng. 46(12), 1380–1392 (2018)
33. Isohara, T., Takemori, K., Kubota, A.: Kernel-based behavior analysis for android malware detection. In: 2011 7th International Conference on Computational Intelligence and Security, pp. 1011–1015. IEEE Computer Society, Sanya (2011)
34. Kanwal, M., Thakur, S., Lashkari, R.: An app based on static analysis for android ransomware. In: 2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1–6. IEEE Computer Society, Delhi (2017)
35. Kok, S., Abdullah, A., Jhanjhi, N., Supramaniam, M.: Ransomware, threat and detection techniques: a review. Int. J. Comput. Sci. Netw. Secur. 19(2), 136 (2019)
36. Koodous: Malicious dataset (n.d.). https://koodous.com/
37. Kruegel, C., Mutz, D., Valeur, F., Vigna, G.: On the detection of anomalous system call arguments. In: Snekkenes, E., Gollmann, D. (eds.) Computer Security – ESORICS 2003, pp. 326–343. Springer, Berlin, Heidelberg (2003)
38. Lashkari, A.H., Kadir, A.A., Taheri, L., Ghorbani, A.: Toward developing a systematic approach to generate benchmark android malware datasets and classification. In: 2018 International Carnahan Conference on Security Technology (ICCST), pp. 1–7. IEEE Computer Society, Montreal, Quebec, Canada (2018)
39. Levin, D.V.: Strace (2020). https://strace.io/
40. Lin, Y.D., Lai, Y.C., Chen, C.H., Tsai, H.C.: Identifying Android malicious repackaged applications by thread-grained system call sequences. Comput. Secur. 39, 340–350 (2013)
41. Lockheimer, H.: Android and security [Blog post] (2012). https://googlemobile.blogspot.com/2012/02/android-and-security.html
42. Maggi, F., Matteucci, M., Zanero, S.: Detecting intrusions through system call sequence and argument analysis. IEEE Trans. Dependable Secur. Comput. 7(4), 381–395 (2008)
43. Maggi, F., Matteucci, M., Zanero, S.: Reducing false positives in anomaly detectors through fuzzy alert aggregation. Inf. Fus. 10(4), 300–311 (2009)
44. Maiorca, D., Mercaldo, F., Giacinto, G., Visaggio, C.A., Martinelli, F.: R-PackDroid: API package-based characterization and detection of mobile ransomware. In: SAC '17: Proceedings of the Symposium on Applied Computing, pp. 1718–1723. Association for Computing Machinery (2017). https://doi.org/10.1145/3019612.3019793
45. McConnell, D.: The current state of ransomware in today's world and why the future is bleak (2017). https://www.cs.tufts.edu/comp/116/archive/fall2017/dmcconnell.pdf
46. Mehnaz, S., Mudgerikar, A., Bertino, E.: Rwguard: a real-time detection system against cryptographic ransomware. In: Bailey, M., Holz, T., Stamatogiannakis, M., Ioannidis, S. (eds.) Research in Attacks, Intrusions, and Defenses, pp. 114–136. Springer, Cham (2018)
47. Micro, T.: Behind the Android menace: malicious apps - TrendLabs security intelligence blog [Blog Post] (2012). https://blog.trendmicro.com/trendlabs-security-intelligence/infographic-behind-the-android-menace-malicious-apps
48. Mohammad, A.H.: Ransomware evolution, growth and recommendation for detection. Mod. Appl. Sci. 14(3), 68–74 (2020)
49. Moser, A., Krügel, C., Kirda, E.: Limits of static analysis for malware detection. In: 23rd Annual Computer Security Applications Conference (ACSAC 2007), pp. 421–430. IEEE Computer Society, Miami Beach (2007)
50. Onwuzurike, L., Mariconti, E., Andriotis, P., Cristofaro, E.D., Ross, G., Stringhini, G.: Mamadroid: detecting Android malware by building Markov chains of behavioral models (extended version). ACM Trans. Priv. Secur. (2019). https://doi.org/10.1145/3313391
51. Oz, H., Aris, A., Levi, A., Uluagac, A.S.: A survey on ransomware: evolution, taxonomy, and defense solutions. ACM Comput. Surv. (CSUR) 54(11s), 1–37 (2022)
52. Pizzolotto, D., Fellin, R., Ceccato, M.: Oblive: seamless code obfuscation for java programs and android apps. In: 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 629–633. IEEE (2019)


53. Richardson, R., North, M.M.: Ransomware: evolution, mitigation and prevention. Int. Manag. Rev. 13(1), 10 (2017)
54. Robert Lipovský Lukáš Štefanko, G.B.: Labour party is latest victim of Blackbaud ransomware attack (2016). https://www.welivesecurity.com/wp-content/uploads/2016/02/Rise_of_Android_Ransomware.pdf
55. Scalas, M., Maiorca, D., Mercaldo, F., Visaggio, C.A., Martinelli, F., Giacinto, G.: On the effectiveness of system api-related information for android ransomware detection. Comput. Secur. 86, 168–182 (2019). https://doi.org/10.1016/j.cose.2019.06.004. https://www.sciencedirect.com/science/article/pii/S0167404819301178
56. Sekar, R., Bendre, M., Dhurjati, D., Bollineni, P.: A fast automaton-based method for detecting anomalous program behaviors. In: Proceedings 2001 IEEE Symposium on Security and Privacy. S&P 2001, vol. 1, pp. 144–155. IEEE, Oakland (2001). https://doi.org/10.1109/SECPRI.2001.924295
57. Skandylas, C., Khakpour, N.: Design and implementation of self-protecting systems: a formal approach. Fut. Gen. Comput. Syst. 115, 421–437 (2021)
58. Song, S., Kim, B., Lee, S.: The effective ransomware prevention technique using process monitoring on android platform. Mob. Inf. Syst. 2016, 1–9 (2016). https://doi.org/10.1155/2016/2946735
59. Sood, G.: Virustotal: R client for the virustotal API. VirusTotal. R package version 0.2.1 (2017)
60. Sophos: the state of ransomware 2020 (2021). https://www.sophos.com/en-us/medialibrary/pdfs/whitepaper/sophos-state-of-ransomware-retail-2021-wp.pdf
61. Sophos: the State of Ransomware 2023 (2023). https://www.sophos.com/en-us/content/state-of-ransomware
62. Srivastava, A., Lanzi, A., Giffin, J., Balzarotti, D.: Operating system interface obfuscation and the revealing of hidden operations. In: International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, pp. 214–233. Springer (2011)
63. Statista: Forecast number of mobile devices worldwide from 2020 to 2025 (in billions). Statista (2021). https://www.statista.com/statistics/218984/number-of-global-mobile-users-since-2010/
64. Statistica: global market share held by mobile operating systems from 2009 to 2023, by quarter (2023). https://www.statista.com/statistics/272698/global-market-share-held-by-mobile-operating-systems-since-2009/
65. Sun, S., Fu, X., Ruan, H., Du, X., Luo, B., Guizani, M.: Real-time behavior analysis and identification for android application. IEEE Access 6, 38041–38051 (2018)
66. Tam, K., Khan, S., Fattori, A., Cavallaro, L.: Copperdroid: automatic reconstruction of android malware behaviors. In: NDSS Symposium 2015, pp. 1–15. NDSS, San Diego (2015). https://doi.org/10.14722/ndss.2015.23145. Annual Network and Distributed System Security Symposium (NDSS); Conference date: 08–02–2015 Through 11–02–2015
67. WeLiveSecurity: WeLiveSecurity (2020). https://www.welivesecurity.com/
68. Wiśniewski, R.: Apktool (2021). https://ibotpeaches.github.io/Apktool/
69. Zhang, X., Breitinger, F., Luechinger, E., O’Shaughnessy, S.: Android application forensics: a survey of obfuscation, obfuscation detection and deobfuscation techniques and their impact on investigations. Forens. Sci. Int.: Digit. Invest. 39, 301285 (2021). https://doi.org/10.1016/j.fsidi.2021.301285. https://www.sciencedirect.com/science/article/pii/S2666281721002031
70. Zhou, W., Zhou, Y., Jiang, X., Ning, P.: Detecting repackaged smartphone applications in third-party android marketplaces. In: Proceedings of the 2nd ACM Conference on Data and Application Security and Privacy, CODASPY ’12, pp. 317–326. Association for Computing Machinery, New York (2012). https://doi.org/10.1145/2133601.2133640

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
