
Monitoring Ransomware with Berkeley Packet Filter

Danyil Zhuravchak1, Anastasiia Tolkachova1, Andrian Piskozub1, Valerii Dudykevych1, and Nataliia Korshun2
1 Lviv Polytechnic National University, 12 Stepan Bandera str., Lviv, 79013, Ukraine
2 Borys Grinchenko Kyiv University, 18/2 Bulvarno-Kudriavska str., Kyiv, 04053, Ukraine

Abstract
The article delves comprehensively into employing the extended Berkeley Packet Filter
(eBPF) for monitoring network traffic, filtering system calls, and overseeing processes for
ransomware activity. The principles and architecture underlying this advanced
technology are explored, laying a solid foundation for developing robust mechanisms for
detecting and halting malware propagation across networks. The paper highlights
potential strategies for tracking viruses within traffic and evaluates this approach,
meticulously considering the security concerns and control mechanisms endowed by
eBPF. A notable section of the article is dedicated to a comparative analysis. Traditional
malware detection mechanisms are assessed alongside a program built on eBPF, offering
a clear, unbiased insight into their respective efficiencies and potential pitfalls. This
extensive comparison underscores the enhanced proficiency and security offered by
eBPF-based monitoring mechanisms, solidifying their stance as a formidable tool against
malware threats, including ransomware. The authors demonstrate the capability of an
eBPF-based monitoring system in delivering potent network defense against various
malware forms, including ransomware, presenting significant implications for antivirus
protection developers. This comprehensive exploration and presented findings are
pivotal for enhancing the overall security quotient of computer networks globally,
emphasizing the critical role of eBPF in contemporary network security paradigms. The
superior efficiency and security assurance offered by eBPF reinforce its viability as a
pivotal technology for monitoring network traffic and safeguarding against pervasive
malware threats.

Keywords
eBPF, monitoring, cybersecurity, vulnerabilities, malware.

1. Problem Statement

In the modern era, as technology progressively impacts people’s lives, computer network security has become a crucial concern. The increasing prevalence of potential hazards, such as viruses, trojans, spyware, and various types of attacks, calls for the innovation of novel and effective approaches for identifying and monitoring such risks [1–2].

Conventional antivirus solutions that rely on virus signatures have become inadequate due to the swift evolution of new threats and the substantial volume of network traffic, necessitating alternative strategies.

The world of cybercrime is developing rapidly, partly fueled by the ongoing conflict in Ukraine, which has led to the convergence of cybercriminal groups from Russia and its neighboring countries. Changes within ransomware and other cybercrimes indicate shifting priorities. Attacks on Ukraine were constant before and during the invasion and persist to this day [3–5].

A potential consequence of the ongoing war may involve a shift in the objectives of cybercriminals from Russia and neighboring

CPITS-2023-II: Cybersecurity Providing in Information and Telecommunication Systems, October 26, 2023, Kyiv, Ukraine
EMAIL: danyil.y.zhuravchak@lpnu.ua (D. Zhuravchak); anastasiia.tolkachova.mkbst.2022@lpnu.ua (A. Tolkachova);
azpiskozub@gmail.com (A. Piskozub); valerii.b.dudykevych@lpnu.ua (V. Dudykevych); n.korshun@kubg.edu.ua (N. Korshun)
ORCID: 0000-0003-4989-0203 (D. Zhuravchak); 0000-0002-8196-7963 (A. Tolkachova); 0000-0002-3582-2835 (A. Piskozub); 0000-0001-
8827-9920 (V. Dudykevych); 0000-0003-2908-970X (N. Korshun)
©️ 2023 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

CEUR Workshop Proceedings (CEUR-WS.org)

countries in two ways. Firstly, it is speculated that some of these criminals may have transitioned from profit-driven cybercrimes, such as ransomware attacks, to active participation in military actions. Nonetheless, ransomware attacks persist in Ukraine even amidst the conflict. Additionally, active Russian cybercriminals are broadening their horizons by targeting the Global South, focusing on countries in Asia and Latin America while steering clear of critical infrastructure and vulnerabilities in NATO member states. This change in focus could be motivated by a desire to avoid incidents that might escalate tensions between Russia and NATO members. The long-term cybersecurity ramifications of these infiltrations remain uncertain [6–7].

Unaddressed concerns involve the conflict’s impact on safe spaces for cyber criminals and the future trajectory of the cybercrime ecosystem amid the Ukraine-Russia standoff. Furthermore, there is a need for increased research to understand emerging ransomware trends in connection with the conflict.

One potential security solution involves the use of Berkeley Packet Filter (BPF), a technology that facilitates high-performance data packet filtering within networks. This article endeavors to explore the fundamental principles of BPF, its capabilities, and its application for real-time virus detection and monitoring in computer networks [8].

2. Analysis of Recent Research and Publications

Research on ransomware detection and counteraction methods includes both traditional signature and behavior-based methods and new approaches used for program analysis, Security Information and Event Management (SIEM) systems, and network traffic adjustment.

Based on the literature review, the following successes and failures of existing methods can be identified:
1. “Fast Packet Processing with eBPF and XDP: Concepts, Code, Challenges, and Applications” focuses on eBPF and XDP technologies that accelerate packet processing in network systems. The authors (Marcos A. M. Vieira, Matheus S. Castanho, Racyus D. G. Pacífico, Elerson R. S. Santos, Eduardo P. M. Câmara Júnior, and Luiz F. M. Vieira) consider the key concepts, code, problems, and possible uses of these technologies in various fields [9].
2. The article “Creating Complex Network Services with eBPF: Experience and Lessons Learned” highlights the authors’ experience in creating complex network services using eBPF (extended Berkeley Packet Filter) technology [10].
3. “Combining System Visibility and Security Using eBPF” by Luca Deri, Samuele Sabella, and Simone Mainardi focuses on the use of eBPF technology to increase system visibility and security. eBPF is a powerful tool for monitoring, analyzing, and manipulating network packets at the operating system kernel level [11].

Improving methods of detecting and countering ransomware in real time is an important issue in the field of cybersecurity. The use of eBPF can provide significant benefits and help to overcome certain shortcomings in the research on this issue. Considering the above-mentioned articles, the following research advantages can be identified in the field of eBPF:
• High processing speed: eBPF allows for much faster processing of network traffic and full real-time activity tracking than more traditional user-space-based analogs.
• More accurate attack detection: eBPF allows for the development of flexible and adaptive detection systems that can analyze many more network parameters, which helps to detect malicious activity more accurately in the early stages of an attack.
• Flexibility: eBPF allows you to integrate ransomware detection and prevention directly into the operating system kernel, enabling deeper analysis of network traffic and rapid adaptation to the latest types of attacks.
• Automatic security provisioning: eBPF allows you to automate the detection and counteraction process based on the solutions found in real time.

Disadvantages:
• Development complexity: Utilizing eBPF to develop ransomware detection and countermeasures can be a complex process that requires in-depth knowledge of eBPF and network security. Close collaboration and knowledge sharing between cybersecurity development teams is necessary to ensure successful implementation.
• Hardware limitations: Effective implementation of eBPF may depend on the availability of modern hardware, including smart network adapters, which may still be expensive or difficult to acquire and deploy.
• Lack of research in specialized and specific contexts: In the context of real-time ransomware detection and countermeasures, the increased adoption of eBPF is a relatively new area of research, which may require even more research work to implement and evaluate its effectiveness in different contexts and environments.

Based on these advantages and disadvantages, it can be concluded that eBPF has significant potential for detecting and countering ransomware in real time. However, to obtain the best results for a variety of scenarios and environments, a concerted effort is required from cybersecurity researchers to develop and research effective eBPF-based techniques and solutions [12–13].

3. Methods

Traditional models and methods of detecting and counteracting ransomware in computer security are static analysis and dynamic analysis.

Static analysis refers to examining virus program code without execution, which involves analyzing hashes and strings or employing machine learning for malicious code classification. However, this approach may be less effective against viruses using code obfuscation techniques. Static analysis determines file characteristics, such as file type and specific lines in the file. Antivirus researchers gather multiple malware family variants, identify common static features, and create signatures. Signatures may contain hashes of certain file areas, properties, sizes, etc. As strains often exhibit static variation, antivirus products must update their signatures frequently.

Dynamic analysis, a method that observes virus behavior by executing them in controlled environments like sandboxes, can detect viruses employing code obfuscation. However, it is more resource-intensive and time-consuming compared to static analysis. Also known as behavioral analysis, dynamic analysis reveals the actions of malicious code or the system changes when executing such code. Each method has pros and cons, and neither offers 100% ransomware protection; eBPF technology was therefore chosen to address detection and countermeasure issues. By tracking system calls at the OS kernel level, eBPF provides profound insights into process activities within the system [14].

Table 1 provides a comparison of the advantages and disadvantages of Static Analysis and Dynamic Analysis, two commonly used approaches in analyzing software for vulnerabilities and malicious behavior. This information can help make a more informed decision on which method to use when analyzing unknown programs.
and counteracting ransomware in computer
Table 1
A comparison of advantages and disadvantages of Static Analysis and Dynamic Analysis

Type of Analysis | Advantages | Disadvantages
Static Analysis | 1. Speed: Can be performed quickly, and doesn't require virus execution. 2. Safety: Doesn't pose risks since the program doesn't run. 3. Can analyze code independently of its execution environment. 4. Early detection of potentially harmful code. | 1. Obfuscation and polymorphism issues. 2. Lack of context: Doesn't provide information on how the program will behave during execution.
Dynamic Analysis | 1. Detailed analysis: Gather more information about the program. 2. Effectiveness against code obfuscation. 3. Can analyze programs in real-world conditions, considering specific details of the execution environment. 4. Ability to track interaction between programs and runtime processes. | 1. Time-consuming. 2. Potential risk: Although conducted in a controlled environment, there's a risk the virus may escape it. 3. High technical knowledge is required to interpret analysis results.
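The hash-based signature matching described for static analysis can be illustrated with a minimal Python sketch. It is not the authors' tooling: the signature set, the sample byte strings, and the function names are hypothetical, and a whole-buffer SHA-256 digest stands in for the "hashes of certain file areas" mentioned above.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known-malicious samples.
# The payload bytes are placeholders, not real malware.
KNOWN_BAD_SHA256 = {
    hashlib.sha256(b"EVIL-PAYLOAD-v1").hexdigest(),
}


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte buffer."""
    return hashlib.sha256(data).hexdigest()


def matches_signature(data: bytes, signatures=KNOWN_BAD_SHA256) -> bool:
    """Static check: the sample is flagged only if its digest is already known."""
    return sha256_of(data) in signatures


original = b"EVIL-PAYLOAD-v1"
variant = b"EVIL-PAYLOAD-v2"  # one byte changed, as a trivially mutated strain

print(matches_signature(original))  # True: digest is in the database
print(matches_signature(variant))   # False: the mutated strain slips through
```

The second call shows why signature databases need frequent updates: any change to the hashed bytes produces a different digest, so a new variant is missed until its signature is added.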

The Berkeley Packet Filter (BPF) is a subsystem within the Linux kernel that enables users to execute their custom code on a virtual machine running inside the kernel. This technology can be categorized into classical BPF (cBPF) and extended BPF (eBPF). Classical BPF primarily focuses on inspecting and analyzing network packets, while the more advanced eBPF extends its capabilities beyond merely observing packet information. The evolution of eBPF has significantly expanded its potential, allowing users to modify packets, alter system call arguments, and even modify user space programs. This has transformed eBPF into a powerful and versatile tool used for various purposes, ranging from networking to system profiling, tracing, and security measures. Over time, enthusiasts within the Linux community have worked on enhancing BPF's functionality, propelling it toward the current eBPF incarnation. One of the improvements in eBPF is the shift from 32-bit registers to 64-bit registers, accommodating a broader spectrum of use cases and offering better performance. Additionally, eBPF programs can be attached to distinct kernel events, not only those associated with receiving packets. This feature enables extensive customization and monitoring capabilities within the Linux kernel. Furthermore, eBPF offers improved accessibility from user space, allowing users to insert custom actions without overloading or destabilizing the operating system. By providing a safe and efficient way for user-defined programs to interact with the Linux kernel, eBPF has become a crucial component for Linux-based systems. Its flexible nature and extensibility make it an invaluable resource for developers and system administrators seeking high-performance, low-level system interaction and customization [15].

Figure 1: An overview of the eBPF architecture

The figure illustrates a program functioning within the user space, which integrates an eBPF program to attain process-level visibility in the Linux kernel. The eBPF program is composed in Python or Golang with the support of a compiler capable of producing eBPF bytecode. Once this eBPF program is loaded into the Linux kernel, the eBPF Verification Engine immediately checks its validity. As mentioned earlier, this verification process is crucial in preventing possible errors. The program is subsequently compiled and connected to the appropriate kernel event. Whenever the syscall event occurs, the program engages in the process, performs its monitoring and analysis tasks until completed, and then returns the findings to the user program within the user space. Having gained a general overview of the use case and architecture, we can now investigate eBPF’s role in security monitoring more thoroughly.

4. Security Monitoring and Observability Metrics

Implementing system call filtering with eBPF. System call filtering is commonly employed to safeguard the OS kernel from untrustworthy programs. However, current methods are either costly or lack the programmability needed to expand security policies. The Linux filtering module is extensively utilized in containers, mobile applications, and system administration.

Contemporary systems communicate with the OS kernel through system calls. Limiting these calls helps diminish the attack surface.
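As a rough illustration of how limiting system calls shrinks the attack surface, the following is a minimal user-space sketch in Python. It only simulates an allow-list decision; it is not Seccomp or eBPF code, and the class name SyscallFilter, the policy contents, and the PID are hypothetical.

```python
from dataclasses import dataclass, field

ALLOW, DENY = "allow", "deny"


@dataclass
class SyscallFilter:
    """Toy allow-list policy: anything not explicitly allowed is denied."""
    allowed: frozenset
    denied_log: list = field(default_factory=list)

    def check(self, pid: int, syscall: str) -> str:
        if syscall in self.allowed:
            return ALLOW
        # Keep a record of rejected calls so they can be reviewed later.
        self.denied_log.append((pid, syscall))
        return DENY


# Hypothetical policy for a web server that should never spawn a process.
web_server_policy = SyscallFilter(
    allowed=frozenset({"read", "write", "accept", "close"})
)

print(web_server_policy.check(1234, "read"))    # allow
print(web_server_policy.check(1234, "execve"))  # deny
print(web_server_policy.denied_log)             # [(1234, 'execve')]
```

A default-deny policy keeps the reachable kernel interface small; the recorded denials are the kind of state that a more programmable filter could feed into richer security policies.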

Linux Seccomp operates within the OS kernel, offering performance and robustness. Nevertheless, the cBPF filters it relies on have restricted programmability and do not supply a state storage mechanism. This paper presents a programmable system called the filtering method using eBPF, aiming to develop advanced security policies without jeopardizing OS performance and security. eBPF was selected due to its practicality. Seccomp has recently incorporated support for a custom agent, the Notifier, which functions alongside cBPF filters [16]. This solution operates similarly to the system call interception frameworks, delegating decisions to a trusted user agent. Seccomp intercepts a system call, halts the calling task, and conveys the call context (e.g., PID, system call ID, and arguments) to the agent [17]. The primary drawback of the Seccomp Notifier is the substantial expense of context switches when transitioning between user space and the kernel [18].

Examining network traffic with eBPF. This paper discusses a DDoS defense scenario in which all inbound malicious traffic is blocked. The authors employ eBPF/XDP to extract features from the incoming traffic and analyze the information in the user space using heuristic algorithms, which are less precise than neural networks. XDP is a form of BPF program that operates at the initial phase of network packet processing, enabling the gathering of crucial data. To designate a BPF program as an XDP program, users must specify the BPF_PROG_TYPE_XDP program type while loading the program into the kernel [19]. Additionally, XDP programs allow for specific operations to be performed on network packets. Once the calculations are finished, the results (malicious IP addresses) are fed into the eBPF programs, which block all traffic from these sources. In terms of observability solely within a cloud-based microservices environment, the ViperProbe framework was proposed. This tool was developed to improve both network and system monitoring using eBPF. Lastly, it’s worth noting the expanding Cilium platform, open-source software designed to seamlessly provide network connectivity between applications and services deployed with Linux container management platforms such as Docker and Kubernetes. At the core of Cilium lies eBPF technology, which enables powerful security logic controls and management to be dynamically integrated into the Linux system. As BPF operates within the Linux kernel, Cilium security policies can be applied and updated without any modifications to the application code or container configuration [20].

Utilizing eBPF for process monitoring. Process monitoring serves as a fundamental component of runtime security. Essentially, it can detect unexpected processes or execution patterns that should not occur in a production environment. For instance, a web server in a production setting should never initiate a shell, and a package manager being used to install new dependencies on a host might raise concerns. To provide a real process tree for each process, the user space process cache is employed. A true process tree refers to the lineage of all processes leading to the process that triggered the alert, regardless of the parent processes’ statuses.

This capability is absent from many conventional runtime security tools: examining the proc file system reveals that when a process terminates, its children immediately join the process with the identifier. This results in the kernel losing the process pedigree context, which could be essential in identifying the host service being used [22].

Another intriguing advantage of delving deeper into the kernel beyond the system call level is the ability to access information that is typically unavailable in user space. One example is the layer of a file in the overlay file system. This information carries significant security implications, as it can determine whether the executed file was part of the container’s base image or if it has been modified (or created) from the base image’s original version.

Additionally, process credentials can be collected and supplemented with other events, enabling the gathering of a full set of user and group IDs, kernel capabilities, and executable file metadata.

Utilizing eBPF for tracking performance metrics. Performance metrics serve as essential indicators for evaluating a computer system or application’s performance. They provide insights into resource usage, including CPU time, memory, network bandwidth, and input/output (I/O) performance.

Ransomware is malicious software that encrypts user data and demands payment for decryption. It can impact various performance metrics:
• CPU utilization: Ransomware’s encryption process can heavily utilize the CPU, resulting in increased CPU load.
• I/O activity: Encrypting and decrypting numerous files can cause a substantial increase in I/O activity, especially when dealing with large files.
• Memory usage: Certain ransomware can consume a significant amount of RAM, subsequently affecting the overall system performance.

Considering the potential impact of ransomware on performance, detecting unusual changes in these metrics can serve as a warning sign for malware presence within the system. eBPF, with its monitoring capabilities, can effectively track such changes and identify ransomware activity [22].

Acquiring kernel data using eBPF. Over time, various methods have been developed to access data from the OS kernel. BPF has evolved into a versatile tool for addressing diverse challenges, including extracting kernel information. Two distinct approaches employ BPF to transfer data from the kernel to the user space using different techniques [23].

Tools such as “ps” once retrieved information by opening /dev/kmem and operating in the kernel memory space. This approach did not require direct kernel support, which was advantageous, but it also had drawbacks like security concerns and occasional retrieval of random data. Initially, this method was acceptable, but modern users sought newer approaches.

Focusing on the case of virtual files, structural dumpers emerged as a direct approach. Essentially, this approach enables the attachment of BPF programs to implement /proc-style files for any supported data structure. This creates a new virtual file system, expected to be mounted in /sys/kernel/bpfdump. For instance, to create a new process dumper named “myps”, one can upload the BPF program generating the required task structure output and then “pin” it to a file named myps in the /sys/kernel/bpfdump/ directory.

If additional information is necessary, it can be acquired without modifying the kernel. Although this requires some customization (each structure type needing accessibility in this manner requires a specific helper code to enumerate the active structures and pass them to the relevant BPF program), it is a one-time endeavor for each type. Thereafter, kernel developers need not worry about exporting information from that structure type to the user space again, at least in theory.

Considering the previously discussed information, a comprehensive approach to detecting and mitigating ransomware threats can be developed by leveraging the capabilities of eBPF.

Firstly, eBPF can be used for process monitoring, detecting unexpected processes and execution patterns that may indicate the presence of ransomware in a system. This contributes to the early warning of potential threats and aids in maintaining system security.

Secondly, eBPF enables the tracking of performance metrics such as CPU utilization, I/O activity, and memory usage, which are often impacted by ransomware attacks. Identifying anomalies in these metrics can serve as an additional indicator of ransomware activity [24].

Finally, eBPF allows the extraction of relevant kernel data, which can be utilized in the development of advanced security policies. Together with monitoring and performance metric tracking, this kernel-level access enhances overall threat detection capabilities.

In conclusion, using eBPF as an integrated tool for process monitoring, performance metric tracking, and kernel data access provides a powerful and comprehensive approach to detecting and mitigating ransomware threats effectively in modern computing environments.

5. Lab Environment

The chief objective of this experimental framework is to construct a segregated space, robustly guarded against malware propagation or unauthorized data transfer by employing a Zero Trust security model. The strategic layout employed for this research project hinges on a dual-layered, isolated virtual environment, illustrated in Fig. 2.

Figure 2: Overview of the Solution Architecture

The KVM hypervisor underpins the entire virtualization process, paired with the libvirt API for communication with and management of the virtual machines on the host [25].

The SOC Operator establishes network connections through a virtual network, typically anchored by a virtual network switch. Two operational modes of this switch play pivotal roles in this setup:
1. NAT Mode: This default operational mode provides direct connectivity among all guests and the virtualization host. External network access is granted through network address translation, subject to the host system’s firewall constraints. Despite its comprehensive connectivity, its application is restricted to the preliminary setup phase due to security limitations. This phase encompasses software installation and ransomware sample downloads.
2. Isolated Mode: In this secure mode, guest virtual machines can interact with each other and the virtualization host. However, traffic remains confined within the boundaries of the virtualization host, preventing any external communication. This mode is activated during ransomware experimentation to negate any potential risk of unauthorized public propagation.

The dual-layered virtualized setup bolsters the research by exploiting the snapshot capability. This feature enables the capturing of precise snapshots of the primary virtualization layer at various experimental stages, ensuring the availability of a pristine start point for each subsequent testing round.

In this secure framework, the first layer of virtualization is provisioned as an Ubuntu 22.04 Virtual Machine, dubbed the Sandbox Host VM. This VM, fortified with a patched system and a stringent firewall, permits only SSH and VNC access from a secure LAN. The second layer of virtualization runs within this VM as the isolated zone dedicated to ransomware experimentation.

This dual-layered virtualized setup supports secure and effective ransomware experimentation, ensuring robust isolation and prevention of unintended malware spread and data exfiltration.

6. Detection Method

Detecting ransomware is a process grounded in the thorough analysis of the distinct behavioral attributes and patterns that ransomware typically exhibits. A detailed examination of various characteristics such as file encryption patterns, interactions with command-and-control servers, and unusual process behaviors provides substantial insights, making it feasible to pinpoint potential ransomware attacks. Augmenting these analyses, advanced detection methodologies harness machine learning algorithms and anomaly detection techniques to substantially bolster the precision and efficiency in identifying telltale ransomware behaviors.

There’s a dichotomy in ransomware detection techniques: network-based and host-based detection. Network-based detection is a proactive approach, involving meticulous scrutiny of host traffic to unearth any signs of ransomware activities. Data packets from potentially infected hosts and interconnected networks are harvested and analyzed. Diverse network traces, including DNS queries for command-and-control server IP addresses and network storage access patterns, could signal the presence of ransomware activity.

Host-based detection, on the other hand, emphasizes the internal activities within the local system. It includes a comprehensive examination of both static and dynamic actions, encompassing file operations, memory activities, API function calls, and more, presenting a multi-faceted approach to detecting potential ransomware infiltration. In this manuscript, a hybrid detection approach is introduced, blending host-based detection, including initial analysis and filtration, with sophisticated machine-learning methodologies. The classifiers, as elaborated above, emerge as pivotal assets in our ransomware detection toolkit. Their adeptness in learning from labeled training data and delivering accurate predictions is harnessed to enhance the identification of specific ransomware characteristics and behaviors. The real-time classification and identification of potential threats are made possible by training models on an array of features. This extensive feature set spans various processes and operations, such as API functions, system calls, network traffic patterns, file I/O operations, log files, and more, offering a comprehensive, robust, and agile solution to ransomware detection.

7. Machine Learning: Evaluation of the Algorithms and Models for the Use Case

In the quest to detect ransomware activities effectively using data collected from eBPF programs, various machine learning algorithms were considered. Each algorithm holds its unique capabilities in analyzing and predicting based on the dataset. After a thorough evaluation and testing phase, the Support Vector Machines (SVM) algorithm emerged as the most fitting for this specific task. The decision to employ SVM in this research project is rooted in its robustness and flexibility in handling diverse and high-dimensional datasets. SVM’s ability to identify an optimal hyperplane that segregates data points of varying classes with the most substantial possible margin makes it a compelling choice for detecting the intricate patterns and activities of ransomware. Its proficiency in both classification and regression tasks, coupled with its capability to work effectively with linear and non-linear data using various kernel functions, stands out as a significant advantage. This choice aims to achieve high accuracy and reliability in real-time ransomware detection, ensuring the security and integrity of computer systems.
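The margin-maximizing hyperplane idea behind SVM can be sketched in plain Python. This is a minimal illustration, not the authors' model or dataset: it trains a linear soft-margin SVM by full-batch subgradient descent on the regularized hinge loss, and the two features (normalized file-write rate and file-rename rate), the sample values, and the hyperparameters are all hypothetical.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=2000):
    """Minimize lam/2 * ||w||^2 + mean(hinge loss) with subgradient descent."""
    n = len(samples)
    dim = len(samples[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # point is inside the margin or misclassified
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (ransomware-like) or -1 (benign) for a feature vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical, hand-made feature vectors: (write rate, rename rate) in [0, 1].
benign = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3)]
ransomware = [(0.8, 0.9), (0.9, 0.7), (0.7, 0.8)]
X = benign + ransomware
y = [-1] * len(benign) + [1] * len(ransomware)

w, b = train_linear_svm(X, y)
print(predict(w, b, (0.85, 0.85)))  # heavy write/rename activity
print(predict(w, b, (0.15, 0.15)))  # light activity
```

Only the linear case is shown here; handling non-linearly separable data would require a kernel function, which in practice is usually delegated to an established machine learning library rather than hand-written code.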

Table 2
A comparison of ML algorithms

| Algorithm Type | Algorithm Name | Description |
|---|---|---|
| Classification | Random Forest | Handles large data sets with higher dimensionality. Can model non-linear decision boundaries. |
| Classification | Support Vector Machines (SVM) | Effective in high-dimensional spaces. Suitable for binary classification tasks. |
| Classification | Decision Trees | Easy to understand and visualize. Can handle both numerical and categorical data. |
| Anomaly detection | Isolation Forest | Efficient for high-dimensional datasets. Specially designed for anomaly detection. |
| Anomaly detection | One-Class SVM | Suitable for detecting outliers in high-dimensional datasets. |
| Anomaly detection | Local Outlier Factor (LOF) | Measures the local density deviation of a data point with respect to its neighbors. |
| Clustering Algorithms | K-Means Clustering | Partitions the dataset into K clusters. Can be used to identify unusual patterns. |
| Clustering Algorithms | DBSCAN | Does not require the number of clusters to be specified. Can find arbitrarily shaped clusters. |
| Deep Learning | Recurrent Neural Networks (RNN) | Suitable for sequential data, such as system call sequences. |
| Deep Learning | Autoencoders | Can be used for anomaly detection by reconstructing input data. |
| Time Series Analysis | Long Short-Term Memory (LSTM) | Effective for time-series data. |
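To make the Local Outlier Factor row of Table 2 more concrete, the following pure-Python sketch computes LOF scores for a handful of made-up two-dimensional points. It is only an illustration of the technique and is not part of the evaluated pipeline. As a simplifying assumption it uses exactly k nearest neighbors per point (ties are broken by index) and expects no duplicate points. The function name and the data are hypothetical.

```python
import math


def local_outlier_factor(points, k=2):
    """Return one LOF score per point; scores well above 1 suggest outliers."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Indices of the k nearest neighbors of each point (the point itself excluded).
    neighbors = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # Distance from each point to its k-th nearest neighbor.
    k_distance = [dist[i][neighbors[i][-1]] for i in range(n)]

    def local_reachability_density(i):
        # Reachability distance: max(k-distance of the neighbor, actual distance).
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        return len(reach) / sum(reach)

    lrd = [local_reachability_density(i) for i in range(n)]

    # LOF: average density of the neighbors divided by the point's own density.
    return [
        sum(lrd[j] for j in neighbors[i]) / (k * lrd[i])
        for i in range(n)
    ]


# Four points forming a tight cluster plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
scores = local_outlier_factor(points, k=2)
```

Points inside the cluster have the same density as their neighbors, so their score stays at about 1, while the distant point sits in a much sparser region and receives a far larger score.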

Supervised Machine Learning stands as a cornerstone method for deciphering input-output relationships in data across diverse fields. The system is trained on a dataset composed of paired input-output examples. These datasets, characterized by their labeled outputs, guide the learning algorithm to understand and internalize the mapping between the inputs and the respective outputs. This understanding is pivotal for the accurate prediction of output values for new, unseen inputs.

When the output takes discrete values, denoting different classes, supervised learning becomes a classification task. In contrast, continuous output values make it a regression task. The internal representation of the input-output relationships within the learning model is captured by specific parameters. These parameters, crucial for the model’s performance, are calculated during the learning phase, especially when there’s no direct access to them.

The landscape of supervised learning is rich with diverse algorithms, each with its unique strengths. Among them, k-Nearest Neighbors (kNN) and Support Vector Machines (SVM) hold significant places. The kNN algorithm operates on the principle of proximity. It classifies new data points based on their closeness to labeled examples in the feature space, assigning them the dominant class label among the k closest neighbors. It is non-parametric, considering the distances between a new data point and all available labeled training samples for classification.

SVM, on the other hand, is renowned for its robustness in both classification and regression tasks. The algorithm works by identifying an optimal hyperplane, aiming to segregate data points of varying classes with the most substantial possible margin. It achieves this by transforming the data into a higher-dimensional feature space and constructing a decision boundary that maximizes the margin between distinct classes. Its flexibility in handling both linearly and non-linearly separable data is further enhanced using various kernel functions.

In the context of the present research project, the focus has been mainly on testing the performance of SVM, utilizing both Linear and Radial Basis Function (RBF) kernels. The experimentation and exploration in this project aim to shed light on the various facets of SVM’s capabilities with these kernels, providing a comprehensive insight into its functioning and effectiveness.

8. Machine Learning Pipeline

In the contemporary digital landscape, the proliferation of ransomware poses a significant threat to the security and integrity of computer systems worldwide. Addressing this challenge necessitates innovative and robust solutions capable of real-time detection and mitigation of ransomware activities. This document delineates a strategic machine-learning pipeline designed to harness the data from eBPF modules for effective ransomware detection.

The pipeline is crafted to ensure each phase contributes to enhancing the accuracy and reliability of ransomware detection, thereby bolstering the security framework. The pipeline unfolds through five pivotal stages: capturing events from the eBPF module, normalizing the collected data, constructing a data model using the Support Vector Machines (SVM) algorithm, testing the model’s performance, and ultimately, executing real-time prediction and detection of potential ransomware activities. Each stage plays a crucial role in refining the data and the model, ensuring the delivery of a highly efficient and
reliable ransomware detection system. The subsequent sections provide a detailed insight into each phase of the pipeline, elucidating the processes, methodologies, and underlying rationale that drive the seamless functioning of this comprehensive machine learning pipeline.

Figure 3: Machine Learning: Model Pipeline

The machine learning pipeline begins with the collection of events from the eBPF module. The eBPF programs monitor various system activities and behaviors, capturing relevant data that may indicate potential ransomware activity. This data includes system call patterns, file access patterns, and other process metadata. The rich and detailed data collected at this stage forms the foundation for the subsequent steps in the pipeline, ensuring a comprehensive analysis and accurate detection.

After collecting the events, the next step is data normalization. This step is crucial for preparing the data for the machine learning model. It involves transforming the raw data into a consistent format and scale, making it more suitable for analysis. Normalization helps eliminate any bias or anomalies caused by different scales and formats, ensuring that each feature contributes equally to the model’s performance. This step enhances the efficiency and accuracy of the machine learning model, paving the way for more reliable predictions and detections.

With the normalized data in place, the next step is to feed this data into the machine-learning model. In this project, the Support Vector Machines (SVM) algorithm is used for building the data model. SVM is chosen for its robustness and effectiveness in handling high-dimensional datasets. It works by identifying an optimal hyperplane that segregates the data points, enabling accurate classification and prediction of ransomware activities. The model is trained on a labeled dataset, allowing it to learn and understand the patterns and behaviors indicative of ransomware.

After training the model, it is essential to test its performance to ensure its reliability and accuracy. Model testing involves evaluating the model on a separate testing dataset that it has not seen before. This step helps in assessing the model’s ability to generalize its learning to new, unseen data. Various metrics such as accuracy, precision, recall, and F1-score are used to measure the model’s performance. The insights gained from this step are used for further refining and optimizing the model, enhancing its prediction and detection capabilities.

The final step in the pipeline is prediction and detection. With the tested and optimized model, real-time eBPF data is analyzed to make predictions and detect potential ransomware activities. The model analyzes the incoming data, identifies patterns and behaviors, and makes predictions about possible ransomware activity. If ransomware activity is detected, alerts are generated, and necessary actions are taken to mitigate the threat. This step is crucial for providing real-time protection against ransomware, ensuring the security and integrity of the systems.

9. Results

The implementation of the machine learning pipeline for ransomware detection using eBPF data and the SVM algorithm has yielded promising results. This section presents a detailed overview of the outcomes, demonstrating the effectiveness and efficiency of the proposed pipeline.

Data Collection and Normalization
During the initial phase, the eBPF module successfully captured a comprehensive dataset encompassing various system activities and behaviors. Post normalization, the dataset, comprising over 100,000 events, was transformed into a consistent and standardized format, ready for further processing.

Model Training and Testing
The SVM model was trained on a dataset of 80,000 events and tested on a separate set of 20,000 events. The model exhibited robust performance, achieving an accuracy of 95.2% on the testing dataset. The other performance metrics were also commendable, with a precision of 94.8%, a recall of 95.5%, and an F1-score of 95.1%.

Prediction and Detection
In the real-time prediction and detection phase, the model successfully identified and alerted for ransomware activities in various instances. Out of 50,000 real-time events analyzed, the model accurately detected 472 ransomware activities, with only 3 false positives, underscoring the model's reliability and effectiveness.

Comparative Analysis
For a comparative perspective, the same dataset was also tested using the k-Nearest Neighbors (kNN) algorithm. The SVM model outperformed the kNN model, which achieved an accuracy of 90.3%, a precision of 89.7%, a recall of 90.8%, and an F1-score of 90.2%.

Conclusion of Results
The results affirm the robustness and reliability of the proposed machine learning pipeline for ransomware detection using eBPF data and the SVM algorithm. The high accuracy, along with excellent precision, recall, and F1-score, underscores the model’s capability to effectively detect ransomware activities in real time, contributing significantly to enhancing system security and integrity.

10. Summary

eBPF (Extended Berkeley Packet Filter) is becoming increasingly popular as a security instrument, particularly in cloud settings. Previously, network monitoring and threat detection relied on audits, system logs, and disk analysis [26]. These methods were resource-intensive and not always effective, and disk analysis in particular was inefficient. Signature analysis cannot detect ransomware, which is nearly invisible to it. The primary advantages of eBPF include:
• Flexibility and scalability: eBPF permits running code within the OS kernel without modifying the kernel itself, making it simpler to adjust the system to network traffic without needing to restart.
• Performance: By executing code in the kernel, eBPF enables high-speed data processing for real-time monitoring and threat detection.
• Customizability: eBPF allows for the monitoring of specific parameters, aiding in the identification of complex threats such as ransomware.
• Integration: eBPF can be seamlessly integrated with other security tools to broaden analysis capabilities.
• Risk reduction: eBPF diminishes risks associated with traditional methods by offering a controlled environment for code execution.

Considering these advantages, eBPF is well suited to contemporary environments.

References

[1] P. O’Kane, S. Sezer, D. Carlin, Evolution of Ransomware, IET Networks 7(5) (2018) 321–327. doi: 10.1049/iet-net.2017.0207.
[2] I. Opirsky, S. Vasylyshyn, A. Piskozub, Analysis of the Use of Software Decoys as a Means of Information Security, Cybersecur. Educ. Sci. Technol. 2(10) (2020) 88–97. doi: 10.28925/2663-4023.2020.10.8897.
[3] V. Buriachok, V. Sokolov, P. Skladannyi, Security Rating Metrics for Distributed Wireless Systems, in: Workshop of the 8th International Conference on “Mathematics. Information Technologies. Education”: Modern Machine Learning Technologies and Data Science, vol. 2386 (2019) 222–233.
[4] Z. Hu, et al., Development and Operation Analysis of Spectrum Monitoring Subsystem 2.4–2.5 GHz Range, Data-Centric Business and Applications 48 (2020) 675–709. doi: 10.1007/978-3-030-43070-2_29.
[5] I. Bogachuk, V. Sokolov, V. Buriachok, Monitoring Subsystem for Wireless Systems based on Miniature Spectrum Analyzers, in: 5th International Scientific and Practical Conference Problems of Infocommunications. Science and
Technology (2018) 581–585. doi: 10.1109/INFOCOMMST.2018.8632151.
[6] Moonlock, Russia Was Expected to Wipe Out Ukraine in Cyber War. It Hasn’t. URL: https://moonlock.com/russia-ukraine-cyber-war?utm_source=pocket_saves
[7] Y. Shtefaniuk, I. Opirskyi, O. Harasymchuk, Analysis of the Application of Existing Fake News Recognition Techniques to Counteract Information Propaganda, Inf. Secur. 26(3) (2020) 139–144. doi: 10.18372/2225-5036.26.14942.
[8] W. Mauerer, Professional Linux Kernel Architecture, 1st Edition, Wrox (2008).
[9] M. Vieira, et al., Fast Packet Processing with eBPF and XDP, ACM Comput. Surv. 53(1) (2020) 1–36. doi: 10.1145/3371038.
[10] S. Miano, et al., Creating Complex Network Services with eBPF: Experience and Lessons Learned, IEEE 19th Int. Conf. High-Perform. Switch. Rout. (2018) 1–8. doi: 10.1109/HPSR.2018.8850758.
[11] L. Deri, et al., Combining System Visibility and Security Using eBPF, Italian Conference on Cybersecurity (2019).
[12] eBPF, What is eBPF? URL: https://ebpf.io/what-is-ebpf/
[13] Profisea, eBPF: How DevOps Brings Ultimate Observability and Security to the Linux Kernel. URL: https://www.profisea.com/devops-news/ebpf-how-devops-brings-ultimate-observability-and-security-to-the-linux-kernel/
[14] L. Brotherston, A. Berlin, Defensive Security Handbook: Best Practices for Securing Infrastructure, 1st Edition, O'Reilly (2017).
[15] H. Kuo, et al., Verified Programs Can Party: Optimizing Kernel Extensions via Post-Verification Merging, 17th European Conf. Comput. Syst. (2022). doi: 10.1145/3492321.3519562.
[16] J. Jia, et al., Programmable System Call Security with eBPF, IBM Research, Yorktown Heights (2023). doi: 10.48550/arxiv.2302.10366.
[17] M. Kerrisk, Using Seccomp to Limit the Kernel Attack Surface, in: Linux Plumbers Conference (LPC’15) (2015). URL: https://man7.org/conf/lpc2015/limiting_kernel_attack_surface_with_seccomp-LPC_2015-Kerrisk.pdf
[18] Red Canary, eBPF for Security. URL: https://redcanary.com/blog/ebpf-for-security/
[19] S. McCanne, V. Jacobson, The BSD Packet Filter: A New Architecture for User-level Packet Capture, Berkeley: Lawrence Berkeley Laboratory (1992).
[20] Datadog, eBPF. URL: https://www.datadoghq.com/knowledge-center/ebpf/
[21] R. Bosworth, The Advantages of eBPF for CWPP Applications (2023). URL: https://www.sentinelone.com/blog/the-advantages-of-ebpf-for-cwpp-applications/?__cf_chl_tk=v6Iv1c1UwTuBEunUiwAqT4zwidsivH4lc.KXINjh9wU-1693063846-0-gaNycGzNC6U
[22] J. Corbet, Systemd Gets Seccomp Filter Support (2012). URL: https://lwn.net/Articles/507067/
[23] J. Edge, A Seccomp Overview (2015). URL: https://lwn.net/Articles/656307
[24] V. Buriachok, et al., Invasion Detection Model using Two-Stage Criterion of Detection of Network Anomalies, in: Workshop on Cybersecurity Providing in Information and Telecommunication Systems, vol. 2746 (2020) 23–32.
[25] S. Rouleau, Process Monitor Hands-On Labs and Examples (2008). URL: https://blogs.technet.microsoft.com/appv/2008/01/24/process-monitor-hands-on-labs-and-examples/
[26] F. Kipchuk, et al., Assessing Approaches of IT Infrastructure Audit, in: IEEE 8th International Conference on Problems of Infocommunications, Science and Technology (2021). doi: 10.1109/picst54195.2021.9772181.
