Data-Centric Machine Learning Approach For Early Ransomware Detection and Attribution
Abstract—Researchers have proposed a wide range of ransomware detection and analysis schemes. However, most of these efforts have focused on older families targeting Windows 7/8 systems. Hence there is a critical need to develop efficient solutions to tackle the latest threats, many of which may have relatively fewer samples to analyze. This paper presents a machine learning (ML) framework for early ransomware detection and attribution. The solution pursues a data-centric approach which uses a minimalist ransomware dataset and implements static analysis using portable executable (PE) files. Results for several ML classifiers confirm strong performance in terms of accuracy and zero-day threat detection.

Index Terms—Cybersecurity, malware analysis, ransomware detection and attribution

I. INTRODUCTION

Ransomware operates by encrypting files on a host computer and demanding some form of payment to release the keys. This malware has become the most lucrative revenue source for cybercriminals, and many ransomware "families" have impacted a wide range of users. Moreover, numerous cyber-criminal affiliates are also offering ransomware-as-a-service (RaaS), further reducing the barrier to such extortion [1].

Ransomware follows a multi-stage "kill-chain" comprising reconnaissance, distribution, installation, communication, encryption, and extortion [2], [3]. To date, numerous designs have evolved with increasing levels of secrecy, speed, and complexity. For example, various methods have been used to breach systems (e.g., remote access, drive-by, and privilege escalation) and encrypt data in collaboration with command and control (C&C) servers. Data exfiltration has also been used to extort users (double extortion) [2]. As this threat continues to grow, surveys indicate that almost half of large corporations have experienced such attacks [4]. Windows ransomware is of particular concern as this operating system (OS) is still the most prevalent.

In light of the above, researchers have proposed a range of ransomware analysis solutions. Many of these schemes extract information from network traces or host files/logs to train advanced machine learning (ML) classifiers. However, most efforts have focused on a specific ransomware family or older families targeting dated Windows 7/8 systems. As such, these methods may not be applicable to the latest threats facing Windows 10/11 users. Hence there is a pressing need to detect new ransomware designs and classify them for improved mitigation, i.e., attribution. Preferably, ransomware should be tackled early in the kill-chain to minimize damage [1]. Since new ransomware releases will likely have fewer available samples, solutions must also operate effectively with smaller "minimalist" datasets. This requirement is very much in line with current trends in artificial intelligence (AI) to develop more focused "data-centric" solutions [5].

Accordingly, this paper presents a novel ML solution for ransomware detection and attribution using static analysis. First, a unique malware repository is built by collecting samples of some of the latest ransomware families, i.e., Babuk/Babyk, BlackCat, Chaos, DJVu/STOP, Hive, LockBit, Netwalker, Sodinokibi/REvil, and WannaCry (after 2017). Next, feature extraction is done using Windows portable executable (PE) format file information. Finally, several supervised ML classifiers are trained and tested on these extracted features, including support vector machines (SVM), random forest (RF), extreme gradient boosting (XGBoost), and feed-forward neural networks (FNN) [6]. Overall, this solution has very amenable run-times and can be integrated into network/host-based defenses to target ransomware early in the kill-chain (prevention).

This paper is organized as follows. Section II reviews some key studies on ransomware analysis. Next, Section III details the proposed ML-based framework, including dataset collection and feature extraction. Performance results are then presented in Section IV, followed by future work directions in Section V.

II. LITERATURE REVIEW

A range of ransomware analysis schemes have been proposed, and survey articles have detailed various (overlapping) taxonomies to classify these methods, e.g., static or dynamic analysis, network- or host-based, etc. [1]-[3]. These efforts are further reviewed here.
Static analysis examines executable files to detect artifacts of maliciousness, e.g., via author attribution, code/segment identification (de-anonymization), etc. [1]. Some common methods used here include binary code analysis (BCA), source code analysis via reverse engineering, and C&C server domain prediction [2]. For example, [7] specifies a multi-level framework to detect ransomware from raw binaries, assembly code, and libraries. ML classifiers are then trained with the extracted data, yielding detection rates around 90%. Meanwhile, [8] transforms code sequences into N-grams and extracts frequency-based features for classification. Results show detection rates around 91% for several ML classifiers (decision tree, RF, etc.). However, code-based analysis is very labor-intensive [9] and represents a more latent "post-infection" forensics approach.

Recent efforts have also used other static features to analyze ransomware. For example, [10] leverages image processing techniques to convert ransomware binary files into grayscale images and then performs texture analysis for feature extraction. Results for several ML classifiers show high accuracy (97%) for a small dataset with a mix of old and new ransomware (379 samples). However, this scheme imposes added computational burdens and does not consider benign applications. Meanwhile, [11] details another static analysis scheme which extracts entropy and image-based features to train a specialized Siamese NN classifier. Tests with a small dataset (about 1,000 samples and 10 families) show accuracy values in the mid-90% range but notably lower precision and recall rates (upper 70% range). Also, most of the ransomware families used here are older (mid-2010s) and benign applications are not considered.

Studies have also used static PE header file data for broader malware detection (not just ransomware). However, these efforts focus on detection and not attribution. For example, [12] collects many samples (over 100,000) from a repository called VX Heaven (now inactive) and trains ML classifiers using 7-10 extracted PE file features. Results show detection rates in the upper 90% range. Also, [13] extracts PE features from about 5,500 malware samples and 1,200 benign applications (early 2010s). Detection is done using a set of heuristics, achieving 95% accuracy. Finally, [14] extracts 9 PE file features (on sections, data directories, and entropy) from a dataset with 1,200 malicious and benign samples each. Results for several classifiers show 95% detection rates. However, these studies present no details on their malware datasets, most of which are over a decade old.

By contrast, dynamic analysis scans run-time actions and event sequences for ransomware activity. Specifically, dynamic network-based schemes examine packet traces for C&C communications, domain name service (DNS) queries, network storage access, etc. For example, [15] presents a detection system for Locky ransomware which uses traffic features to train classifiers and yields over 95% detection rates. Meanwhile, [16] analyzes server message block (SMB) protocol patterns to detect older ransomware (2015-2017). The NetConverse scheme [17] also uses ML methods to analyze host traffic for earlier threats and achieves high detection rates (over 95%). Finally, [18] uses deep learning to analyze network activity and classify abnormal operation in Windows 7. Results show high detection rates for several families (over 97%).

Meanwhile, dynamic host-based schemes monitor local system activity to detect ransomware, e.g., memory and file operations, application programmer interface (API) function calls, dynamic link library (DLL) calls, etc. For example, [19] uses a sandbox to track file encryption/deletion, persistent messages, etc. Results show 96% detection rates for older ransomware types (mid-2010s). Also, [20] presents a scheme to monitor and store encryption keys for ransomware detection and file recovery. Results show successful mitigation of 12 out of 20 families. Similarly, [21] scans input/output requests for ransomware activity and flags affected files. Studies have also considered ransomware "paranoia" behaviors, where malware tries to detect analysis environments and avoid fingerprinting/detection, e.g., [22] tracks API calls to expose such evasion.

Although the above works present some notable contributions, key concerns still remain. Foremost, studies have largely focused on older ransomware targeting Windows 7/8 systems (mid-2010s). Given the expanding nature of this threat, it is imperative to study newer families targeting Windows 10/11. However, there are few datasets here, and new malwares may have smaller sample sizes to analyze (a challenge for ML schemes). Hence effective "data-centric" [5] schemes are required for minimalist datasets. Finally, ransomware detection and attribution schemes must have amenable run-times and preferably target ransomware earlier, in the distribution/delivery stages, to minimize damage [2]. It is here that static analysis offers an expedient approach for tackling malicious payloads prior to infection. By contrast, dynamic analysis requires more in-depth examination of network or host activities over longer intervals in virtual environments. As a result, a static analysis solution is presented using PE format file analysis.

III. DATA-CENTRIC STATIC ANALYSIS USING ML

Fig. 1. Overview of static analysis ML framework for ransomware detection and attribution

The static analysis framework for ransomware detection and attribution (classification) is shown in Fig. 1 and comprises several stages. The first stage (Empirical Data Collection) builds an up-to-date repository of some of the latest Windows 10/11 ransomware threats (since 2017). Regular benign Windows-based applications are also added here to improve classifier performance. The second stage (Feature Selection/Extraction) processes raw executables to extract key features.
An efficient static analysis approach is proposed here using Windows PE format files. Finally, the last stage (ML Training/Testing) uses the feature datasets to train ML classifiers to detect and attribute ransomware. On a high level, this setup follows a well-defined ML approach, similar to that used in other studies. However, the novel contributions here include the collection of new ransomware datasets and the extraction of lightweight static feature sets. Further details are now presented.

TABLE I
EMPIRICAL DATASET

Family          Samples   Avg. Size   Avg. PE File Size
Babuk (Babyk)   140       0.19 MB     32.68 KB
BlackCat        120       3.91 MB     1,147 KB
Chaos           140       0.49 MB     35.2 KB
DJVu (STOP)     140       0.71 MB     66.2 KB
Hive            140       3.51 MB     403.9 KB
LockBit         140       1.30 MB     171.5 KB
Netwalker       140       0.26 MB     35.72 KB
Sodinokibi      140       0.30 MB     50.89 KB
WannaCry        140       7.62 MB     21.83 KB
Benign          2,000     26.86 MB    155.88 KB

A. Empirical Data Collection

As per Section II, existing studies on PE file analysis provide little/no details on their datasets, e.g., type of malwares, executable file sizes, collection time frames, percentage of ransomware, etc. Many of these malwares are old and related repositories are inactive [12]. Hence a new repository is curated for the latest ransomware families. Given the rapidly changing nature of the ransomware threat, it may be difficult to get sufficient samples of each family. Hence realistic "data-centric" ML frameworks must achieve good detection and attribution with minimalist datasets (perhaps only hundreds of samples). However, limited dataset size/diversity can also have a negative impact on classifier performance.

Many active repositories host malware executables, e.g., MalwareBazar, Triage, VirusShare, and VirusTotal. These sites provide varying degrees of access and usability, e.g., VirusTotal and VirusShare require registration to access uploads. Detailed cross-checking and analysis also shows notable duplication across portals, e.g., many Sodinokibi samples on MalwareBazar match those on Triage. There are also discrepancies between the number of samples for each family, e.g., DJVu is abundant whereas Babuk/Babyk and BlackCat are more scarce. Finally, some repositories (VirusShare and VirusTotal) do not organize or label their data, further complicating collection. Hence unlabeled data dumps have to be tediously analyzed using hashing and cross-checked with labelled samples. Overall, there is potential for a lack of diversity, even scarcity, of new ransomware samples.
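As an illustration of this curation step, the sketch below deduplicates a downloaded dump by SHA-256 digest and cross-checks it against labelled hashes. It is only a sketch of the general workflow implied above; the folder layout and the CSV of known hashes (a hypothetical sha256,family file) are illustrative assumptions, not artifacts from the paper.

```python
import csv
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def curate(dump_dir: str, labelled_csv: str):
    """Deduplicate an unlabelled dump and split it into labelled/unknown sets.

    labelled_csv is assumed to hold two columns: sha256, family.
    """
    with open(labelled_csv, newline="") as f:
        known = {r["sha256"].lower(): r["family"] for r in csv.DictReader(f)}

    seen, labelled, unknown = set(), [], []
    for exe in sorted(p for p in Path(dump_dir).iterdir() if p.is_file()):
        digest = sha256_of(exe)
        if digest in seen:        # duplicate across portals -> skip
            continue
        seen.add(digest)
        if digest in known:       # matches a labelled sample
            labelled.append((exe.name, known[digest]))
        else:                     # needs manual analysis or portal lookup
            unknown.append(exe.name)
    return labelled, unknown
```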
In light of the above, a smaller "minimalist" data repository is curated with 9 active ransomware families, i.e., Babuk/Babyk, BlackCat, Chaos, DJVu/STOP, Hive, LockBit, Netwalker, Sodinokibi/REvil, and WannaCry (Table I). These families are amongst the most prevalent ransomware threats in 2022, as per the IBM X-Force Threat Intelligence Index, i.e., LockBit (17%) followed by WannaCry (11%) and BlackCat (9%). A total of 140 unique executables are collected for each family, except for BlackCat which only yielded 120 samples due to scarcity, i.e., a total of 1,240 malicious samples. Many Windows 10/11 applications are also added to construct a benign class (2,000 samples). These programs are collected from a range of websites and include system utility, entertainment, and productivity tools (Fig. 1).
Overall, having a large set of non-malicious training data is very beneficial since regular application downloads will exceed (unintended) ransomware downloads. This addition contrasts with work in [10], [11].

B. Feature Selection/Extraction

ML classifier performance is heavily dependent upon input training data. Hence feature extraction (engineering) plays a vital role in transforming raw executables to generate meaningful information for classifiers [6]. As per Section II, static analysis is more expedient for tracking ransomware early in its kill-chain. Hence this strategy is applied to Windows PE format files, which contain data structures to support program execution in 32-bit and 64-bit Windows OS environments. Namely, these files use the common object file format (COFF) and contain information for the OS loader to set up and run the wrapped executable code (including memory mapping and permissions). For example, a PE format file has several initial lead-in headers along with multiple sections. Here each section specifies file content (i.e., code or data) and also contains its own section header.

As per Section II, studies on PE format files have considered a range of malwares for Windows 7/8 [12]-[14] (mostly unspecified and not necessarily ransomware). Hence there is a further need to extend such analysis to Windows 10/11 ransomware threats. Now PE files contain a wealth of information, and programs can have unique non-overlapping parameters (depending upon functionality). Hence when extracting PE format data, it is important to select a subset of parameters which exist across all sample files and also exhibit good variability.

In light of the above, PE files are generated for all executables, with the resultant sizes shown in Table I. A total of 4 datasets are built by extracting feature vectors with 5, 7, 10, and 15 parameters, labeled as Datasets 1-4, respectively (Fig. 1). Each successive vector expands upon its predecessor by adding new parameters. The exact parameters are chosen using careful experimentation with the Image File Header, Image Optional Header, and Image Section Header sections. Some key features include NumberOfSections, SizeOfCode, SizeOfHeaders, etc. Note that PE files also contain information on dynamic-link library (DLL) calls, which are indicative of functionality. For example, ransomware typically calls encryption, socket communication, and registry-modification functions. Hence the total number of DLL calls is also added to the 10- and 15-parameter feature vectors (TotalDLLCalls, Fig. 1). Note that this is a computed feature and not an extracted parameter.
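A minimal sketch of this kind of extraction is given below using the open-source pefile library in Python. The specific fields shown, and the import-based TotalDLLCalls count, are illustrative choices under the above description rather than the paper's exact 5/7/10/15-parameter vectors.

```python
import pefile  # pip install pefile

def extract_features(path: str) -> dict:
    """Parse a Windows executable and pull a few PE header fields.

    The selected fields are examples only; the paper's Datasets 1-4 draw
    their parameters from the Image File/Optional/Section headers.
    """
    pe = pefile.PE(path)
    features = {
        "NumberOfSections": pe.FILE_HEADER.NumberOfSections,
        "SizeOfCode": pe.OPTIONAL_HEADER.SizeOfCode,
        "SizeOfHeaders": pe.OPTIONAL_HEADER.SizeOfHeaders,
        "SizeOfImage": pe.OPTIONAL_HEADER.SizeOfImage,
        "DllCharacteristics": pe.OPTIONAL_HEADER.DllCharacteristics,
    }
    # Computed feature: total number of imported DLL functions.
    total_calls = 0
    for entry in getattr(pe, "DIRECTORY_ENTRY_IMPORT", []):
        total_calls += len(entry.imports)
    features["TotalDLLCalls"] = total_calls
    pe.close()
    return features
```

In practice, each sample's feature dictionary would be appended to a Pandas DataFrame and labelled by family to form the feature datasets.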
IV. PERFORMANCE EVALUATION

The static analysis framework is now evaluated using the data repository from Section III-A. Namely, feature vectors extracted from the PE files are labelled to generate input datasets. These datasets are then used to train/test supervised ML classifiers, i.e., SVM, RF, XGBoost, and FNN [6] (Fig. 1). All evaluation is done using the Keras and TensorFlow toolkits, as well as Pandas and Sklearn. As per Section III-A, a total of 9 malicious ransomware families are evaluated along with a set of benign applications, i.e., 10 classes. As noted earlier, there are a total of 1,240 malicious samples (140 samples for each family except BlackCat, which has 120 samples). The samples for each class are further partitioned to generate separate training and testing pools. Namely, 20 random samples of each class are selected for testing and the remainder are used for training, i.e., 120 training samples for all classes except BlackCat, which only has 100. Furthermore, 1,700 benign samples are selected for training and the remaining 300 samples are used for testing. This partitioning reflects an approximate 85/15 training/testing split. All results are averaged over 100 trial runs, with each using a different randomized 85/15 partitioning of the datasets. Detailed findings are now presented.
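The per-class partitioning and trial averaging could be sketched as follows (Python, with scikit-learn shown for the RF case; the SVM, XGBoost, and Keras FNN models would be swapped into the same loop). The feature matrix X, label vector y, and hyperparameters are placeholder assumptions, not values from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def split_per_class(y, test_counts, rng):
    """Hold out a fixed number of test samples per class; the rest train."""
    train_idx, test_idx = [], []
    for label, n_test in test_counts.items():
        idx = rng.permutation(np.where(y == label)[0])
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

def run_trials(X, y, n_trials=100):
    """Average multi-class accuracy over repeated randomized partitions.

    X: (n_samples, n_features) PE feature matrix (NumPy array).
    y: labels 0-8 for the ransomware families, 9 for the benign class.
    """
    test_counts = {c: 20 for c in range(9)}   # 20 test samples per family
    test_counts[9] = 300                      # 300 benign test samples
    scores = []
    for trial in range(n_trials):
        rng = np.random.default_rng(trial)    # fresh random split each trial
        tr, te = split_per_class(y, test_counts, rng)
        clf = RandomForestClassifier(n_estimators=100, random_state=trial)
        clf.fit(X[tr], y[tr])
        scores.append(accuracy_score(y[te], clf.predict(X[te])))
    return float(np.mean(scores))
```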
Fig. 2. Average multi-class accuracy (100 trials)

The average accuracy values (over all runs) are plotted for different feature vector sizes in Fig. 2, i.e., multi-class attribution. Results show improved performance for all schemes with increasing feature vector sizes. In particular, the SVM and FNN classifiers give the best improvement, with accuracy gains of 15-20%. Conversely, the RF and XGBoost classifiers have much lower gains as feature vector sizes increase from 5 to 15 parameters, i.e., in the 0.5-1.5% range. These two classifiers also give the best accuracy (94-96% range). However, the FNN scheme approaches these methods with 15 features, i.e., 91% accuracy. These findings are very encouraging given the relatively small-sized training datasets and feature vectors used. The results also match those for other schemes using much heavier feature extraction and ML algorithms, e.g., image and entropy-based features, deep NN algorithms, etc. [10], [11].
Next, consider attribution errors in more detail. Indeed, mis-classifying ransomware as benign is much more harmful than mis-classifying it as the wrong type of ransomware, since such errors can allow malware to bypass network or host defenses and infect host machines. Hence to quantify this behavior, a modified ransomware detection rate (RDR) is defined as:

    RDR = T_rs / (T_rs + F_rs)    (1)

where T_rs is the total number of ransomware samples classified as (any class of) ransomware, and F_rs is the total number of ransomware samples mis-classified as benign, i.e., the total number of ransomware test samples is (T_rs + F_rs). This metric essentially captures the binary detection capability of a multi-class classifier and is similar to the recall formula, i.e., it tracks false negatives. A benign detection rate (BDR) is also defined as:

    BDR = T_bn / (T_bn + F_bn)    (2)

where T_bn is the total number of benign samples classified as benign, and F_bn is the total number of benign samples mis-classified as ransomware. In general, though, mis-classification of benign executables (false positives) is less of a security concern.
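Both rates follow directly from a multi-class confusion matrix, as in the sketch below (Python/NumPy). The assumption that the benign class is the last label, and the function and variable names, are illustrative rather than taken from the paper.

```python
import numpy as np

def rdr_bdr(conf, benign_idx=-1):
    """Compute RDR (Eq. 1) and BDR (Eq. 2) from a multi-class confusion matrix.

    conf[i, j] counts test samples of true class i predicted as class j;
    every class except benign_idx is a ransomware family.
    """
    conf = np.asarray(conf)
    benign_idx = benign_idx % conf.shape[0]
    ransom = [i for i in range(conf.shape[0]) if i != benign_idx]

    # RDR: ransomware samples flagged as *any* ransomware class.
    t_rs = conf[np.ix_(ransom, ransom)].sum()
    f_rs = conf[ransom, benign_idx].sum()
    rdr = t_rs / (t_rs + f_rs)

    # BDR: benign samples kept out of the ransomware classes.
    t_bn = conf[benign_idx, benign_idx]
    f_bn = conf[benign_idx, ransom].sum()
    bdr = t_bn / (t_bn + f_bn)
    return rdr, bdr
```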
The RDR results are plotted in Fig. 3. As with the accuracy results, larger feature vector sizes give smaller improvements with the RF and XGBoost classifiers, i.e., in the 2% range. By contrast, the SVM and FNN schemes give very poor results for small feature vectors, with ransomware mis-classification rates (1-RDR) around 50%. These classifiers are also very sensitive to feature vector size. Nevertheless, the FNN scheme still approaches the performance of the RF and XGBoost schemes with larger feature vectors, i.e., 92% RDR. The BDR results are also plotted in Fig. 4. As expected, these values are higher than the RDR values since a larger amount of benign data is used for training. Again, the RF and XGBoost schemes give the lowest benign program mis-classification rates, with BDR values close to 99%. Although the other methods (SVM, FNN) give slightly lower BDR rates, they are still over 92% (less than 1 error in 12). Note that these binary detection rates closely match those from other malware detection studies which make use of much larger datasets and more elaborate feature extraction schemes (Section II).

Meanwhile, Fig. 5 shows an average confusion matrix for the XGBoost classifier (classes 0-8 represent the 9 ransomware families and class 9 represents the benign class). Here, the numbers in row 9 are larger as there are more benign test samples. These results confirm that most samples are classified correctly, i.e., the diagonal entries dominate. Moreover, even when ransomware samples are mis-classified, they are mostly flagged as another ransomware family (mirroring the RDR results in Fig. 3).