


Complex & Intelligent Systems

https://doi.org/10.1007/s40747-021-00560-1

ORIGINAL ARTICLE

PROUD-MAL: static analysis-based progressive framework for deep unsupervised malware classification of windows portable executable
Syed Khurram Jah Rizvi1,3 · Warda Aslam1 · Muhammad Shahzad1 · Shahzad Saleem1,4 ·
Muhammad Moazam Fraz1,2

Received: 30 January 2021 / Accepted: 24 September 2021


© The Author(s) 2021

Abstract
Enterprises are striving to remain protected against malware-based cyber-attacks on their infrastructure, facilities, networks
and systems. Static analysis is an effective approach to detect the malware, i.e., malicious Portable Executable (PE). It performs
an in-depth analysis of PE files without executing, which is highly useful to minimize the risk of malicious PE contaminating
the system. Yet, instant detection using static analysis has become very difficult due to the exponential rise in volume and
variety of malware. The compelling need of early stage detection of malware-based attacks significantly motivates research
inclination towards automated malware detection. The recent machine learning aided malware detection approaches using
static analysis are mostly supervised. Supervised malware detection using static analysis requires manual labelling and human
feedback; therefore, it is less effective in rapidly evolutionary and dynamic threat space. To this end, we propose a progressive
deep unsupervised framework with feature attention block for static analysis-based malware detection (PROUD-MAL). The
framework is based on cascading blocks of unsupervised clustering and a feature attention-based deep neural network. The
proposed deep neural network embedded with feature attention block is trained on the pseudo labels. To evaluate the proposed
unsupervised framework, we collected a real-time malware dataset by deploying low and high interaction honeypots on
an enterprise organizational network. Moreover, an endpoint security solution was also deployed on the enterprise organizational
network to collect malware samples. After post-processing and cleaning, the novel dataset consists of 15,457 PE samples
comprising 8775 malicious and 6681 benign ones. The proposed PROUD-MAL framework achieved an accuracy of more
than 98.09% with better quantitative performance in standard evaluation parameters on collected dataset and outperformed
other conventional machine learning algorithms. The implementation and dataset are available at https://bit.ly/35Sne3a.

Keywords Unsupervised classification · Progressive learning · Malware detection · Static analysis · Feature attention

✉ Syed Khurram Jah Rizvi
srizvi.phd15seecs@seecs.edu.pk; syed.rizvi@warwick.ac.uk

1 National University of Sciences and Technology (NUST), Islamabad, Pakistan
2 The Alan Turing Institute, London NW1 2DB, UK
3 Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK
4 Department of Computer Science, University of Jeddah, Jeddah, Kingdom of Saudi Arabia

Introduction

Cybersecurity has become a pressing concern for enterprises across the globe, given the sensitivity of information as their most valuable asset. In this information age, organizations are facing an ever-expanding and sophisticated malware-based cyber threat spectrum. Malware is an abbreviated form of “malicious software”: a set of instructions intended to bring fatal damage to enterprises, infrastructure, industrial processes and digital systems. It is a lethal cyber weapon used for unauthorized access, cyber espionage, cyber terror, identity theft, data exfiltration or corruption, service interruption or failure, holding data hostage for ransom, etc. In 2018, more than 430 million unique samples of malware were detected, an annual increase of 36% [1]. Significant annual increases of 25% and 1000% were also observed in the use of malware and malicious PowerShell scripts, respectively. According to Kaspersky Labs (2019), more than 100 million different hosts were attacked by mid-2019. In 2018, more than 889,452 internet banking users were targeted by banking Trojans, an increase of 15.9% in comparison with the previous year. According to a recent analysis by Juniper Research, the financial impact of data breaches will increase by 11% per year and will reach a level from $3 trillion to $5 trillion in 2024. Therefore, it is the utmost requirement of every business to protect its information-based assets, since even a single attack can result in critical data loss. There are several classes of malware including Ransomware [11], Trojan [14], Key Logger [3], Backdoor [21], Launcher [13], Remote Access Toolkits (RAT) [33], Spam-Sending malware [34], etc.

The approaches for malware detection are either signature-based [2] or behavior-based [31]. The first approach is good for identification of known attacks without producing an overwhelming number of false alarms [3], but it requires frequent manual updates of the database with rules and signatures. The latter approach, on the other hand, can be used to generalize signatures related to host and network that are used to identify the presence of an unwanted piece of code or activity on victim computers or networks. The use of packers [46], encryption [5], polymorphism [31] and obfuscation [28] techniques can easily bypass signature-based detection, as it only performs pattern or string matching [11]. Behavior-based [36] approaches, which focus on pattern identification including file activity, registry activity and API calls [8], are based upon either static [7] or dynamic analysis [6]. The latter form of analysis requires execution of the malicious code [35] in a controlled setup, i.e., a sandbox, and is often slow, resource intensive and not suitable for deployment in the production environment, as also discussed in [22]. Moreover, due to the geometric rise in zero-day malware, existing approaches have become less efficient for detection of zero-day attacks, and there is a dire need for an automated malware detection and classification system equipped with machine learning techniques [9]. Machine learning can be either supervised or unsupervised: supervised learning, or discriminative deep architectures, conducts the training over labelled data, i.e., classification, regression or predictive analytics, whereas unsupervised learning, or so-called generative architectures, draws inferences from datasets consisting of input data without labels [43].

Keeping in view the ongoing huge growth in the number of malware samples, the time complexity of malware analysis, the acute shortage of domain experts and the demand for earliest detection, considerable research on machine-learning-based techniques is being conducted for automated malware analysis and classification [10, 19], but most of the static analysis-based approaches are supervised in nature. The availability of an updated malware dataset along with the labels is also a major hurdle for malware analysts. The aforementioned limitations and gap motivated the development of an automated unsupervised malware analysis system that investigates portable executables to make a classification decision based on static analysis. Moreover, it is essential to have a suitable representation of feature vectors to make a decision regarding malware classification. This paper proposes a progressive deep unsupervised framework (PROUD-MAL) for classifying Windows PE files using static analysis of executables. The major contributions are described as follows:

(a) The purpose of the research is to present a framework for unsupervised classification of Portable Executables (PEs) using static features. We term this framework PROUD-MAL. To this end, we propose a two-phase cascaded formulation of progressive unsupervised clustering followed by an attention-based deep neural network for static feature-based malware classification.
(b) Moreover, it is worth mentioning that attention models have shown promising outputs in various domains such as image analysis and natural language processing but, to the best of our knowledge, there is no research on applying the attention-based mechanism for malware classification using unsupervised clustering over static features. To this end, we propose a Feature Attention-based Neural Network (FANN) architecture for malware classification. The Attention Block (AB) considers the correlation of a feature to the target or to other features, besides considering feature weight. It puts relatively more weight on features that contribute more to minimizing the validation loss and maximizing the classification accuracy.
(c) We also collected a novel real-time malware dataset comprising 15,457 (25 GB) PE samples collected over a period of seven (07) months (200 days). The novel dataset was collected by deploying low and high interaction honeypots as well as an enterprise endpoint security solution over a large organizational network. It is publicly available for the research community.
(d) The quantitative assessment reflects that the proposed model achieved superior performance and outperformed state-of-the-art supervised as well as unsupervised approaches. The high classification accuracy demonstrates the significance and utility of the proposed framework.

The rest of the paper is organized as follows: section “Background and context” describes the background and structure of windows-based PEs. Section “Related work” narrates the related work. Section “Methodology and architecture” describes the dataset acquisition, data pre-processing, feature extraction and the proposed framework, i.e., PROUD-MAL, followed by the FANN architecture. Section “Experiments and results” narrates the implementation details including the experimental setup, results obtained and discussion. Section “Conclusion and future direction” narrates the concluding remarks followed by the future direction.


Background and context

Malware can be an executable or a non-executable binary and its classification is based on either dynamic or static analysis. The former approach involves the execution of a PE in a controlled environment to study its behavior, including auto-start extensibility points, function calls and parameter analysis, and data flow tracking [11], but it is more time consuming and computationally expensive; therefore, the adoption of dynamic analysis in a production environment is not favored. Static analysis includes source code inspection [12] without any execution in a controlled setup. It involves decompression/unpacking of the PE, if it is encoded by a third-party packer [11], and disassembling for the purpose of obtaining the code residing in memory [14]. Disassembler and memory dumper software packages, e.g., OllyDump and LordPE, can be utilized. A windows-based PE file can be an executable, a Dynamic Link Library (DLL) [13] or object code, and it inherits many features from the Unix-specific Common Object File Format (COFF). The PE content is semantically structured [21], which is important to understand for good analysis. The format is supported by various architectures including Intel, variants of ARM as well as AMD instruction sets. The PE has numerous predefined blocks including a number of headers and sections. Each section has a header that provides information regarding its address and size. The predefined blocks are explained as follows:

(a) DOS Header: Identifies the file as an executable binary and is also called the MZ header. It provides the four-byte offset address of the PE header.
(b) DOS Stub: Small embedded program that displays an appropriate message whenever there is an attempt to run a PE file in DOS.
(c) PE File Header (Signature): Defines an executable file as PE. It provides information about machine compatibility, number of sections/symbols, compiler time-stamp and size of the optional header, i.e., the next unit.
(d) Optional Header or Image Optional Header: Mandatory, contrary to its title. It provides details including entry point address, OS version, image and data base, image and subsystem version, the version of the linker and the size of the code, initialized and uninitialized data, stack and heap.
(e) Data Directories: Successor of the Optional Header. It gives details about directories including export, import, exception, relocation table, global pointer, debug and load configuration.
(f) Section Table & Header: Preceded by the data directories. It provides PE file attributes, instructions to load the PE in memory, virtual address, section name, characteristics and size of raw data, etc.
(g) Sections: Contain executable code, resources and operands for the PE, unlike headers, which provide information about the executable. There are nine predefined sections. The names and description of each section are listed in Table 1. All sections may not be present in a PE. A missing .idata section does not mean there is no import table, as it may be in the .data or .edata section.

Table 1 PE sections

Name      Description
.text     Executable code
.bss      Uninitialized and static variables
.rdata    Read only data, i.e., constants, string literals
.data     Local and global variables except the automatic variables
.rsrc     Read only information, i.e., resource name, type of icon, bitmap, dialog box
.edata    Export data for an exe or DLL
.idata    Functions imported by an exe or DLL, import directory, import address table
.pdata    Array of function table entries used for exception handling
.debug    Compiler-generated debug information

Related work

In the literature, several approaches for malware detection based on machine learning techniques have been proposed. Some of the research work based on machine learning algorithms specific to PE file malware classification is discussed here. Malware is a set of instructions developed to bring harmful consequences to organizations, their processes, networks as well as infrastructure. Malware can be an executable or a non-executable entity and its detection is based on either static or dynamic analysis. In 2001, a machine learning framework was proposed for classification of PE files using static analysis and data mining techniques for the extraction of string and byte sequence features from PE files [15]. In 2009, researchers [16] extracted 5-gram byte sequences from the file header and applied a term frequency-inverse document frequency approach for classification. In 2013, a malware detection system was proposed for analysis of PE files using byte sequences, alternatively known as n-gram sequences, an approach that is less efficient and computationally expensive [17]. In 2015, researchers [18] proposed a heuristic-based detection technique for metamorphic malware using static features for PE analysis. In the proposed model, the file was disassembled using IDA Pro to extract the features. Multiple classification algorithms (j48, j48graft, LADTree, NBTree, Random Forest, REPTree) were used for analysis and classification of PE files. It was highlighted by the
researchers that the classification accuracy depends on the model applied as well as the disassembler chosen. In 2018, it was also shown that a machine learning model can learn from a sequence of raw bits without the explicit feature extraction of conventional malware classification practice [19]. The use of machine learning-based classifiers for malware intrusion detection is a well-known approach for network analysis [25]. In addition to string extraction, researchers [30] have also used statistical approaches such as raw byte and byte entropy histograms. In [20], researchers presented an approach using static analysis of the features from the PE Optional Header fields by employing the Phi (φ) coefficient and Chi-square (KHI2) score. In [23], features were extracted from system calls and submitted to a neural network for classification using 170 samples, obtaining 0.96 for Area under Curve (AUC). In [24], experiments were performed to identify the critical point at which to quarantine the activity of malicious code related to its communication with a remote command and control server. Researchers [26] presented a framework that ensures the protection of application programs against malware for mobile platforms. In 2017, researchers used static analysis to extract key information, i.e., header strings and sequences, from the metadata of PE files. The model was trained over a dataset of 4783 samples using Random Forest and achieved 96% accuracy. The researchers in [42] designed a malware classification method for several malware variants based on signature prediction. The proposed solution was based on the static analysis of features including strings, n-grams and hashes extracted from the PE header. In [27], the researchers proposed a malware detection system based on supervised learning. They devised a tool for feature extraction from the header of PE files. Later, the system was trained using supervised machine learning classification algorithms such as Support Vector Machine and Decision Trees. In [47], the authors proposed Virtual Machine Introspection, a machine learning-based approach for malware detection in virtualized environments. The researchers extracted opcodes using static analysis and trained the classifier with selected features. Later, Term Frequency-Inverse Document Frequency (TF-IDF) and Information Gain (IG) were also applied as classification algorithms. In 2019, researchers [29] proposed a malware detection approach in the IoT environment based on a similarity hashing algorithm. In the proposed technique, scores of binaries were calculated to identify the similarity between malicious PEs. Numerous hashing techniques [21] including PEHash, Imphash and Ssdeep were used. Later, the researchers integrated the hash results using fuzzy logic. Recently, attention models have shown promising output in tasks such as image analysis, machine translation, computer vision and natural language processing [32]. The attention mechanism helps the model to focus on the most relevant features as required. Therefore, we employed the attention-based mechanism over static features using unsupervised clustering for malware classification.

Methodology and architecture

In this section, the design of our proposed unsupervised framework, i.e., PROUD-MAL, for classifying windows-based PE files using clustering based on static analysis is explained. PROUD-MAL is a custom-built unsupervised framework composed of multiple modules including novel dataset collection, dataset pre-processing and feature extraction, and unsupervised clustering of the malicious and benign PE samples, as illustrated in Figs. 1 and 2. Moreover, the designed Feature Attention-based Neural Network (FANN) is trained over pseudo labels. The proposed classifier is evaluated over the test dataset, which was kept hidden during training, as depicted in Fig. 3.

Fig. 1 Malware dataset collection and pre-processing

Fig. 2 Unsupervised framework for windows-based PE malware classification

Malware dataset acquisition

The first stage of the proposed framework is the indigenous dataset collection. In this research work, a pilot attempt is made to perform the dataset collection, including the malware and benign samples, which will be extended as future research work. A major obstacle in leveraging machine learning techniques for malware analysis is the lack of sufficiently big, labelled datasets that contain malicious as well as benign samples. Moreover, it is very important to keep updating the malware dataset due to the ever-changing smart evasion approaches adopted by malware authors. The collection of malicious samples was difficult, but the collection of benign samples was also not easy. To this end, we used two (02) different approaches for collecting the malicious and benign samples, as illustrated in Fig. 1. First, we deployed low and high interaction honeypots as production units and intentionally configured them in a vulnerable way to collect malicious files and log unauthorized behavior. The low interaction honeypot, i.e., Honeyd [34], as well as the high interaction honeypot, i.e., SMB Honey Pot [4], were deployed over the enterprise organizational network to emulate the services frequently targeted by attackers and the production systems, respectively. Second, the Kippo-Malware collector and the Kaspersky endpoint security solution were also deployed over the enterprise organizational network to collect malware as well as benign samples. Benign PE files, including .exe and .dll files, were also collected from machines with licensed and updated versions of the Windows operating system including Windows XP, 7, 8 and 10. Special precautions were taken for compliance with licensing and regulatory requirements while collecting benign samples. Moreover, additional precautionary measures such as establishment and configuration of a sandbox environment for dataset collection and further processing were also taken into consideration. We collected 19,000 samples (31 GB) over the period of seven months (200 days) but, after performing dataset verification, the samples were reduced to 15,457 (25 GB) PE samples comprising 8775 (17 GB) malicious and 6681 (8 GB) benign samples. The reduction in the number of samples resulted from filtering of corrupt and duplicate samples. The verification of samples and removal of duplicates were done using the hash values. The dataset is divided in a 60:20:20 ratio for training, validation and testing, respectively, of the proposed model.
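The duplicate removal and the 60:20:20 split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: MD5 is assumed as the hash function, and the helper names (`md5_digest`, `deduplicate`, `split_60_20_20`), the fixed shuffle seed and the in-memory `(name, bytes)` sample representation are all hypothetical.

```python
import hashlib
import random


def md5_digest(data: bytes) -> str:
    """Return the hex MD5 digest, used here as the duplicate key (assumed hash)."""
    return hashlib.md5(data).hexdigest()


def deduplicate(samples):
    """Keep the first (name, data) pair seen for each distinct digest."""
    seen = set()
    unique = []
    for name, data in samples:
        digest = md5_digest(data)
        if digest not in seen:
            seen.add(digest)
            unique.append((name, data))
    return unique


def split_60_20_20(items, seed=0):
    """Shuffle with a fixed seed, then split into train/validation/test parts."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 60 // 100  # integer arithmetic avoids float rounding
    n_val = len(items) * 20 // 100
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test part
    return train, val, test
```

In practice the samples would be read from disk rather than held in memory, and a stratified split (keeping the malicious-to-benign ratio in each part) may be preferable; the text does not state which variant was used.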


Fig. 3 PROUD-MAL validation

Data pre-processing and feature extraction

To prepare the dataset, a series of pre-processing steps was performed, i.e., identification of file type, removal of corrupt and duplicate samples, unpacking of the binaries and verification of labels, to transform the raw data into a meaningful format. It was ensured, using MD5, that the dataset does not contain any duplicate binaries. It was also ensured that only unpacked binaries are submitted for feature extraction; therefore, section names were examined using the tool PEStudio [45] to see if any of them indicates a popular packer [46] such as UPX, ASPack or FSG.

Moreover, verification of labels is a significant activity, which was performed by deploying signature-based anti-virus solutions in parallel and finally using the cloud-based service of VirusTotal. We used the VirusTotal API [44] as well as the VirusTotal web interface to submit the binaries for verification. The VirusTotal API does not require the web interface for file submission. It is pertinent to mention here that labelling samples in datasets of text, images or speech is a relatively easy task, but labelling and verifying whether a sample is benign or malicious was a very time-intensive task. Handling the malicious files needs extra precautionary measures such as establishment and configuration of a sandbox environment. During the process, findings were observed such as the existence of overlapping segments, usage of non-standard version details and section names, and zero size of raw data, which also results in a high virtual size of the section in the case of packed PE files. It was also observed that some packers attempt to reduce entropy by embedding zero bytes in data to bypass screening. Moreover, in malicious files, the data section is missing or has a relatively lower value (if present), and the permissions assigned to the section are found to be inconsistent with standard practices. It was also observed that resource size is relatively small, as malicious files are mostly non-GUI. The study of compilation time revealed that malware is mostly compiled on non-working days and also does not have a genuine creation time. After the dataset preparation and pre-processing, feature extraction was performed. Features were extracted by parsing the headers of Portable Executables (PEs). A custom parser was developed to read PE headers and tokenize the features and their respective values. Finally, the tokens are organized in a CSV-compatible format. More than 35 features were extracted; below is a brief description of selected features.

• MD5 is a cryptographic signature. It is a 128-bit value, written as 32 hexadecimal characters, and each file has its unique MD5 value.
• Machine represents the target machine such as Intel 386, MIPS little endian, Motorola, etc.
• Size of optional header is a mandatory feature irrespective of the name and provides information related to the PE. It is included only for executable files and not for object files.
• Characteristics represent attributes of the file such as base relocation address, local symbol, user program or system file, little-endian or big-endian architecture, or whether the file is a DLL or not, etc.
• Major/Minor Linker Version is the version number of the linker, placed in the header of the .dll or .exe file.
• Code Size represents the size of the code (text) section.
• Size of Uninitialized Data is the size of the uninitialized data section.
• Address of Entry Point is the address where the PE loader will begin execution; this address is relative to the image base of the executable. It is the address of the initialization function for device drivers and is optional for DLLs.
• Base of Code is the pointer to the beginning of the code section, which is relative to the image base.
• Image Base is the preferred address of the first byte of the executable when it is loaded in memory.
• Section Alignment: The address assignment to the PE requires section loading. The section alignment is set to 0x2000. This means that the code section starts at 0x2000 and the section after that starts at 0x4000.
• File Alignment: Just like the section alignment, the data also needs to be loaded. It is set to 512 bytes or 0x200.
• Major/Minor Operating System Version is the version supported by the PE.
• Major Image Version is the major version number of the image.
• Size of Image is the size of the executable after being loaded into memory. It must be a multiple of the section alignment.
• Size of headers represents the size of all headers, i.e., the PE header, the optional header and the DOS header.
• Checksum is used for file validation at load time and to confirm whether a file is undamaged or has been corrupted.
• Sub System: This field points to the user interface type required by the operating system.
• Size of Stack Reserve is the number of bytes allocated for the stack and determines the stack region utilized by threads.
• Size of Stack Commit is the amount of memory committed to the stack at startup.
• Size of Heap represents the space to reserve for loading.
• Loader Flags informs, upon loading, whether to break on loading, debug on loading or use the default.
• Number of RVA is the number of relative virtual addresses in the rest of the optional header. Each entry describes a location and size. The structures contain critical information about specific regions of the PE file.
• Load Configuration size is usually used for exceptions. It is only utilized in Windows NT, 2000 and XP.
• Section Minimum/Maximum/Mean Entropy of a specific file is represented using numeric values and is used to check whether a file is packed or not. Higher entropy usually means that the file is malicious.

PROUD-MAL

The PROUD-MAL framework is a progressive unsupervised framework for malware classification based on static analysis of executables. To this end, an architecture with a two-phase cascaded formulation of unsupervised clustering with an attention-based deep neural network is proposed. As 80% of the dataset was unlabeled, k-means clustering was employed for prediction of pseudo labels. Subsequently, the deep neural network was trained using the pseudo labels by applying attention over the input features. The trained model was then tested over the test dataset against the standard performance metrics.

Unsupervised clustering

Clustering is a ubiquitous and widely accepted instrument for the categorization of data, with a diversity of application domains including medical imaging, natural language processing, biotechnology and cyber security, etc. It is used for data exploration, where the objective is to learn from data that is not well defined or understood [41]. Several algorithms are available but, in this research work, unsupervised clustering is performed by applying the k-means algorithm with the motivation of finding a representative, stable clustering solution, which can be further utilized for classification as per the framework architecture. The cluster prediction using the unsupervised formulation, i.e., k-means clustering, is depicted in Fig. 4b. Keeping in view the nature of the subject problem, specifically, if the number of classes is known in advance, it is intuitive to initialize the value of k equal to it. However, we still validated the value of k using the elbow method, which gives an empirical validation regarding the appropriate selection of the number of clusters in a dataset, as also depicted in Fig. 4b. Nevertheless, if such information is not known in advance, other clustering algorithms, e.g., mean-shift [20] or unsupervised deep embedding [21], etc., can be considered more appropriate. Therefore, the extracted features $F = \{f_1, f_2, \ldots, f_N\}$ are submitted to the k-means algorithm, which clusters the similar features (i.e., the corresponding binaries). Using k-means allows us to obtain a set $C = \{c_1, c_2, \ldots, c_k\}$ of $k\ (= M)$ cluster centroids by keeping the following optimization function at a minimum:

$$C \leftarrow \arg\min_{c} \sum_{i=1}^{N} \sum_{j=1}^{M} \lVert f_i - c_j \rVert^{2}, \qquad (1)$$

where $c_j$, $j = 1, \ldots, M$, represents the M cluster centroids. k-means clustering iteratively optimizes a Euclidean clustering objective with a self-training distribution to obtain the predicted clusters. This progressive clustering is important to refine the obtained pseudo labels, to optimize the model classification accuracy and, subsequently, convergence. It also helps to reduce incorrect assignments, which make the model more vulnerable to getting stuck in a bad local optimum. Moreover, the visualization of clustering performance using silhouette analysis and the elbow method is also illustrated in Fig. 4a and b, respectively.

Fig. 4 a k-mean clusters of PE binaries. b Elbow score for cluster prediction

Fig. 5 Feature attention-based neural network (FANN)’s architecture

Classification using feature attention-based neural network

For the purpose of malware classification in the PROUD-MAL framework, we designed a Feature Attention-based Neural Network (FANN) to learn the patterns within a dataset. The FANN is designed to learn feature representation without ground-truth cluster membership labels and is trained over pseudo labels. The pseudo labels are obtained using k-means clustering, which iteratively optimizes a Euclidean objective function with a self-training distribution. The FANN comprises an input layer, an output layer, an Attention Block (AB) and three hidden layers, as illustrated in Fig. 5. All layers are densely connected. A feature vector is input to the FANN and is fed forward through the densely connected layers. The first hidden dense layer contains 38 neurons, equal to the number of static features, and uses the rectified linear unit (ReLU) as its activation function. The output of the first hidden layer is propagated to the embedded AB. The proposed Attention Block (AB) encodes contextual information by probing feature weights and results in a more refined representation by focusing on features of interest. The AB consists of two parallel attention networks/layers. Each network computes the attention for a feature subsequence of a PE instance and also incorporates prior knowledge to predict new weights. The attention mechanism is discussed in more detail in the following subsection. The third and fourth layers comprise 13 neurons each, followed by an output layer using sigmoid activation for binary classification. The model is further fine-tuned by adjusting the hyperparameters to achieve the optimum

lized as prior knowledge to train the model to predict classification of PEs. The sequence can be represented as $\{F_1, F_2, \ldots, F_n\}$. The weighted vector containing $W_i$ of each data point $S_i$ in feature combination sequence is
results. represented as {(F1 , W1 ), (F2 , W2 ), . . . , (Fn , Wn )}. Next,
we
  extract
  subsequence
  with  k highest weights:
F1 , W1 , F2 , W2 , . . . , Fk , Wk . As discussed earlier,
Attention mechanism the AB connects two parallel attention network/layers of
opposite directions to same output. Each network/layers
As we introduced above, PE header has numerous fea- computes the attn(i,h) for features of a PE instance given as
tures where some features might have a higher impact on input, where i represents features and h represents number of
identifying malicious PEs. Therefore, we employ attention units. One network processes sequence from top to bottom
mechanism to prioritize significance of important features (forwards) and other processes the sequence from bottom to
while penalizing the “noise” fields. The main principle top (backwards). Let xt denote current step of input sequence,
behind proposed Attention Block (AB) is as follows: The h t−1 denote previous hidden state. The next hidden state ht
selection of significant feature rather than examining entire can be calculated as follows:
feature set improves classification. To this end, a fea-
ture vector sequence of length n is extracted from PE h t  f (Axt + W h t−1 ), (2)
header. After processing feature vector at first iteration, sig-
nificant combination of length k is selected based upon where f is a non-linear activation function. A and W repre-
attention threshold. Subsequently, this subsequence is uti- sent weight matrices of current input vector xt and previous


hidden state $h_{t-1}$. At each time step t, the forward pass calculates the hidden state $h_t$ from the previous hidden state $h_{t-1}$ and the new input $x_t$. At the same time, the backward pass computes the hidden state $h_t$ from the future hidden state $h_{t+1}$ and the current input $x_t$. Afterward, the best output among the forward and backward hidden states $h_t$ is selected to obtain a refined vector representation. The first network (set of layers) in the AB uses the sigmoid function, while the other uses the ReLU function. Finally, the best output is applied to the feature importance map by taking the product of the learned parameters with their respective probabilities. As noted above, each layer computes $\mathrm{attn}_{(i,h)}$ for the features of a PE instance given as input, where i indexes the features and h is the number of units. The feature weights for the first layer are learned as in Eq. (3):

\[ a_{(i,h)} = \sigma(x_{(i,h)}, W), \tag{3} \]

where $x_{(i,h)}$ is the input to the layer, $W$ denotes the weights of the layer and $\sigma$ is the sigmoid activation function used by the first attention layer for the feature map $w_{(i,h)}$. The second attention layer is given in Eq. (4):

\[ b_{(i,h)} = \partial(x_{(i,h)}, W). \tag{4} \]

Similarly, $\partial$ denotes the ReLU activation function employed by the second attention layer for the feature map $w_{(i,h)}$, which is obtained by selecting the maximum of Eqs. (3) and (4):

\[ w_{(i,h)} = \max(a_{(i,h)}, b_{(i,h)}). \tag{5} \]

The attention is computed by multiplying $w_{(i,h)}$ with the output of the sigmoid function applied to it:

\[ \mathrm{attn}_{(i,h)} = \frac{\exp(w_{(i,h)})}{\exp(w_{(i,h)}) + 1} \otimes w_{(i,h)}. \tag{6} \]

The fraction in Eq. (6) is the sigmoid of $w_{(i,h)}$. The feature attention-based layer thus learns to put relatively more weight on those features that contribute more to minimizing the validation loss while learning an accurate classification, by applying the sigmoid function to the feature importance map and subsequently multiplying the learned parameters with the respective probabilities. The dataset, based on the validated predicted clusters, is split in a 60:20:20 ratio for classification training, validation and testing, respectively. Models are trained on the predicted-cluster dataset using classification algorithms including Random Forest, Support Vector Machine (SVM), Gradient Boost, Ada Boost, Naive Bayes and PROUD-MAL. Training is performed for 60 epochs (approx. 23,185 iterations), with the input submitted to the network in batches of 32, Adam as the optimizer and a learning rate initialized at 0.001 ($10^{-3}$) with stepwise decay. Binary cross entropy is used for loss calculation over the training data. After training, the model is used to make predictions on the validation dataset. Finally, the empirical validation of the proposed PROUD-MAL approach is performed against standard metrics on the test dataset, which is kept hidden during the training phase.

Experiments and results

Implementation details

This section describes the configuration and performance metrics used in the experiments to classify Windows-based PEs. The runtime environment is a workstation with an Intel Core i5-9500 processor @ 3.0 GHz with 6 cores and 6 logical processors, 32 GB RAM, 20.0 GB of virtual memory with virtualization enabled, an NVIDIA GeForce GTX 1650 graphics card with 4 GB RAM and the Windows 10 Pro 64-bit operating system. In terms of software, both Keras and TensorFlow were employed at the back end to implement the proposed framework. Training is performed for 60 epochs (approx. 23,185 iterations), with the input submitted to the network in batches of 32, Adam as the optimizer and a learning rate initialized at 0.001 ($10^{-3}$) with stepwise decay. Dropout regularization of 0.5 is placed after the fully connected layers, which helps to prevent overfitting. Generally, dropout randomly removes neurons and their connections. Moreover, we adopted the binary cross-entropy loss function, which minimizes the negative log-likelihood between the predictions and the ground-truth data. The momentum helps accelerate ADAM in the relevant direction and mitigates oscillations by adding a fraction of the update vector of the past time step to the current update vector. The accuracy and loss values reported by Keras are visualized using TensorBoard and console logs. The hyperparameters are summarized in Table 2.

Table 2 Hyper-parameters and associated values

Parameters       Values
Batch size       32
Epochs           60
Dropout          0.5
Learning rate    0.001
Loss function    Binary cross entropy
Optimizer        ADAM
Momentum         0.9

Results and discussion

We compared our proposed method with state-of-the-art supervised approaches. Despite this challenging comparison, the utility of our proposed framework is well demonstrated by its high classification accuracy.

Table 3 Quantitative assessment of PROUD-MAL against supervised approaches: results

Model            Accuracy  F1 Score  Precision  AUC    TP    FP    FN    TN
Random Forest    94.27     95.06     98.71      98.78  0.91  0.09  0.03  0.97
SVM              93.01     94.07     97.61      97.40  0.87  0.13  0.02  0.98
Gradient Boost   56.71     72.38     88.28      90.99  0.0   1.0   0.0   1.0
Ada Boost        91.91     93.21     88.28      90.99  0.84  0.16  0.02  0.98
Naive Bayes      56.68     39.05     94.75      95.37  0.99  0.01  0.76  0.24
PROUD-MAL        98.09     98.33     97.30      99.55  0.98  0.02  0.02  0.98

Table 4 Quantitative assessment of PROUD-MAL against an unsupervised approach: results

Model                 Accuracy  F1 Score  Precision  AUC    TP    FP    FN    TN
Anderson et al. [48]  92.89     93.71     96.31      93.35  0.86  0.14  0.03  0.97
PROUD-MAL             98.09     98.33     97.30      99.55  0.98  0.02  0.02  0.98
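The columns reported in Tables 3 and 4 follow the usual confusion-matrix vocabulary. As a minimal sketch of those definitions (not the authors' evaluation code), the following Python function derives accuracy, precision, recall, F1 and the four rates from raw counts. The function name `binary_metrics`, the choice of malicious as the positive class and the example counts are assumptions made for illustration; the text does not state how its TP/FP/FN/TN rates are normalized, so the conventional definitions are used here.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion counts.

    Assumption for this sketch: "malicious" is the positive class.
    Every ratio falls back to 0.0 when its denominator is zero.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # true positive rate
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "tpr": recall,
        "fnr": ratio(fn, tp + fn),       # missed malicious samples
        "fpr": ratio(fp, fp + tn),       # benign samples flagged as malicious
        "tnr": ratio(tn, fp + tn),
    }


# Hypothetical counts, chosen only to exercise the function.
example = binary_metrics(tp=98, fp=2, fn=2, tn=98)
for name, value in example.items():
    print(f"{name:9s} {value:.4f}")
```

With balanced hypothetical counts such as these, accuracy, precision, recall and F1 coincide; on an imbalanced test set they generally differ, which is one reason the tables report several metrics side by side.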


Fig. 6 Validation and training accuracy as well as loss of PROUD-MAL

To assess the model quantitatively, we used the standard metrics of classification accuracy, F1 score, precision, the Receiver Operating Characteristic (ROC) curve, the area under the ROC curve (AUC), and the True Positive (TP), False Positive (FP), True Negative (TN) and False Negative (FN) rates. The accuracy and other results for PROUD-MAL are given in Tables 3 and 4. In our experiments, we took the results of Random Forest (RF) as the baseline for the comparative study, owing to its high classification accuracy. Other classification algorithms were also employed for a detailed comparison, including Support Vector Machine (SVM), Gradient Boost (GB), Ada Boost (AB) and Naive Bayes (NB). The experiments test each classifier against the standard evaluation metrics on the novel dataset, which is kept hidden during the training phase. Tables 3 and 4 show the quantitative results of the comparative analysis of PROUD-MAL and the other classification algorithms, including Random Forest, Support Vector Machine, Gradient Boost, Ada Boost and Naive Bayes, over the collected dataset. As can be seen in Tables 3 and 4, the best performance is achieved by PROUD-MAL, with a classification accuracy of 98.09%. RF, SVM and AB also performed well, achieving classification accuracies of 94.27%, 93.01% and 91.91%, respectively. However, GB and NB achieved the lowest classification accuracies, i.e., 56.71% and 56.68%, respectively.
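The accuracies listed above can be compared either as absolute gaps (percentage points) or as relative gains over each baseline, and the two conventions give quite different numbers. The following sketch computes both from the accuracy values reported in Table 3; the helper name `improvement` is hypothetical and the snippet is only an arithmetic illustration, not part of the evaluation pipeline.

```python
# Accuracy values (%) as reported in Table 3.
ACCURACY = {
    "Random Forest": 94.27,
    "SVM": 93.01,
    "Gradient Boost": 56.71,
    "Ada Boost": 91.91,
    "Naive Bayes": 56.68,
}
PROUD_MAL = 98.09


def improvement(baseline, proposed=PROUD_MAL):
    """Return (absolute gap in percentage points, relative gain in percent)."""
    absolute = proposed - baseline
    relative = 100.0 * absolute / baseline
    return absolute, relative


for name, acc in ACCURACY.items():
    absolute, relative = improvement(acc)
    print(f"{name:15s} +{absolute:5.2f} points   +{relative:5.2f}% relative")
```

For example, the gap to Random Forest is 3.82 percentage points, or about 4.05% in relative terms, while the gap to Naive Bayes is 41.41 points, or about 73.06% relative. Stating which convention is meant avoids ambiguity when such gains are quoted.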
Fig. 7 ROC curve of PROUD-MAL

A detailed analysis of the confusion matrix shows that the proposed PROUD-MAL framework with the Feature Attention-based Neural Network (FANN) achieved the best classification accuracy, 98.09%, against standard evaluation metrics on our indigenously collected novel dataset. RF, SVM and AB also performed well, with classification accuracies of 94.27%, 93.01% and 91.91%, while GB and NB achieved the lowest classification accuracies, i.e., 56.71% and 56.68%, respectively. It is worth mentioning that, in relative terms, PROUD-MAL achieved roughly 4%, 5.46%, 72%, 6.72% and 73% higher accuracy than the classical machine learning classification models Random Forest, SVM, Gradient Boost, Ada Boost and Naive Bayes, respectively. Moreover, our experiments show that FANN demonstrated the highest overall AUC, 99.55%, compared to the other classifiers, which


shows better predictive power and can also provide better sensitivity tuning. We believe this is due to the unsupervised clustering cascaded with a classifier that has embedded attention layers. RF, SVM and NB also performed well, achieving AUCs of 98.78%, 97.40% and 95.37%, respectively. GB and AB achieved a relatively lower AUC, i.e., 90.99%. The comparison with the unsupervised approach of Anderson et al. [48] also showed superior performance: our approach demonstrated a classification accuracy 5.2 percentage points higher (98.09% vs. 92.89%). The detailed comparative assessment with supervised approaches as well as an unsupervised one has shown the utility and significance of the proposed architecture. It is also pertinent to mention that, for classification of an unknown PE by anti-virus software, the training time is not important because a pre-trained neural network can be used. As the test time of the FANN model is less than 21 ms per step, the model is appropriate for subsequent use in real anti-virus software.

Experiments show that the True Positive (TP), False Positive (FP), True Negative (TN) and False Negative (FN) rates for FANN are 0.98, 0.02, 0.98 and 0.02, respectively. The quantitative assessment was conducted over 60 epochs with a batch size of 32. The training and validation accuracy, as well as the training and validation loss, are depicted in Fig. 6. The graphs in Fig. 6 show that PROUD-MAL is trained quite well by around 60 epochs. We also employed an early stopping criterion to discontinue further training at an appropriate stage. It is worth mentioning that, as the number of iterations increases, the loss and the learning rate decrease gradually (later iterations are not shown in the figure because the changes are not significant). As our dataset has 15,457 binaries comprising 8775 (17 GB) malicious and 6681 (8 GB) benign samples, we also calculated the area under the ROC curve (AUC), reported in Tables 3 and 4, which is a widely used performance metric for imbalanced datasets. A visual inspection of the Receiver Operating Characteristic (ROC) curve (Fig. 7) shows that our framework performs better than other state-of-the-art supervised approaches. PROUD-MAL achieved an ROC score of 0.99, with a small discrepancy of 0.01. The visualization of the cluster prediction is generated by applying t-SNE to the dataset and is depicted in Fig. 8. The blue dots represent the malicious binaries and the yellow marks represent the benign PEs. The plot shows minor overlap between the malicious and benign samples. There were 38 features in the vector space. By applying the attention mechanism, it is revealed that features with numerical values, e.g., section entropy, size of sections and image base, were given more weight by the AB. On the other hand, features that represent either a unique numerical value or a fixed-length value with a specific format, e.g., MD5 and checksum, are given relatively less weight than ordinary numerical values such as section entropy. These attributes are nevertheless given more consideration than features with string values, e.g., machine, characteristics and compiler. The proposed scheme of using feature subsequence combinations selected by the attention mechanism resulted in a more refined feature representation. Subsequently, the quantitative results of the comparative assessment have demonstrated the utility of the attention mechanism for unsupervised classification of PEs using static features.

Fig. 8 t-SNE cluster visualization of PEs

Conclusion and future direction

We have proposed and presented a progressive deep unsupervised malware classification framework, PROUD-MAL, with a deep neural network architecture that uses dense layers and an attention block for binary classification of Windows-based PEs based on features extracted statically from the header. Our proposed feature attention mechanism-based neural network for malware classification learns to put relatively more weight on those features that contributed more to


minimize the validation loss while learning an accurate classification. We also collected a novel real-time malware dataset by deploying low- and high-interaction honeypots, as well as an endpoint security solution, on an enterprise organizational computer network for validation of the proposed framework. This indigenously collected dataset is novel and has been made public for the research community. We also look forward to enlarging the existing volume of the novel dataset. The quantitative assessment shows that the proposed PROUD-MAL framework achieved an accuracy of 98.09%, with better quantitative performance on standard evaluation metrics on the indigenously collected novel dataset, and outperformed other conventional machine learning algorithms. As a way forward, our framework can be enhanced to explore behavioral analysis based on API calls [49] using reinforcement learning [50] for malware analysis. This includes transforming PEs into malware images and performing entropy-based semantic segmentation of those images. This could help malware analysts use malware visualization to analyze zero-day malware samples more effectively. The scope of future work may also include Non-Portable Executable (NPE) files.

Acknowledgements The authors would like to express their gratitude for the research grant of the Higher Education Commission (HEC) of Pakistan under the International Research Support Initiative Program (IRSIP) and the institutional support from the Department of Computer Science, University of Warwick, Coventry, United Kingdom (UK).

Declarations

Conflict of interest The authors declared that they have no conflict of interest.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Tang Y, Xiao B, Lu X (2011) Signature tree generation for polymorphic worms. IEEE Trans Comput 60:565–579. https://doi.org/10.1109/TC.2010.130
2. Internet security threat report (2019) https://www.symantec.com/content/dam/symantec/docs/reports/istr-24-2019-en.pdf
3. Gandotra E, Bansal D, Sofat S (2014) Malware analysis and classification: a survey. J Inf Secur 5:56–64
4. Provos N (2004) A virtual honeypot framework. In: Proceedings of the 13th conference on USENIX Security Symposium
5. Sung AH, Xu J, Chavez P, Mukkamala S (2004) Static analyzer of vicious executables (SAVE). In: Proceedings of the 20th annual computer security applications conference
6. Wuchner T, Cisłak A, Ochoa M, Pretschner A (2020) Leveraging compression-based graph mining for behavior-based malware detection. IEEE Trans Depend Secure Comput 16:1
7. Ghafir I, Hammoudeh M, Prenosil V (2018) Defending against the advanced persistent threat: Detection of disguised executable files. PeerJ Preprints 6:e2998. https://doi.org/10.7287/peerj.preprints.2998v2
8. Alazab M, Layton R, Venkataraman S, Watters P (2010) Malware detection based on structural and behavioral features of API calls, pg 1–10. In: International cyber resilience conference 2010, Perth
9. Devesa J, Santos I, Cantero X, Penya YK, Bringas PG (2010) Automatic behavior-based analysis and classification system for malware detection. In ICEIS 2:395–399
10. Islam R, Tian R, Batten LM, Versteeg S (2013) Classification of malware based on integrated static and dynamic features. J Netw Comput Appl 36(2):646–656
11. Egele M, Scholte T, Kirda E, Kruegel C (2012) A survey on automated dynamic malware-analysis techniques and tools. ACM Comput Surv 44:2
12. Christodorescu M, Jha S (2003) Static analysis of executables to detect malicious patterns. In: Proceedings of the 12th conference on USENIX security symposium, vol 12, p 12
13. Ye Y, Wang D, Li T, Ye D (2007) IMDS: intelligent malware detection system. In: Proceedings of ACM international conference on knowledge discovery and data mining (SIGKDD), pp 1043–1047
14. Gandotra E, Bansal D, Sofat S (2014) Malware analysis and classification: a survey. J Inf Secur 2:56–64
15. Schultz MG (2001) Data mining methods for detection of new malicious executables. In: Proceedings of the IEEE symp. on security and privacy, pp 38–49
16. Shabtai A, Moskovitch R, Elovici Y, Glezer C (2009) Detection of malicious code by applying machine learning classifiers on static features: a state-of-the-art survey. Inf Secur Tech Rep 14:16–29
17. Eskandari M, Khorshidpour Z, Hashemi S (2013) Hdm-analyser: a hybrid analysis approach based on data mining techniques for malware detection. J Comput Virol Hack Tech 9(2):77–93
18. Khodamoradi P, Fazlali M, Mardukhi F, Nosrati M (2015) Heuristic metamorphic malware detection based on statistics of assembly instructions using classification algorithms. In: 18th CSI international symposium on computer architecture and digital systems (CADS), IEEE, pp 1–6
19. Raff E, Sylvester J, Nicholas C (2017) Learning the PE header, malware detection with minimal domain knowledge. In: Proc. 10th ACM workshop artificial intelligence secure, New York, ACM, pp 121–132
20. Belaoued M (2015) A real-time PE-malware detection system based on CHI-Square test and pe-file features. In: IFIP international conference on computer science and its applications CIIA 2015: computer science and its applications, pp 416–425
21. Pietrek M (1994) Peering inside the PE: a tour of the Win32 portable executable file format
22. Rossow C et al (2012) Prudent practices for designing malware experiments: status quo and outlook. In: Proc. IEEE symp. secur. privacy (SP), pp 65–79
23. Tobiyama S, Yamaguchi Y, Shimada H, Ikuse T, Yagi T (2016) Malware detection with deep neural network using process behavior. In: Proc. IEEE 40th annu. comput. softw. appl. conf. (COMPSAC), vol 2, pp 577–582


24. Shibahara T, Yagi T, Akiyama M, Chiba D, Yada T (2016) Efficient dynamic malware analysis based on network behavior using deep learning. In: Proc. IEEE Global Commun. Conf. (GLOBECOM), pp 1–7
25. Mutz D, Valeur F, Vigna G (2006) Anomalous system call detection. ACM Trans Inf Syst Secur 9(1):61–93
26. Zhauniarovich Y, Russello G, Conti M, Crispo B, Fernandes E (2014) Moses: supporting and enforcing security profiles on smartphones. Depend Secure Comput IEEE Trans 11(3):211–223
27. Raff E, Nicholas C (2017) An alternative to NCD for large sequences, Lempel-Ziv Jaccard distance. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining (ACM, 2017), pp 1007–1015
28. Rastogi V, Qu Z, McClurg J, Cao Y, Chen Y (2015) Uranine: real-time privacy leakage monitoring without system modification for android. Springer Int. Pub, Cham, pp 256–276
29. Namanya AP, Awan IU, Disso JP, Younas M (2019) Similarity hash-based scoring of portable executable files for efficient malware detection in IoT. Fut Gen Comput Syst 110:824–832. https://doi.org/10.1016/j.future.2019.04.044
30. Merkel R (2010) Statistical detection of malicious PE executables for fast offline analysis. In: Springer, Berlin, Heidelberg, ISBN 978-3-642-13241-4, pp 93–105. https://doi.org/10.1007/978-3-642-132414_10
31. Catak FO, Yazı AF, Elezaj O, Ahmed J (2020) Deep learning based Sequential model for malware analysis using Windows exe API Calls. PeerJ Comput Sci 6:e285. https://doi.org/10.7717/peerj-cs.285
32. Rush AM, Harvard SEAS, Chopra S, Weston J (2015) A neural attention model for sentence summarization. In: Proceedings of the international conference on empirical methods in natural language processing, Lisbon, Portugal
33. Collberg C, Thomborson C (2002) Watermarking, tamperproofing, and obfuscation - tools for software protection. IEEE Trans Software Eng 28(8):735–746
34. Koniaris I, Papadimitriou G, Nicopolitidis P, Obaidat M (2014) Honeypots deployment for the analysis and visualization of malware activity and malicious connections. In: 2014 IEEE international conference on communications (ICC), Sydney, pp 1819–1824. https://doi.org/10.1109/ICC.2014.6883587
35. Zhou Y, Jiang X (2012) Dissecting android malware: characterization and evolution. In IEEE symposium on security and privacy, pp 95–109
36. Wan M, Shang W, Zeng P (2017) Double behavior characteristics for one-class classification anomaly detection in networked control systems. IEEE Trans Inf Forens Secur 12(12):3011–3023
37. Hearst M, Dumais S, Osman E, Platt J, Scholkopf B (1998) Support vector machines. IEEE Intell Syst Appl 13(4):18–28. https://doi.org/10.1109/5254.708428
38. Seiffert C, Khoshgoftaar TM, Van Hulse J, Napolitano A (2010) RUSBoost: a hybrid approach to alleviating class imbalance. IEEE Trans Syst Man Cybern A Syst Hum 40(1):185–197
39. Caruana A et al (2006) An empirical comparison of supervised learning algorithms. In: ICML '06 proceedings of the 23rd international conference on machine learning, pp 161–168
40. Kanungo T, Mount D, Netanyahu N, Piatko C, Silverman R, Wu A (2002) An efficient K-means clustering algorithm analysis and implementation. IEEE Trans Pattern Anal Mach Intell 24:881–892. https://doi.org/10.1109/TPAMI.2002.1017616
41. Boutsidis C, Mahoney M, Drineas P (2009) Unsupervised feature selection for the k-means clustering problem. In: Advances in neural information processing systems 22-proceedings of the conference, pp 153–161
42. Gandotra E, Bansal D, Sofat S (2016) Zero-day malware detection. In: Sixth international symposium on embedded computing and system design (IEEE, 2016), pp 171–175
43. Ng CK, Jiang F, Zhang LY, Zhou W (2019) Static malware clustering using enhanced deep embedding method. Concurr Computat Pract Exper 2019:e5234. https://doi.org/10.1002/cpe.5234
44. Algaith A, Gashi I, Sobesto B, Cukier M, Haxhijaha S, Bajrami G (2016) Comparing detection capabilities of antivirus products: an empirical study with different versions of products from the same vendors. In: 2016 46th Annual IEEE/IFIP international conference on dependable systems and networks workshop (DSN-W), Toulouse, pp 48–53. https://doi.org/10.1109/DSN-W.2016.45
45. Kozachok AV, Kozachok VI (2018) Construction and evaluation of the new heuristic malware detection mechanism based on executable files static analysis. J Comput Virol Hack Tech 14:225–231. https://doi.org/10.1007/s11416-017-0309-3
46. Hassen M, Carvalho MM, Chan PK (2017) Malware classification using static after validation analysis-based features. In: 2017 IEEE symposium series on computational intelligence (SSCI), Honolulu, pp 1–7. https://doi.org/10.1109/SSCI.2017.8285426
47. Vadrevu P, Perdisci R (2016) MAXS: scaling malware execution with sequential multi-hypothesis testing. In: Proceedings of the 11th ACM on Asia conference on computer and communications security (ACM, 2016), pp 771–782
48. Anderson HS, Kharkar A, Filar B, Evans D, Roth P (2018) Learning to evade static PE machine learning malware models via reinforcement learning. http://arxiv.org/abs/1801.08917
49. Wang Y, Stokes J, Marinescu M (2020) Actor critic deep reinforcement learning for neural malware control. Proc AAAI Conf Artif Intell 34(01):1005–1012
50. Wu C, Shi J, Yang Y, Li W (2018) Enhancing machine learning based malware detection model by reinforcement learning. In: Proceedings of the 8th international conference on communication and network security (ICCNS 2018), pp 74–78. https://doi.org/10.1145/3290480.3290494

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
