Unified Approach for Android Malware Detection: Feature Combination and Ensemble Classifier

Dr.V. Jyothsna1*, Kavya Priya Dasari2, Sravani Inuguru3, Venkat Bharath Reddy
Gowni4, Jaya Teja Reddy Kudumula5, K Srilakshmi6
1 Associate Prof., Dept of IT, Sree Vidyanikethan Engineering College, Tirupathi, India
*jyothsna1684@gmail.com
2,3,4,5 UG Scholar, Dept of IT, Sree Vidyanikethan Engineering College, Tirupathi, India
6Lecturer, Sri Padmavathi Women’s Degree & PG College, Tirupathi, India

dasaripriya653@gmail.com

Abstract. As the smartphone market has expanded enormously, particularly in the Android
environment, the need for robust anti-malware security has become increasingly apparent.
By applying machine learning to large datasets, the proposed model identifies subtle
malicious trends. This study examines the importance of feature coexistence in malware
detection. The methodology analyzes coexistence patterns crucial for effective malware
detection and develops a dataset that integrates these key features. Addressing data
imbalance with the SMOTE technique enhances dataset representativeness. Feature selection
via the Extra Trees Classifier optimizes pattern detection, improving classification
precision. The methodology detects Android malware with high accuracy: the voting
classifier (with MLP, CatBoost, and XGBoost) trained on the above dataset achieved 98%
accuracy. This work advances efficient and adaptable malware detection techniques tailored
for the evolving Android ecosystem.

Keywords: Android, machine learning, malware, anomaly detection, feature enhancement.

1 Introduction

The ever-evolving digital landscape poses a persistent challenge to cybersecurity defenses
due to the dynamic nature of malware. This study analyzes evolving malware trends and the
need for adaptive defensive strategies.
Malware, ranging from viruses to trojans, continually advances to compromise systems and
networks, highlighting the critical need for robust security measures. With DataProt
reporting 560,000 new malware types daily [1] and cybercrime costs projected to exceed
$10.5 trillion by 2025 [2], the urgency for enhanced security is evident. The Android
ecosystem, with over 3.43 million apps on the Google Play Store [3], faces security
challenges exacerbated by third-party app markets lacking stringent monitoring.

© The Author(s) 2024

K. R. Madhavi et al. (eds.), Proceedings of the International Conference on Computational Innovations and
Emerging Trends (ICCIET 2024), Advances in Computer Science Research 112,
https://doi.org/10.2991/978-94-6463-471-6_47

Statistical insights underscore the vast scope of the Android app ecosystem and the
prevalence of malicious applications [4], emphasizing the need for innovative security
approaches. This research aims to develop adaptive ML models capable of identifying
subtle malicious trends and enhancing detection accuracy to safeguard user data and
privacy.
Key enhancements in this paper include dataset refinement, addressing class imbalance
using SMOTE, enriching datasets with dynamic features from frequent patterns, and
employing feature selection with the Extra Trees Classifier. The study also evaluates the
performance of diverse ML models individually and as an ensemble in a Voting Classifier,
achieving high accuracy by incorporating binary coexistence features derived from
permission attributes.

2 Literature Review

M. E. Z. N. Kambar et al. [6] highlighted the proliferation of mobile applications due to
widespread smartphone usage and high-speed Internet access. Despite security enhancements
in iOS and Android, attacks targeting mobile applications continue to rise. Experts employ
various techniques for detecting mobile malware, either preemptively or through network
traffic analysis, to mitigate associated risks. Their survey offers insights into
different types of mobile malware and their implications.

A. Alzubi et al. [7] introduce a novel approach to Android malware detection by combining
the Harris Hawks Optimization (HHO) algorithm with a Support Vector Machine (SVM)
classifier. This method optimizes feature weighting and SVM hyperparameters, enhancing
detection performance. Through rigorous testing on CIC-Manal2017 datasets, the proposed
technique demonstrates effectiveness in evaluating feature importance and exploring
correlations with malware attack types.

M. Li, Y. Wu, et al. [8] note that because Android is open-source, it is increasingly
vulnerable to malware attacks, so efficient detection is essential. Modern approaches rely
heavily on machine learning, particularly in the classification stage of Android malware
detection. Wrapper-based feature selection deserves examination because conventional
ranking-based algorithms fail to consider feature relationships. Wrapper-based methods,
however, can take a long time to evaluate the valid feature subsets when working with a
large number of Android features.

Yadav P et al. [9] present a two-step deep learning method that uses image representations
of Android DEX files for Android malware identification and categorization. The system uses
EfficientNetB0 to extract features from color images of malware. A stacking classifier
that arranges logistic regression, random forest, and linear SVM algorithms into base-level
and meta-level classifiers achieves 100% accuracy in binary classification and 92.9% in
5-class classification. The proposed strategy outperforms 26 state-of-the-art pre-trained
CNN models and large-scale learning classifiers on all performance metrics.

N. Sharma and A. L. Sangal [12] address the surge in Android malware threats on
smartphones, employing machine learning with the CICInvesAndMal2019 dataset. Using Android
permissions and intents as features, Principal Component Analysis aids in feature
selection. Among the machine learning models tested, Random Forest proves the most
effective, achieving a 99.7% success rate in binary classification and 97.30% for the
ransomware category in category classification.

Y. Kanchhal and S. Murugaanandam [13] note that Android has remained the dominant mobile
operating system worldwide for more than a decade. However, its widespread usage has also
attracted the attention of cybercriminals and malware developers, posing significant
security threats. Malware presents a universal challenge across all operating systems,
including Android. Because Android supports app installations from sources beyond the
Google Play Store, there is an increased risk of malware infiltration alongside legitimate
apps.

Jyothsna V. et al. [15] observe that Internet technologies now reach many facets of daily
life, including banking, public networking, online commerce, and electronic trading. The
exponential expansion of these services raises network traffic, increasing the possibility
of network attacks. Over the past decades, researchers have proposed several approaches to
these problems. The research notes that machine learning, artificial neural networks, and
meta-heuristic approaches have been highly regarded for their ability to handle security
attacks. These approaches extract knowledge from the characteristics of the requests made.
Network traffic volume is growing exponentially, displaying diverse behavior and feature
value deviation. As a result, transaction associability and feature values must be
considered.

Jyothsna V. et al. [16] Using neuroimaging data, deep neural networks can accurately
estimate the chronological age of healthy persons. Predicted brain age has the potential
to be used as a biomarker to detect illnesses associated with aging. The suggested method
combines a Convolutional Neural Network (CNN), a deep learning cascade network, with a
Support Vector Machine (SVM), a machine learning algorithm. Trained on brain MRI scans,
these algorithms identify three types of patients from sorted images and established ages:
Normal (i.e., not impacted by any disease), Alzheimer's disease (AD), and Mild Cognitive
Impairment (MCI). The MRI image dataset is used to train the CNN and SVM methods for age
estimation and classification.

3 Proposed Model

The methodology encompasses data collection from the Drebin, Malgenome, and
CIC_MALDROID2020 datasets, class imbalance handling using SMOTE, and the creation of the
"lev2" dataset. Feature extraction adopts a coexistence-based strategy, while feature
selection optimizes efficiency through the Extra Trees classifier. An ensemble technique,
the Voting Classifier, integrates MLP, CatBoost, and XGBoost models, utilizing a "soft
voting" approach for enhanced resilience. Comprehensive assessment metrics ensure a
thorough evaluation of the ensemble model's performance, contributing to a nuanced defense
against evolving Android malware threats. Figure 1 illustrates the system architecture
overview.

Fig. 1. Methodology Workflow

3.1 Techniques and Algorithms

SMOTE. SMOTE (Synthetic Minority Over-sampling Technique) addresses class imbalance by
generating synthetic instances of the minority class. It balances the dataset distribution
by creating artificial examples interpolated between existing minority instances and their
nearest neighbors, reducing the bias toward the majority class during model training.
Integrating SMOTE in dataset preprocessing supports subsequent analyses, improving overall
model resilience and efficacy.

Pseudocode.

function SMOTE(samples, N1, K):
    synthetic_samples = []
    for i = 1 to N1:
        random_sample = randomly_select(samples)
        neighbors = find_K_nearest_neighbors(random_sample, samples, K)
        synthetic_sample = random_sample + random_uniform(0, 1) *
                           (randomly_select(neighbors) - random_sample)
        synthetic_samples.append(synthetic_sample)
    return synthetic_samples
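The pseudocode above can be turned into a small runnable sketch. The version below is a minimal illustration using only the Python standard library, not the implementation used in the study; the function name `smote`, the `seed` parameter, and the toy minority samples are assumptions made for the example.

```python
import math
import random


def smote(samples, n_new, k, seed=0):
    """Generate n_new synthetic points from a list of minority-class samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # k nearest neighbours of `base` by Euclidean distance, excluding itself
        others = [s for s in samples if s is not base]
        neighbours = sorted(others, key=lambda s: math.dist(base, s))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # uniform in [0, 1)
        # Interpolate along the segment from `base` towards the chosen neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class samples with two numeric features
minority = [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [2.0, 2.0]]
new_points = smote(minority, n_new=6, k=2)
print(new_points)
```

Every synthetic point lies on a line segment between two existing minority samples, so it stays inside the region the minority class already occupies.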

Frequent Itemset Mining using FP-Growth. The FP-Growth algorithm plays a crucial role in
mining frequent itemsets to enrich the dataset and unveil significant patterns. It
efficiently identifies recurring item sets from transactional data, aiding in
understanding associations among features. The balanced data obtained through SMOTE lays a
robust groundwork for FP-Growth to extract frequent item sets. This process entails
identifying frequently occurring feature combinations, offering valuable insights for
further analysis and modeling endeavors.
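To make "frequent item set" concrete, the sketch below counts support by brute force over all feature combinations of size two and three. This is not FP-Growth, which avoids enumerating candidates by building a prefix tree, but for a given minimum support both yield the same frequent item sets of those sizes. The permission names, the `min_support` value, and the function name are illustrative assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: support} for itemsets of size 2..max_size."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(2, max_size + 1):
        for candidate in combinations(items, size):
            # Support = fraction of apps that contain every item in the candidate
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            if support >= min_support:
                result[candidate] = support
    return result


# Each app is represented by the set of features it uses (hypothetical data)
apps = [
    {"SEND_SMS", "READ_CONTACTS", "INTERNET"},
    {"SEND_SMS", "READ_CONTACTS"},
    {"INTERNET"},
    {"SEND_SMS", "READ_CONTACTS", "INTERNET"},
]
frequent = frequent_itemsets(apps, min_support=0.5)
for itemset, support in frequent.items():
    print(itemset, support)
```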

Pseudocode.
# w: the frequent item sets; dataframe: binary feature table with a 'class' column
dataframe_new = empty DataFrame
for each (index, item_set) in enumeration of w:
    conditions = None
    for each feature in item_set:
        conditions = (dataframe[feature] == 1) if conditions is None
                     else conditions & (dataframe[feature] == 1)
    dataframe_new['coexistence_' + str(index)] = 1 where conditions holds, else 0
dataframe_new['class'] = dataframe['class']
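A dependency-free sketch of the same construction is shown below. It represents each application as a dictionary rather than a DataFrame row, which is an assumption made to keep the example self-contained; the feature names and the two item sets are hypothetical.

```python
def add_coexistence_features(rows, item_sets):
    """Build one binary column per item set: 1 if all of its features are 1 in the row."""
    new_rows = []
    for row in rows:
        new_row = {
            f"coexistence_{index}": int(all(row[f] == 1 for f in item_set))
            for index, item_set in enumerate(item_sets)
        }
        new_row["class"] = row["class"]  # carry the label over unchanged
        new_rows.append(new_row)
    return new_rows


rows = [
    {"SEND_SMS": 1, "READ_CONTACTS": 1, "INTERNET": 0, "class": 1},
    {"SEND_SMS": 1, "READ_CONTACTS": 0, "INTERNET": 1, "class": 0},
]
item_sets = [("SEND_SMS", "READ_CONTACTS"), ("SEND_SMS", "INTERNET")]
lev2 = add_coexistence_features(rows, item_sets)
print(lev2)
```

Each new column records whether a whole combination of features is present, so a model trained on it sees co-occurrence directly instead of having to learn it from the individual features.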

Extra Trees Classifier. The Extra Trees (extremely randomized trees) classifier, a
tree-ensemble method closely related to Random Forest, handles high-dimensional data well
and supports feature selection through its feature importance scores. It is a
meta-estimator that fits randomized decision trees on subsets of the dataset and averages
their predictions to improve accuracy and limit overfitting. Unlike Random Forest, which
searches for the optimal threshold on each candidate feature, Extra Trees draws a random
threshold for each candidate feature and keeps the best of these random splits, adding a
layer of randomization.
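The randomized split rule can be sketched in a few lines. The code below is an illustrative simplification for a single tree node with binary labels, assuming Gini impurity as the split criterion; the function names and the toy data are not from the paper.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def extra_trees_split(X, y, n_candidates, rng):
    """Pick the best of several randomly drawn (feature, threshold) splits."""
    n_features = len(X[0])
    best = None
    for feature in rng.sample(range(n_features), n_candidates):
        values = [row[feature] for row in X]
        lo, hi = min(values), max(values)
        if lo == hi:
            continue  # constant feature, no valid split
        threshold = rng.uniform(lo, hi)  # random cut point, not an optimised one
        left = [label for row, label in zip(X, y) if row[feature] <= threshold]
        right = [label for row, label in zip(X, y) if row[feature] > threshold]
        # Weighted impurity of the two children; lower is better
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if best is None or score < best[0]:
            best = (score, feature, threshold)
    return best


# Feature 0 separates the classes perfectly; feature 1 carries no information
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, 0, 0]
score, feature, threshold = extra_trees_split(X, y, n_candidates=2, rng=random.Random(0))
print(score, feature, threshold)
```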

Voting Classifier. In this study, the Voting Classifier model was employed for Android
malware detection, combining Multi-Layer Perceptron (MLP), XGBoost, and CatBoost
classifiers.
In "soft voting", the final prediction is not based on a majority vote over predicted
labels but on the (optionally weighted) average of the class probabilities predicted by
each base model. The Voting Classifier therefore takes into account how confident each
model is in its prediction and combines the models accordingly. This approach is
particularly advantageous when working with models that provide probability estimates,
such as MLP, CatBoost, and XGBoost. By integrating these models within the Voting
Classifier, the ensemble aims to capitalize on their complementary strengths. MLP, a
neural network architecture, is adept at capturing intricate data relationships, while
CatBoost, a gradient-boosting algorithm, excels in handling categorical features and
mitigating overfitting.
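The difference between soft and hard voting shows up when one model is much more confident than the others. The sketch below averages hypothetical class probabilities from three models; the probability values and the function name `soft_vote` are assumptions for illustration only.

```python
def soft_vote(probabilities, weights=None):
    """Average per-model class probabilities; return (label, averaged probabilities)."""
    n_models = len(probabilities)
    weights = weights or [1.0] * n_models
    total = sum(weights)
    n_classes = len(probabilities[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: averaged[c])
    return label, averaged


# Hypothetical [P(benign), P(malware)] outputs for one application
mlp, catboost, xgboost = [0.55, 0.45], [0.55, 0.45], [0.05, 0.95]
label, averaged = soft_vote([mlp, catboost, xgboost])
print(label, averaged)
```

Here two of the three models lean slightly towards benign, so a majority vote over labels would return class 0, while the averaged probabilities favour class 1 because the third model is strongly confident.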

3.2 Performance Evaluation Measures

Performance evaluation measures are essential for evaluating malware detection models.
Metrics like Precision, Recall, and F1 Score offer valuable insights into the model's
performance. The confusion matrix provides a comprehensive overview of the model's
predictions compared to the actual ground truth.
In binary classification, True Positives (TP) are instances correctly identified as
positive (e.g., correctly identifying malware), False Positives (FP) are instances
incorrectly identified as positive, True Negatives (TN) are instances correctly identified
as negative, and False Negatives (FN) are instances incorrectly identified as negative.
Together, these counts show how accurately the model distinguishes between the positive
and negative classes.

Accuracy. The proportion of all predictions that are correct. Although class imbalance in
the dataset may limit this metric's applicability, it offers a broad indication of model
performance.

Recall (Sensitivity or True Positive Rate). The proportion of actual positives that the
model correctly predicts as positive; it measures the model's ability to detect positive
instances.

Precision. The proportion of positive predictions that are correct; it quantifies the
degree to which the model's positive predictions can be trusted.

F1 Score. The harmonic mean of precision and recall. It provides a balanced measure of
the two.
\[ \text{Accuracy} = \frac{TP + TN}{\text{Total Predictions}} \]
\[ \text{Recall} = \frac{TP}{TP + FN} \]
\[ \text{Precision} = \frac{TP}{TP + FP} \]
\[ \text{F1 Score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \]

An ideal malware detection model should balance recall and precision: it should minimize
the number of false positives while still detecting most malicious applications.
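As a worked example of these formulas, the sketch below computes the four measures from confusion-matrix counts. The counts are the Voting Classifier values reported in Table 1; the function name is an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Voting Classifier counts as reported in Table 1
voting = classification_metrics(tp=1225, fp=17, tn=1257, fn=21)
for name, value in voting.items():
    print(f"{name}: {value:.4f}")
```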

4 Proposed Model

4.1 Dataset

The Malgenome dataset encompasses 3,798 unique programs observed from 2012 to the end of
2015. It comprises 1,260 programs attributed to 49 distinct malware families and 2,538
clean applications. Initially, 181 features were extracted from this dataset, including
109 permissions and 72 APIs. Three distinct subsets were delineated from the Malgenome
dataset: “API + Permission combination”, “only Permission Malgenome”, and “only API
Malgenome”. This dataset contains a total of 3799 rows and 182 columns.

4.2 Data Preprocessing

Class Imbalance Handling Strategy. In the API + Permission combination subset of the
Malgenome dataset, class imbalance was addressed by creating synthetic instances of the
minority class (class 1) using SMOTE to match the number of instances in the majority
class (class 0). This balancing strategy ensures that the machine learning models are not
biased toward the majority class and can generalize effectively to both classes.
This approach contributes to enhancing the performance of machine learning models and
improving the accuracy of malware detection. After addressing the class imbalance, the
number of instances for both class 0 and class 1 is 1260.

Coexistence features. After mitigating class imbalance using the SMOTE technique, the next
step involved extracting frequent item sets using the FP-Growth algorithm. This process
identified recurring patterns in the dataset, crucial for understanding coexistence
relationships among features. Subsequently, a new dataset named "lev2" was created,
capturing the bi, tri, and ternary features derived from the extracted frequent pattern
item sets.

Feature Selection and Model Training. The top-performing features from the lev2
dataset are selected using the Extra Trees Classifier, resulting in 795 relevant features
out of 6487. This strategic integration enhances malware detection models' efficacy by
prioritizing key data aspects.
Three individual models are trained on these selected features: MLP with a single
hidden layer of 100 neurons and 1000 iterations, CatBoost with 100 iterations, depth of
8, and a learning rate of 0.1 using the MultiClass loss function, and XGBoost with 100
estimators, maximum depth of 8, and a learning rate of 0.1.
Furthermore, a Voting Classifier is developed by combining MLP, CatBoost, and
XGBoost using a soft voting strategy based on confidence levels. This ensemble model
leverages each model's strengths to improve overall predictive performance.

Performance Comparison. The ensemble method demonstrates positive synergistic effects,
improving overall malware detection performance compared to individual algorithms.
Figure 2 displays the voting classifier's confusion matrix. Table 1 and Figure 3 show the
classifiers' overall performance comparison.

The malware detection methodology begins with a dataset of 2520 entries. Several
classifiers, including MLP, CatBoost, and a Voting Classifier, are employed for a thorough
analysis. For 5-fold cross-validation, the dataset is partitioned into folds of roughly
comparable size, each containing approximately 504 records. During each iteration, 2016
records are used for training, while one fold (504 records) is reserved for testing.
Figure 4 depicts the average accuracy across all folds and presents accuracy scores for
each fold individually. The malware detection system employed a probability threshold of
0.5 to determine the presence or absence of malware in instances.
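The fold arithmetic can be checked with a short sketch. The split below shuffles the indices with a fixed seed before partitioning; whether the study shuffled or stratified its folds is not stated, so both the shuffling and the function name are assumptions.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=2520, k=5))
for train, test in folds:
    print(len(train), len(test))
```

Each record appears in exactly one test fold, so every instance is used for testing once and for training four times.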

Fig. 2. Overall Confusion Matrix for Voting Classifier

Model                MLP      CatBoost   XGBoost   Voting Classifier
True Positive        1245     1220       1205      1225
False Positive       20       32         50        17
True Negative        1227     1232       1210      1257
False Negative       28       48         55        21
Precision            98.7%    97.4%      98.6%     98.6%
Recall/Sensitivity   97.7%    96.21%     97.7%     98.3%
Accuracy             98%      97.3%      97.8%     98.4%
F1 score             98.04%   96.80%     97%       98.4%

Table 1. Performance Matrix

Fig. 3. Overall Performance Comparison of Classifiers


Fig. 4. Model Accuracy Across Folds

5 Conclusion

The proposed model advances Android malware detection, showing high accuracy and
adaptability within the Android ecosystem. Utilizing extensive datasets like Drebin and
Malgenome, coupled with optimization techniques such as SMOTE for handling imbalanced
data, has markedly improved the model's effectiveness.
A standout feature of the model is its incorporation of ensemble techniques, notably the
voting classifier, which capitalizes on the strengths of MLP, XGBoost, and CatBoost,
resulting in higher accuracy in identifying malware than the individual models. Moreover,
the model's versatility within the dynamic Android environment is a pivotal advantage,
providing a sturdy defense against evolving malicious strategies. By combining
feature-rich datasets with ensemble strategies, the approach not only delivers high
accuracy but also lays a foundation for ongoing advancements in Android malware detection.

5.1 Future Work

Future work for this paper involves evaluating the coexistence approach with dynamic
features and expanding the analysis of API and permission combinations to cover a wider
range of malware datasets. Additionally, exploring advanced optimization techniques for
machine learning models in Android malware detection is a priority. This dual focus aims
to maintain the approach's effectiveness while adapting it to diverse malware scenarios,
ensuring robustness and accuracy in identifying malicious trends across various datasets.
Furthermore, a shift towards dynamic malware analysis and the use of versatile datasets
beyond API and permission combinations are recommended for a more comprehensive analysis
of Android malware. These avenues of future work aim to contribute to the continuous
advancement of Android malware detection techniques and enhance cybersecurity measures
for mobile users globally.

References
[1] B. Jovanovic, A Not-So-Common Cold: Malware Statistics in 2023, May 2023, [online] Available:
https://dataprot.net/statistics/malware-statistics/.
[2] Cyber Security Statistics The Ultimate List Of Stats Data & Trends for 2023, May 2023, [online]
Available: https://purplesec.us/resources/cyber-security-statistics/.
[3] M. Iqbal. (2022). App Download Data. Accessed: Oct. 30, 2022. [Online]. Available:
https://www.businessofapps.com/data/app-statistics/
[4] K. Allix, T. Bissyand, Q. Jarome, J. Klein, R. State, and Y. L. Traon, ‘‘Empirical assessment of machine
learning-based malware detectors for android,’’ Empirical Softw. Eng., vol. 21, pp. 183–211, Jun. 2016
[5] Esraa Odat; Qussai M. Yaseen, “A Novel MachineLearning Approach for Android Malware Detection
Based on the Co-Existence of Features" Feb 2023, doi: 10.1109/ACCESS.2023.3244656
[6] M. E. Z. N. Kambar, A. Esmaeilzadeh, Y. Kim, and K. Taghva, ‘‘A survey on mobile malware detection
methods using machine learning,’’ in Proc. IEEE 12th Annu.Comput. Commun. Workshop Conf.
(CCWC), Jan. 2022, pp. 0215–0221, doi: 10.1109/CCWC54503.2022.9720753.
[7] O. A. Alzubi, J. A. Alzubi, A. M. Al-Zoubi, M. A. Hassonah, and U. Kose, ‘‘An efficient malware
detection approach with feature weighting based on Harrishawks optimization,’’ Cluster Comput., vol.
25, no. 4, pp. 2369–2387, Aug. 2022, doi: 10.1007/s10586-021-03459-1.
[8] Y. Wu, M. Li, Q. Zeng, T. Yang, J. Wang, Z. Fang, and L. Cheng, ‘‘DroidRL: Feature selection for
Android malware detection with reinforcement learning,’’ Comput. Secur., vol. 128, May 2023, Art.
no. 103126, doi: 10.1016/j.cose.2023.103126.
[9] Avanija, J., K. E. Kumar, Ch Usha Kumari, G. Naga Jyothi, K. Srujan Raju, and K. Reddy Madhavi.
"Enhancing Network Forensic and Deep Learning Mechanism for Internet of Things Networks."
(2023).
[10] J. Kim, Y. Ban, E. Ko, H. Cho, and J. H. Yi,‘‘MAPAS: A practical deep learning-based Android
malware detection system,’’ Int. J. Inf. Secur., vol. 21, no. 4, pp. 725–738, Aug. 2022, doi:
10.1007/s10207-022-00579-6.
[11] S. Fallah and A. J. Bidgoly, ‘‘Android malware detection using network traffic based on sequential
deep learning models,’’ Softw., Pract. Exper., vol. 52, no. 9, pp. 1987–2004, Sep. 2022, doi:
10.1002/spe.3112.
[12] N. Sharma and A. L. Sangal, ‘‘Machine learning approaches for analysing static features in Android
malware detection,’’ in Proc. 3rd Int. Conf. Secure Cyber Comput. Commun. (ICSCCC), Jalandhar,
India, May 2023, pp. 93–96, doi: 10.1109/ICSCCC58608.2023.10176445.
[13] Kumar, DNS Ravi, N. Praveen, Hari Hara P. Kumar, Ganganagunta Srinivas, and M. V. Raju. "Acoustic
Feedback Noise Cancellation in Hearing Aids Using Adaptive Filter." International Journal of
Integrated Engineering 14, no. 7 (2022): 45-55.
[14] E. C. Bayazit, O. K. Sahingoz, and B. Dogan, ‘‘Malware detection in Android systems with traditional
machine learning models: A survey,’’ in Proc. Int. Congr. Human-Comput. Interact., Optim. Robotic
Appl. (HORA), Jun. 2020, pp. 1–8, doi: 10.1109/HORA49412.2020.9152840.
[15] Jyothsna, V., Prasad, M. K., GopiChand, G., & Bhavani, D. D. (2022). DLMHS: Flow-based intrusion
detection system using deep learning neural network and meta-heuristic scale. International Journal Of
Communication Systems, 35(10).
[16] Jyothsna, V., Raja, D. K., Kumar, G. H., & Chnadra, E. D. (2022). A novel manifold approach for
intrusion detection system (MHIDS). Gongcheng Kexue Yu Jishu/Advanced Engineering Science,
54(02).

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-
NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/),
which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any
medium or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's
Creative Commons license, unless indicated otherwise in a credit line to the material. If material
is not included in the chapter's Creative Commons license and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain
permission directly from the copyright holder.

10 pages
P.E.S. College of Engineering, MANDYA, 571401: Identifying The Android Malware Using Machine Learning Algorithm
No ratings yet
P.E.S. College of Engineering, MANDYA, 571401: Identifying The Android Malware Using Machine Learning Algorithm
34 pages
Apple Device Support Exam Prep Guide
No ratings yet
Apple Device Support Exam Prep Guide
20 pages
Improved Chimp Optimization Algorithm (ICOA) Feature Selection and Deep Neural Network Framework For Internet of Things (IOT) Based Android Malware Detection
No ratings yet
Improved Chimp Optimization Algorithm (ICOA) Feature Selection and Deep Neural Network Framework For Internet of Things (IOT) Based Android Malware Detection
8 pages
DBMS MCQ
No ratings yet
DBMS MCQ
17 pages
The Agile With Scrum Users Guide PDF
No ratings yet
The Agile With Scrum Users Guide PDF
103 pages
Chapter 1-1.1
No ratings yet
Chapter 1-1.1
22 pages
Scribd-Caia Level 1
0% (3)
Scribd-Caia Level 1
3 pages
Apple 820-3588-A
No ratings yet
Apple 820-3588-A
86 pages
Final Year Project Review
No ratings yet
Final Year Project Review
25 pages
Associate in Computer Technology
No ratings yet
Associate in Computer Technology
1 page
Tiktok Auto
No ratings yet
Tiktok Auto
36 pages
Vacation Budgeting Project
No ratings yet
Vacation Budgeting Project
2 pages
Escort FLS Manual
No ratings yet
Escort FLS Manual
111 pages
Networker Errors
No ratings yet
Networker Errors
230 pages
Module Framework R1
No ratings yet
Module Framework R1
16 pages
Leica GS18 I DS 900756 0422 en LR
No ratings yet
Leica GS18 I DS 900756 0422 en LR
2 pages
The Magic Cafe Forums - Red Streamlined Convertible by David Regal
No ratings yet
The Magic Cafe Forums - Red Streamlined Convertible by David Regal
3 pages
Log Saida
No ratings yet
Log Saida
176 pages
Project Proposal
No ratings yet
Project Proposal
8 pages
Btcoe704 (Ach51 1674276896 BT QP
No ratings yet
Btcoe704 (Ach51 1674276896 BT QP
1 page
Sem 2 Synopsis
No ratings yet
Sem 2 Synopsis
27 pages
3-28.OSB2B05 Traffic Statistics
No ratings yet
3-28.OSB2B05 Traffic Statistics
34 pages
Draeger MSI Compact
No ratings yet
Draeger MSI Compact
2 pages
Installation & Basic Operations: Medcaptain Service Dept
No ratings yet
Installation & Basic Operations: Medcaptain Service Dept
24 pages
711 - ACV - Computer Hardware Assembly & Maintenance - 21-05-2024
No ratings yet
711 - ACV - Computer Hardware Assembly & Maintenance - 21-05-2024
8 pages
Cluster Analysis: DSCI 5240 Data Mining and Machine Learning For Business
No ratings yet
Cluster Analysis: DSCI 5240 Data Mining and Machine Learning For Business
44 pages
CV 2024111621095793
No ratings yet
CV 2024111621095793
2 pages
Ramesh Yadav Resume
No ratings yet
Ramesh Yadav Resume
3 pages
ISU Master Data V0.7
No ratings yet
ISU Master Data V0.7
28 pages
To A Successful Game: 8 Elements
No ratings yet
To A Successful Game: 8 Elements
28 pages
SQL Lab 3
No ratings yet
SQL Lab 3
8 pages
Ext Proposal Format
No ratings yet
Ext Proposal Format
2 pages
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
No ratings yet
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
3 pages
Department of Education: Republic of The Philippines
No ratings yet
Department of Education: Republic of The Philippines
3 pages

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Keywords: Android, machine learning, malware, anomaly detection, feature

The ever-evolving digital landscape poses a persistent challenge to cybersecurity de-

© The Author(s) 2024

M. E. Z. N. Kambar et al. [6] highlighted the proliferation of mobile applications due

M. Li, Y. Wu, et al. [8] Because Android is open-source, it is increasingly vulnerable

strategy outperforms 26 state-of-the-art pre-trained CNN models and large-scale learn-

The methodology encompasses data collection from Drebin, Malgenome, and

Fig. 1. Methodology Workflow

3.1 Techniques and Algorithms

function SMOTE(sample, N1, K):

(randomsample, sample, K)

syntheticsample = randomsample + randomuniform() ∗
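The SMOTE pseudocode above is truncated in this copy. As a rough illustration only, the following sketch shows the standard SMOTE interpolation step: pick a minority sample, pick one of its K nearest minority neighbours, and place a synthetic point at a uniformly random position on the segment between them. The function name `smote`, its parameters, and the toy data are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def smote(minority, n_synthetic, k, seed=0):
    """Create n_synthetic points by interpolating between a minority sample
    and one of its k nearest minority neighbours (illustrative sketch)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # uniform in [0, 1)
        # synthetic = base + gap * (neighbour - base), coordinate by coordinate
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D minority-class samples
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because each synthetic point lies on a segment between two existing minority samples, it always stays inside the region the minority class already occupies.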

identifying frequently occurring feature combinations, offering valuable insights for

3.2 Performance Evaluation Measures

its ability to distinguish between positive and negative classes accurately.

Accuracy. Used to evaluate the classification model's overall correctness. Although

4.2 Data Preprocessing

Performance Comparison. The ensemble method demonstrates positive synergistic

Fig. 2. Overall Confusion Matrix for Voting Classifier
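The Voting Classifier combines the outputs of the MLP, CatBoost, and XGBoost base models. This copy does not say whether hard or soft voting was used, so the sketch below assumes soft voting: it averages each model's predicted malware probability and thresholds the average at 0.5. The function `soft_vote`, the optional `weights` argument, and the probability values are hypothetical.

```python
def soft_vote(prob_lists, weights=None):
    """Combine per-model malware probabilities by (weighted) averaging.

    prob_lists: one list of probabilities per model, aligned by sample.
    Returns 1 (malware) when the averaged probability is at least 0.5.
    """
    weights = weights or [1.0] * len(prob_lists)
    total = sum(weights)
    labels = []
    for sample_probs in zip(*prob_lists):
        avg = sum(w * p for w, p in zip(weights, sample_probs)) / total
        labels.append(1 if avg >= 0.5 else 0)
    return labels


# Hypothetical probabilities for four apps from the three base models
mlp_probs = [0.9, 0.4, 0.2, 0.6]
catboost_probs = [0.8, 0.7, 0.1, 0.3]
xgboost_probs = [0.7, 0.6, 0.3, 0.4]
predictions = soft_vote([mlp_probs, catboost_probs, xgboost_probs])
```

In the second app, the MLP alone would predict benign (0.4), but the averaged probability crosses the threshold, which is the kind of correction an ensemble can provide over a single model.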

Metric              MLP      CatBoost  XGBoost  Voting Classifier
True Positive       1245     1220      1205     1225
True Negative       1227     1232      1210     1257
Precision           98.7%    97.4%     98.6%    98.6%
Recall/Sensitivity  97.7%    96.21%    97.7%    98.3%
Accuracy            98%      97.3%     97.8%    98.4%
F1 score            98.04%   96.80%    97%      98.4%

Table 1. Performance Matrix
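Precision, recall, accuracy, and F1 score follow the standard confusion-matrix definitions. The sketch below computes them from raw counts. Table 1 lists true positives and true negatives but not false positives or false negatives, so the counts used here are hypothetical and do not reproduce the reported figures.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, chosen only to illustrate the formulas
metrics = classification_metrics(tp=90, tn=85, fp=10, fn=15)
```

F1 is the harmonic mean of precision and recall, so it always lies between the two and sits closer to the lower one.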

Fig. 3. Overall Performance Comparison of Classifiers

Fig. 4. Model Accuracy Across Folds
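Reporting accuracy across folds implies a k-fold cross-validation split. The number of folds is not stated in this copy, so the sketch below takes `k` as a parameter. It only shows how sample indices could be partitioned into contiguous folds; the function name and the sample count are assumptions for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds receive one extra sample
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical example: 10 samples split into 3 folds
folds = list(k_fold_indices(n_samples=10, k=3))
```

Each sample appears in exactly one test fold, so averaging the per-fold accuracies uses every sample for evaluation exactly once. In practice the data would usually be shuffled (and stratified by class) before splitting.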

5.1 Future Work
