Odusami2018 Chapter AndroidMalwareDetectionASurvey
Abstract. Smartphones evolve every day, and with this evolution security has
become a major issue. Security is an important aspect of human existence, and
inadequate security puts the safety of smartphone users at risk. One of the
biggest security threats to smartphones is malware. This study surveys malware
detection techniques in order to identify gaps and to provide a basis for
improved and effective measures against unknown Android malware. The results
show that machine learning is the more promising approach, with higher
detection accuracy. Future research should look into deep learning approaches
trained on large datasets in order to achieve better accuracy.
1 Introduction
The mobile device has undeniably become a growing trend in this modern age, as
many internet applications have migrated their products, accessibility, and applications
to this platform for improved productivity and interoperability. The growth of mobile
devices is driving a revolutionary change in information security [1]. However,
the growth of mobile devices has increased the associated threats, some of which are
SMS spam [2], phishing, malware, spyware, etc. The Android Operating System (OS)
platform has become the fastest growing mobile OS because of its open nature,
making it the preferred OS for many consumers and developers [3]. This OS allows
several thousand applications from different markets to run, which gives users
easy access to functionality [4]. The advantages of Android OS over other mobile
operating systems include the following: it runs very powerful applications, and it
is very flexible and friendly, as it allows users to choose their own applications [5].
Android phones have attracted many illegitimate operations because of the
platform's popularity and increasing openness. An attacker can easily incorporate
their own code into the code of a normal application. Hence, malware infiltrating
Android applications is growing at a dangerous rate, and in this situation the
security of the devices and of the assets these devices give access to is
crucial [6]. To address this security issue, researchers have established several methods
for detecting Android malware in order to protect Android devices against security
breaches. Numerous approaches with diverse aims and objectives have been widely
used, each with its own strengths and weaknesses. The techniques are evaluated
based on true positives, true negatives, false positives, false negatives, precision,
accuracy, F-score, etc. The aim of this study is to identify the trend in Android
malware detection and to suggest an improved method by reviewing, categorizing,
and comparing existing works.
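The evaluation measures listed above all derive from the four confusion-matrix counts. The following is a minimal sketch of that relationship, not a procedure taken from any of the surveyed papers; the function name `detection_metrics` and the sample counts are assumptions made for illustration.

```python
def detection_metrics(tp, tn, fp, fn):
    """Derive common evaluation metrics from confusion-matrix counts.

    tp: malware correctly flagged; tn: benign apps correctly passed;
    fp: benign apps wrongly flagged; fn: malware that was missed.
    """
    def ratio(numerator, denominator):
        # Guard against empty denominators (e.g. no positive predictions).
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # also called the true positive rate (TPR)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "false_positive_rate": ratio(fp, fp + tn),
        "f_score": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts for a detector tested on 100 malware and 100 benign apps.
metrics = detection_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts the accuracy is 0.85 while the false positive rate is 0.20, which shows why accuracy alone can hide a weakness that the other measures expose.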
Malicious operations often refer to accessing users' private information by
stealing or spying, and to displaying undesirable advertisements [6]. The
umbrella term for these malicious operations is malware. The word is derived from
"malicious software", and it refers to a software program that deliberately
carries out the harmful intent of an attacker and is characterized by its malicious
aim [7]. Different types of malware, grouped by their diverse purposes and ways of
penetration, are shown in Fig. 1.
[Fig. 1: types of malware (Trojan horses, keyloggers, ransomware, worms, bots, rootkits)]
The remainder of the study is organized as follows: Sect. 2 details related work,
Sect. 3 gives the results and discussion, and Sect. 4 concludes the study.
2 Related Work
Several works have been carried out by researchers on Android malware
detection. This section discusses the various approaches that have been used in the
literature.
Android Malware Detection: A Survey 257
write to the external storage. The permission-based method has a lower false
positive rate than the signature-based method [12].
Yerima et al. [24] employed a technique that uses ensemble learning for the detection
of Android malware. This approach enables zero-day malware detection and, because
there is no need for feature selection, provides robustness and resilience to code
obfuscation. The results showed a detection accuracy of 97%.
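The idea behind ensemble learning can be illustrated with a simple majority vote over several base classifiers. This is a generic sketch of the voting principle, not the method of [24]; the three rule-based classifiers and the sample permission sets are hypothetical.

```python
from collections import Counter


# Hypothetical base classifiers: each maps an app's set of requested
# permissions to a label. Real ensembles would use trained models instead.
def sms_rule(perms):
    return "malware" if "SEND_SMS" in perms else "benign"


def boot_rule(perms):
    needed = {"RECEIVE_BOOT_COMPLETED", "INTERNET"}
    return "malware" if needed <= perms else "benign"


def count_rule(perms):
    return "malware" if len(perms) > 4 else "benign"


CLASSIFIERS = [sms_rule, boot_rule, count_rule]


def majority_vote(classifiers, sample):
    """Return the label predicted by most of the base classifiers."""
    votes = Counter(clf(sample) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Illustrative permission sets, not taken from any real application.
app_a = {"SEND_SMS", "INTERNET", "RECEIVE_BOOT_COMPLETED"}
app_b = {"INTERNET", "CAMERA"}
print(majority_vote(CLASSIFIERS, app_a))
print(majority_vote(CLASSIFIERS, app_b))
```

An odd number of voters over two labels avoids ties. A single weak rule that misfires is outvoted by the others, which is the source of the robustness usually attributed to ensembles.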
Yuan et al. [25] proposed an approach that uses deep learning to identify malware on
Android phones. Deep learning is a new area of machine learning research whose
application in artificial intelligence is increasing tremendously, and it has inspired a
large number of successful applications in speech and image recognition. More than
200 features were extracted from both static analysis and dynamic analysis
of each Android application, and a deep learning model was used to separate
illegitimate apps from normal apps. The model achieved an accuracy of 96.60%,
indicating that the deep learning method is more suitable than some other machine
learning methods. However, it was not deployed as an online Android malware
detection system. The authors in [26] proposed a deep learning technique for malware
process detection involving two-stage deep neural networks. Features are extracted
using a trained recurrent neural network, and the resulting feature images are
classified using a convolutional neural network. Although the technique achieved
the best result, its potential was not fully utilized because only a small dataset
was used.
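Combining static and dynamic analysis, as described for [25], usually means merging both kinds of observations into one fixed-length input vector for the classifier. The sketch below shows one common encoding, a binary presence vector; the vocabulary entries and the `perm:`, `api:` and `dyn:` prefixes are assumptions for illustration, not the feature set used in [25].

```python
def build_feature_vector(static_features, dynamic_features, vocabulary):
    """Encode an app as a binary vector: 1 if a feature was observed, else 0."""
    present = set(static_features) | set(dynamic_features)
    return [1 if name in present else 0 for name in vocabulary]


# Hypothetical vocabulary; a real system would list hundreds of features.
VOCABULARY = [
    "perm:SEND_SMS",       # static: permission requested in the manifest
    "perm:READ_CONTACTS",  # static: permission requested in the manifest
    "api:getDeviceId",     # static: sensitive API call found in the code
    "dyn:send_sms",        # dynamic: behaviour observed at run time
    "dyn:net_connect",     # dynamic: behaviour observed at run time
]

static = {"perm:SEND_SMS", "api:getDeviceId"}
dynamic = {"dyn:net_connect"}
vector = build_feature_vector(static, dynamic, VOCABULARY)
print(vector)
```

Keeping the vocabulary order fixed matters: every app must map to the same positions so that the classifier sees a consistent input layout.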
Hasegawa and Iyatomi [27] proposed a lightweight Android malware detection
method in which a small portion of the target's Android application package (APK)
file is analyzed using a one-dimensional convolutional neural network. Results showed
that illegitimate applications can be identified with an accuracy of 97%. The model
captured some of the major features, but they could not be confirmed since an APK is
a compressed file. The authors in [28] proposed an approach using artificial immunity,
based on a detector set artificial immune system (MAIS), for the discovery of
illegitimate applications on mobile computing devices based on the data flows in
Android apps. Results showed 93.33% accuracy with a reduced false negative rate
(FNR); however, the false negative rate remained high. In [29], a Hidden Markov
Model (HMM) combined with structural entropy is used for the detection of Android
malware. A statistical pattern analysis algorithm that uses the length of the observed
sequence, the number of distinct observation symbols, the observation sequence, the
state transition probability matrix, and the initial state distribution matrix is
employed to detect malicious code. The results showed that the accuracy of malware
family recognition is very high but could be improved by using a larger dataset.
Li and Jin [30] presented an Android malware detection approach based on feature
codes. Both function calls and system calls are evaluated and extracted from the
malicious sample library as the feature vector. These are then used for training and
classification with machine learning and data mining algorithms, producing a feature
library and a detection model. Using the feature vectors of code from Android
applications, the resulting system proved that it can efficiently uncover unknown
malicious Android applications with high accuracy and a low false positive rate.
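The HMM quantities named above (observation sequence, state transition probability matrix, initial state distribution) are the inputs of the standard forward algorithm, which scores how likely a sequence is under a given model. The sketch below shows only that scoring step and is not the detector of [29]: the emission matrix is a standard HMM component assumed here, structural entropy is left out, and all probabilities are made up for illustration.

```python
def forward_likelihood(observations, initial, transition, emission):
    """Probability of an observation sequence under an HMM (forward algorithm).

    observations: list of symbol indices
    initial[s]: probability of starting in state s
    transition[p][s]: probability of moving from state p to state s
    emission[s][o]: probability of emitting symbol o while in state s
    """
    if not observations:
        return 1.0  # the empty sequence is certain
    states = range(len(initial))
    # alpha[s] = P(observations so far, current state = s)
    alpha = [initial[s] * emission[s][observations[0]] for s in states]
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[p] * transition[p][s] for p in states) * emission[s][symbol]
            for s in states
        ]
    return sum(alpha)


# Hypothetical two-state model over two observation symbols.
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emission = [[0.9, 0.1], [0.2, 0.8]]

likelihood = forward_likelihood([0, 1, 0], initial, transition, emission)
print(f"{likelihood:.5f}")
```

In a family-recognition setting, one model would typically be trained per malware family and a new sample assigned to the family whose model gives its sequence the highest likelihood.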
Traditional machine learning methods such as the back-propagation neural network are
shallow and hence can only be trained with small datasets, which reduces detection
accuracy. The more recent approach, often called deep learning, uses several layers
of neural networks and can be trained with large datasets, which improves detection
accuracy.
260 M. Odusami et al.
Table 1. (continued)
Approach | Author | Description of method | Strengths | Weakness
---------|--------|-----------------------|-----------|---------
Machine learning | [21] | Hybrid | Capture instantaneous attacks | Slightly lower True Positive Rate (TPR)
 | [22] | Hybrid | It consumes low resources | Performance is reduced due to communication dependent on the server
 | [23] | Back propagation Neural network | Achieves 0.982773 F-score | The number of the states of Markov chains quite large
 | [24] | Ensemble learning | Detection with low false positive | Large feature space is required
 | [25] | Deep learning (Deep Belief neural network) | Detects illegitimate applications with 96.7% accuracy | Unrealistic dynamic analysis of malicious may evade the detection system.
 | [26] | Recurrent Neural Network and Convolutional Neural Network | |
 | [27] | Deep learning (Convolutional Neural Network) | A malicious application can be identified with an accuracy of 96–97% | Features captured were not confirmed.
 | [28] | Artificial Immunity | High False negative rate | Inadequate dataset for experimentation
 | [29] | Hidden Markov Model | Identify malicious applications with a precision of 0.96 | The verification of the technique is done on a large dataset
 | [30] | Feature vector | Detect the unknown malware effectively | Insufficient behavior features
3 Results and Discussion
This section presents the results gathered from the survey with respect to the
approaches used to detect Android malware. From Table 1, the approaches can broadly
be classified into static analysis, dynamic analysis, and machine learning. The
detection accuracy of each of the reviewed papers is shown in Figs. 2, 3 and 4.
All three categories of Android malware detection techniques have shown promising
results, as shown in Figs. 2, 3 and 4 above. The accuracies of the machine learning
techniques are close to one another, and they are closer to 100% than those of the
other techniques. From Fig. 3, the study that shows the highest accuracy used a hybrid
machine learning method. The deep learning method also shows high accuracy
because it learns other relevant features during training. The accuracy of most of
the methods is affected by the size of the dataset used. Table 2 gives the size of
the dataset used in each of the reviewed papers and its sources.
In Table 2, the authors in [24] employed the largest dataset. The malware dataset was
obtained from the AMD dataset (24,553 malware samples) and the Android Drebin project
dataset (5,560), 30,113 samples in total. The normal application dataset was obtained
from Appsap and Apkpure. The dataset was divided into two, with 5000 taken from both
malware datasets and 2000 for the normal applications. The approach in [24] has proven
to be effective with the use of a larger dataset than other methods. Although the total
size of the dataset was not specified in the study [20], the result showed very high
detection accuracy with the use of the Drebin dataset. The Drebin dataset helps a
classifier to provide a high detection rate because malicious applications and normal
applications are distributed in a manner that avoids overfitting of the classifier [20].
Table 2. Total number of Datasets used in each of the review papers and their sources
Approach | Author | Malware (evaluation dataset) | Benign (evaluation dataset) | Source
---------|--------|------------------------------|-----------------------------|-------
Static analysis | [7] | 4,554 | 51,179 | Virus share, Contagio mobile, malware.lu, Google play store
 | [8] | 3309 | 15993 | The genome project, Google play store
 | [9] | 4006 | 100 | Security service provider, Google play store
 | [10] | 3723 | 500 | The genome project, droid analytics, Contagio minidump
 | [12] | 238 | 1500 | Contagio, Google play store
 | [13] | 6909 | 1853 | Drebin, Genome project, Google play store
 | [14] | 130 | 235 | The designated website, Google play store
 | [15] | 1260 | 1227 | SlideME, PandaApp, Google play store
Dynamic analysis | [16] | 2850 | 2850 | Genome, Drebin
 | [18] | 705 | 10000 | Genome, Chinese app store, VirusTotal-detected apps, Google app store, Amazon app store, Samsung app store, Baidu app store
 | [5] | Not specified | Not specified | The dataset was created
 | [18] | 2,784 | NA | Genome, Contagio mobile, Virus share
 | [19] | NA | NA | NA
Machine learning | [20] | 100 | 100 | NA
 | [21] | NA | NA | Drebin
 | [22] | 1227 | 1189 | Google play store
 | [23] | 2925 | 3938 | Antivirus vendor
 | [24] | 1760 | 20000 | Contagio app set, genome project, Google play store
 | [26] | 30113 | 2000 | AMD dataset and Android Drebin Project dataset; Appsap, Apkpure
 | [27] | 28 | 30 | The genome project, Google play store
 | [28] | 6192 | 5560 | Drebin
 | [29] | 350 | 750 | Contagio
4 Conclusion
In this study, a survey was conducted of existing techniques and their accuracy
based on the datasets used in the literature. The study indicates a promising
approach to Android malware detection: machine learning, especially the hybrid-based
methods, showed higher detection accuracy than the other approaches. Researchers
therefore need to look into developing improved machine learning mechanisms by
exploring more deep learning techniques for the detection of Android malware and by
training the models with large datasets so that the models are fully utilized.
References
1. Akinboro, S.A., Omotosho, A., Odusami, M.O.: An improved model for securing ambient
home network against spoofing attack. Int. J. Comput. Netw. Inf. Secur. 10(2), 20 (2018)
2. Onashoga, A.S., Abayomi-Alli, O.O., Sodiya, A.S., Ojo, D.A.: An adaptive and collaborative
server-Side SMS spam filtering scheme using AIS. Inf. Secur. J. Glob. Perspect. 24(4–6),
133–145 (2015)
3. Singh, R.: An overview of android operating system and its security. Int. J. Eng. Res. Appl. 4
(2), 519–521 (2014)
4. Arp, D., Spreitzenbarth, M., Hubner, M., Gascon, H., Rieck, K.:
DREBIN: effective and explainable detection of android malware in your pocket. In: NDSS,
vol. 14, pp. 23–26, February 2014
5. Gandhewar, N., Sheikh, R.: Google Android: An emerging software platform for mobile
devices. Int. J. Comput. Sci. Eng. 1(1), 12–17 (2010)
6. Chaba, S., Kumar, R., Pant, R., Dave, M.: Malware Detection Approach for Android
systems Using System Call Logs (2017). arXiv preprint arXiv:1709.08805
7. Ye, Y., Li, T., Adjeroh, D., Iyengar, S.S.: A survey on malware detection using data mining
techniques. ACM Comput. Surv. (CSUR) 50(3), 41 (2017)
8. Kang, H., Jang, J.W., Mohaisen, A., Kim, H.K.: Detecting and classifying android malware
using static analysis along with creator information. Int. J. Distrib. Sens. Netw. 11(6),
479174 (2015)
9. Faruki, P., Laxmi, V., Bharmal, A., Gaur, M.S., Ganmoor, V.: AndroSimilar: robust
signature for detecting variants of Android malware. J. Inf. Secur. Appl. 22, 66–80 (2015)
10. Song, J., Han, C., Wang, K., Zhao, J., Ranjan, R., Wang, L.: An integrated static detection
and analysis framework for Android. Pervasive Mob. Comput. 32, 15–25 (2016)
11. Sun, M., Li, X., Lui, J.C., Ma, R.T., Liang, Z.: Monet: a user-oriented behavior-based
malware variants detection system for Android. IEEE Trans. Inf. Forensics Secur. 12(5),
1103–1112 (2017)
12. Rovelli, P., Vigfússon, Ý.: PMDS: permission-based malware detection system. In:
Prakash, A., Shyamasundar, R. (eds.) ICISS 2014. LNCS, vol. 8880, pp. 338–
357. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-13841-1_19
13. Wu, D.J., Mao, C.H., Wei, T.E., Lee, H.M., Wu, K.P.: DroidMat: android malware
detection through manifest and API calls tracing. In: 2012 Seventh Asia Joint Conference on
Information Security (Asia JCIS), pp. 62–69. IEEE, August 2012
14. Talha, K.A., Alper, D.I., Aydin, C.: APK Auditor: permission-based Android malware
detection system. Digit. Investig. 13, 1–14 (2015)
15. Sato, R., Chiba, D., Goto, S.: Detecting Android malware by analyzing manifest files. Proc.
Asia Pac. Adv. Netw. 36(23–31), 17 (2013)
16. Ping, X., Xiaofeng, W., Wenjia, N., Tianqing, Z., Gang, L.: Android malware detection with
contrasting permission patterns. China Commun. 11(8), 1–14 (2014)
17. Vidal, J.M., Monge, M.A.S., Villalba, L.J.G.: A novel pattern recognition system for
detecting Android malware by analyzing suspicious boot sequences. Knowl. Based Syst.
150, 198–217 (2018)
18. Shankar, V.G., Somani, G.: Anti-Hijack: runtime detection of malware initiated hijacking in
Android. Procedia Comput. Sci. 78, 587–594 (2016)
19. Saracino, A., Sgandurra, D., Dini, G., Martinelli, F.: Madam: Effective and efficient
behavior-based android malware detection and prevention. IEEE Trans. Dependable Secur.
Comput. 15(1), 83–97 (2016)
20. Bläsing, T., Batyuk, L., Schmidt, A.D., Camtepe, S.A., Albayrak, S.: An android application
sandbox system for suspicious software detection. In: 2010 5th International Conference on
Malicious and Unwanted Software (MALWARE), pp. 55–62. IEEE, October 2010
21. Wei, L., Luo, W., Weng, J., Zhong, Y., Zhang, X., Yan, Z.: Machine learning-based
malicious application detection of Android. IEEE Access 5, 25591–25601 (2017)
22. Arshad, S., Shah, M.A., Wahid, A., Mehmood, A., Song, H., Yu, H.: SAMADroid: a novel
3-level hybrid malware detection model for android operating system. IEEE Access 6,
4321–4339 (2018)
23. Xiao, X., Wang, Z., Li, Q., Xia, S., Jiang, Y.: Back-propagation neural network on Markov
chains from system call sequences: a new approach for detecting Android malware with
system call sequences. IET Inf. Secur. 11(1), 8–15 (2016)
24. Yerima, S.Y., Sezer, S., Muttik, I.: High accuracy android malware detection using ensemble
learning. IET Inf. Secur. 9(6), 313–320 (2015)
25. Yuan, Z., Lu, Y., Xue, Y.: Droiddetector: Android malware characterization and detection
using deep learning. Tsinghua Sci. Technol. 21(1), 114–123 (2016)
26. Tobiyama, S., Yamaguchi, Y., Shimada, H., Ikuse, T., Yagi, T.: Malware detection with
deep neural network using process behavior. In: 2016 IEEE 40th Annual Computer Software
and Applications Conference (COMPSAC), vol. 2, pp. 577–582, June 2016
27. Hasegawa, C., Iyatomi, H.: One-dimensional convolutional neural networks for Android
malware detection. In: 2018 IEEE 14th International Colloquium on Signal Processing & Its
Applications (CSPA), pp. 99–102. IEEE, March 2018
28. Brown, J., Anwar, M., Dozier, G.: Detection of mobile malware: an artificial immunity
approach. In: 2016 IEEE Security and Privacy Workshops (SPW), pp. 74–80. IEEE, May
2016
29. Canfora, G., Mercaldo, F., Visaggio, C.A.: An HMM and structural entropy based detector
for android malware: an empirical study. Comput. Secur. 61, 1–18 (2016)
30. Li, Y., Jin, Z.: An Android malware detection method based on feature codes. In: 4th
International Conference on Mechatronics, Materials, Chemistry and Computer Engineering
(2015)