Article-Survey in APT
Abstract—Threats that once primarily targeted nation states and their associated entities have expanded their target zone to include the private and corporate sectors. This class of threats, well known as Advanced Persistent Threats (APT), is one that every nation and well-established organization fears and wants to protect itself against. While nation-sponsored APT attacks will always be marked by their sophistication, the APT attacks that have become prominent in corporate sectors are no less challenging for the organizations. The rate at which attack tools and techniques are evolving is making existing security measures inadequate. As defenders strive to secure every endpoint and every link within their networks, attackers are finding new ways to penetrate their target systems. With each day bringing new forms of malware, having new signatures and behavior that is close to normal, a single threat detection system would not suffice. While it requires time and patience to perform APT, solutions that adapt to the changing behavior of APT attacker(s) are required. Several works have been published on detecting an APT attack at one or two of its stages, but very limited research exists on detecting APT as a whole, from reconnaissance to cleanup, as such a solution demands complex correlation and fine-grained behavior analysis of users and systems within and across networks. Through this survey paper, we intend to bring together the methods and techniques that could be used to detect different stages of APT attacks, and the learning methods that need to be applied and where, to make a threat detection framework smart and undecipherable for adapting APT attackers. We also present different case studies of APT attacks, different monitoring methods, and mitigation methods to be employed for fine-grained control of security of a networked system. We conclude our paper with different challenges in defending against APT and opportunities for further research, ending with a note on what we learned during our writing of this paper.

Index Terms—Advanced Persistent Threat, APT, Targeted Attacks, Intrusion Detection

* These two authors have equally contributed to this work. All authors are with the School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ 85281 USA. Adel Alshamrani is also with University of Jeddah, Jeddah, Saudi Arabia (e-mail: asalshamrani@uj.edu.sa; sowmya.myneni@asu.edu; ankur.chowdhary@asu.edu; dijiang.huang@asu.edu).
Digital Object Identifier: 10.1109/COMST.2019.2891891
1553-877X © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

THANKS to the strong emphasis on information security on the part of security researchers across the world, security that once was exclusive to military and well-established organizations has now started to become part of every organization. However, this does not suffice, as each day we are introduced to a new type of malware and a new form of attack. There were days when the goal of an attacker or a group of attackers was to bring down an organization for financial gain or even to prove themselves by damaging the reputation of the company. In all those attacks, the attackers were not trying to hide their actions. These types of attacks still exist; however, there is a different breed of attacks that has become increasingly prominent over the last couple of decades, and this different class of attacks is what this paper is all about. This class of attacks is characterized by the low and slow movement of a group of attackers to accomplish their goal, which is usually stealing the target’s data without getting caught. The term given to this class of attacks is Advanced Persistent Threats (APT). APT attackers might use familiar methods to break into their target entity’s network, but the tools they utilize to penetrate are not familiar. As the term specifies, the tools used are advanced, and they need to be so for an attacker to be persistent in the network for longer periods. The attackers keep a low profile, slowly expanding their foothold from one system to another within the organization’s network, gaining useful information as they move and exporting it to their command and control center in a strategic fashion. APTs are usually performed by well-funded attackers provided with the resources they need to perform the attack for as long as the funding organization needs. The attack only ends when it is detected or when the funding organization gets all the data it needs. Either way, considerable damage would have been done to the organization that was the victim of an APT attack, sometimes irreparable damage, which is most common in the latter case, where the attack was not detected until all the organization’s data had fallen into the wrong hands. Victim organizations of APT attacks often end up being questioned on their failure to detect the attack even after having security measures such as strong intrusion detection and prevention systems. The answer to this question is what we provide in this paper.

The goal of an APT attack is not only to gather a target entity’s data, but also to stay undetected until the attack has been lifted. For this, the well-funded attackers work on creating sophisticated tools such as new types of malware that are not usually detected by signature-based anti-virus software or intrusion detection and prevention systems. They gather every detail about the organization, such as the tools and techniques the organization uses, the applications it hosts, and the anti-virus software, Intrusion Detection System (IDS), and Intrusion Prevention System (IPS) it uses. Further, they spend time in identifying the vulnerabilities in all these tools and creating
malware that would exploit those vulnerabilities. They then send out this malware, often via phishing/spear-phishing attempts, to gain entry into the organization’s network.

In an article published in 2013 [1], Mandiant, an American cybersecurity firm, reported several key findings on APT attacks performed by one of the largest APT organizations on a broad range of victims for long periods, starting from about 2006, by maintaining an extensive infrastructure of computers across the world. In its M-Trends 2017 report, FireEye points out the increase in the level of sophistication of financial attackers, which is no longer any lower than that of advanced state-sponsored attacks. In support of this, FireEye presented evidence that shows how the attackers evaded detection by IDS/IPS with the use of backdoors that were loaded even before the operating system was loaded. With every passing year, the number of APT attacks being reported has been increasing. All this advancement in attack methods and tools repeatedly points out the need for deployment of strong defense methodologies by every organization that wants to protect itself and its data. The defense methods should be employed at every phase of an APT attack.

The goal of this survey is to explicitly study various techniques and solutions that were tailored to APT attacks. As such, great emphasis has been placed on a thorough description of the APT stages, the possible attack methods, and how attack trees can be used in defending against APT attacks. In addition, through this survey, we intend to point out challenges and research opportunities in defending against APT.

The remainder of this paper is organized as follows: Section II focuses on the definition of APT. Section III discusses various APT attack case studies. Section IV describes the individual methods and related papers for APT defense. Section V discusses the evaluation methodologies of APT solutions. Sections VI and VII elaborate on current challenges in defending against APT attacks and possible research opportunities, respectively. Section VIII provides a discussion and comparison of our work with existing surveys. Finally, Section IX concludes the paper.

II. ADVANCED PERSISTENT THREATS

A. What is APT?

Advanced Persistent Threat, as the name itself implies, is not like a regular attack or an attack done by a regular hacker. APTs are often carried out by a group of advanced attackers that are well-funded by an organization or government to gain crucial information about their target organization or government. APT is a military term adapted into the information security context that refers to attacks carried out by nation-states. APT is defined by the combination of three words [2], which are:

Advanced: APT attackers are usually well-funded, with access to the advanced tools and methods required to perform an APT attack. These advanced methods include the use of multiple attack vectors to launch the attack as well as to keep it going.

Persistent: APT attackers are highly determined and persistent, and they do not give up. Once they get into the system, they try to stay in the system for as long as they can. They plan for the use of several evasive techniques to elude detection by their target’s intrusion detection systems. They follow a "low and slow" approach to increase their rate of success.

Threat: The threat in APT attacks is usually sensitive data loss or impediment of critical components or missions. These are rising threats to many nation entities and organizations that have advanced protection systems guarding their missions and/or data.

According to the National Institute of Standards and Technology (NIST) [3], an APT attacker: (i) pursues its objectives repeatedly over an extended period of time; (ii) adapts to defenders’ efforts to resist it; and (iii) is determined to maintain the level of interaction needed to execute its objectives. These objectives are exfiltration of information, or undermining or impeding critical aspects of a mission or program through multiple attack vectors.

To achieve the assigned goal, the attackers have to go through multiple stages of attacks in different forms while staying undetected. These multiple stages involve establishing footholds, internal network scanning, and moving laterally from one system to another in the network to reach the target system and perform their detrimental activity. Following the detrimental activity, the attackers might choose to stay and continue their malicious activities on other systems in the network, or leave the system after cleaning up, depending on the funding source’s requirements. These multiple stages often involve getting into one of the systems within the network and then performing privilege escalations as necessary to reach the target system, followed by accessing sensitive systems and sending the status/information over an Internet connection to the attackers’ command and control center.

Chen et al. in [4] summarized the major differences between APT attacks and traditional attacks in different aspects, as shown in Table I.

B. What is NOT APT?

Advanced Persistent Threats are often misunderstood, and the term is increasingly being used in industry as an excuse for organizations’ failure to protect themselves from what otherwise is a targeted attack. On the other hand, lately, as explained
in Section III, attacks have been recorded with goals that are not really specified by NIST under APT, but the methods used and the deterministic characteristics of those attacks made the security industry point out the need to revise the definition of Advanced Persistent Threats to include other domains with new attack goals. We constructed a list of criteria that would establish whether an attack is an APT or a normal targeted breach in the given environment. If any of the criteria below holds for the attack case in question, then that attack is not an APT attack, but a targeted attack:

This attack could have been prevented in more than one way - Given the attack process and the target environment, if the attack is not a surprise and is highly possible (relative to the target environment), then it should have been prevented with minimal countermeasures and security controls in place.

This attack did not require a great deal of adaptation by attackers - If the attackers’ attempts to achieve their goal do not need high adaptation or intense evasive techniques in response to the defender’s efforts, then the defense system of the target environment needs to be questioned.

This attack did not exhibit any novelty in its variants - Unprecedented novelty in the attack methods or techniques is what often makes an APT attack successful. If there is nothing new in the attack methods or techniques, then the attack is supposed to be detected with existing tools and techniques.

C. APT Attack Model: How APT attacks are made?

APT attacks, as mentioned earlier, are well planned and highly organized towards increasing the probability of the attack’s success. To be successful, the attackers perform attacks in multiple stages. To answer the "how" part of an APT attack, we take the help of the APT Attack Tree depicted in Figure 2. In [5], Schneier describes attack trees and their effectiveness in evaluating the security of a system. With the goal of the attack as the root node, a properly constructed attack tree, though not limited to this function, can give information on the assumptions of the security of the system and the attacks that are likely to happen. These trees can help defenders place the necessary security measures to detect/prevent different components of the attack by automating the correlation of events reported within the system and the probability of those being part of an attack in progress. Giura and Wang in [6] have presented an attack tree for APT attacks, and unlike theirs, our attack tree is generic and can be applied to different goals of the threat actors. By having the sub-trees represent different planes, the authors pointed out the correlation needed between those planes. However, the attack methodologies have changed, and a separation of multiple vectors such as physical and digital will only complicate the detection system model, as the dependencies among them are lost. Our simple attack tree structure considers the dependencies across multiple vectors of attack, and we believe our APT attack tree will help defenders in identifying how far an attacker is from achieving the attack goal, and accordingly in taking a reactive or proactive approach for mitigating the attack. Figures 3, 4, and 5 are special cases of our generic attack tree where we have identified the different attack vectors and their combinations required by APT attackers in order to achieve their goals. The 3 goals depicted were defined by NIST in [3] and are Steal Organization Data, Undermine Organization’s Critical Aspects, and Position for Future. The rectangular nodes represent a collection of one or more actions, with the topmost rectangular node in each stage being the goal of that stage, while elliptical nodes are the actions that the attackers can perform to achieve their goal in each of those stages. With a threat score assigned to each of the actions (leaf-nodes) taken by the attackers as found through alerts, followed by correlating those alerts, defenders will be able to estimate the risk and the response that is needed to mitigate the threat. The topmost node (root-node) in each of the three attack trees represents the assigned goal of the attacker for the chosen target.

Figure 1 depicts the APT attack model given the goal of stealing organization data. It is not necessary that these stages are found in every APT attack model. In [1], Mandiant has discussed its APT attack life cycle model consisting of 7 stages - Initial Compromise (1), Establish Foothold (2), Escalate Privileges (3), Internal Reconnaissance (4), Move Laterally (5), Maintain Presence (6), and Complete Mission (7), with stages 3 through 6 happening in any order. In [7], Ussath et al. have discussed a 3-stage APT attack life cycle model focusing only on the representative characteristics of an APT attack. The 3 stages discussed by the authors are Initial Compromise (1), Lateral Movement (2), and Command & Control Activity (3). Other modified versions of the APT attack life cycle model have been proposed in the literature. While all these attack models are similar in terms of the operations involved in APT attacks, they are either too generalized or too specific. Addressing this, we have categorized APT attacks into 5 stages that could represent every APT attack irrespective of the goal, while showing how the goal can change the stages involved, as below:

Stage 1: Reconnaissance - Reconnaissance marks the beginning of any successful attack. The more attackers understand about the target, the higher their rate of success.

Stage 2: Establish Foothold - This stage represents the attackers’ successful entry into their target’s computer
and/or computer network. In order to achieve their goal, they need to establish a foothold in the target’s network.

Stage 3: Lateral Movement/Stay Undetected - If the attackers’ goal is to undermine critical components or to steal organizational data, they would need to move laterally within the target’s network in search of those components or data.

Stage 4: Exfiltration/Impediment - When the attackers’ goal is to get organizational data, actions comprising retrieving and sending this data to the attackers’ command and control center fall under this stage. In addition, when the attackers’ goal is to undermine critical components, actions comprising disabling or destroying the critical components of the target organization fall under this stage.

Stage 5: Post-Exfiltration/Post-Impediment - This stage involves post-exfiltration/post-impediment activities such as continuing to exfiltrate data or disable more critical components, or deleting evidence for a clean exit from the organization’s network.

For any of the 3 APT goals, the first 2 stages are necessary for the attackers to go through in order to increase their probability of success. These stages, as explained later in this section, are Reconnaissance and Establishing Foothold. The other 3 stages are applicable based on the attackers’ goal. If the goal of the attackers is to steal the organization’s data or undermine critical aspects of the organization, the attackers would have to move laterally within the organization’s network in search of data resources or critical components, respectively, and to gather information that will help them in progressing their attack. Differences between these 2 goals can be seen during Stages 4 and 5. While attackers whose goal is to steal the organization’s data engage in data exfiltration activities, attackers whose goal is to impede critical components engage in bringing down critical aspects of the organization. On the other hand, attackers whose goal is to position for the future take a different path in Stage 3, where they keep themselves updated with the changes happening within the organization’s network, studying and understanding the working of the system and the users, thus gaining as much information as they can while staying unnoticed. APT attackers whose goal is to position for the future do not engage in Stages 4 and 5 unless their goal changes to either stealing organization data or undermining critical aspects. In the later parts of this section we explain in detail each of these stages and the multiple vectors that attackers can use in each of those stages.

Stage 1: One of the first steps attackers take is to learn about their target. The more they understand the target, the more successful they may be with their attack. As part of this phase, attackers extensively research their target, gathering the necessary information and intelligence on the organization’s assets towards increasing their rate of success. This information includes, but is not limited to, details of the employees such as their social life, habits, and websites they often visit. Further, details of the underlying IT infrastructure, such as the types of switches, routers, anti-virus tools, firewalls, web servers used, ports open, etc., help the attackers not only to establish a foothold, but also to penetrate deeper into the target’s network. Gathering information, as shown in our APT attack tree, usually involves social engineering techniques, which refer to the psychological manipulation of people into accomplishing goals that may or may not be in the target’s best interest [4], as well as reconnaissance performed on site, port scanning, and service scanning. In addition, APT campaigns query publicly available repositories, using WHOIS [8] and Border Gateway Protocol (BGP) data, looking for domain and routing information; find websites on the targeted network that have high-risk vulnerabilities, such as cross-site scripting (XSS) and SQL injection (SQLI); and fingerprint organizational networks to check for open ports, address ranges, network addresses, active machines, firewalls, IDS/IPS, running software, access points, virtual hosts, outdated systems, virtualized platforms, storage infrastructure, and so on, to decipher the network’s layout [9]. In APT attacks, reconnaissance usually is passive, as attackers do not exploit a victim, but instead collect data in preparation for the attack. Once APT actors have collected enough information, they construct an attack plan and prepare the necessary tools.

Stage 2: Information collected in the previous stage, as shown in our APT attack tree, can be used to exploit vulnerabilities found in the target organization’s web applications or to exploit vulnerabilities in end user systems via malware execution. Below we explain the different methods and techniques that APT attackers use in this stage:

A) Exploitation of Known Application Vulnerabilities: Exploitation of known vulnerabilities is another means that APT attackers utilize to perform APT attacks. Known vulnerabilities are usually exposed and can be obtained from well-known vulnerability databases such as the Common Vulnerabilities and Exposures List (CVE), the Open Source Vulnerability Database (OSVDB) [10], and the NIST National Vulnerability Database (NVD) [11], which publicly disclose vulnerabilities, each identified by a unique CVE-ID. In addition, in some cases attackers can share and collect useful information about found vulnerabilities in dark-web and deep-web forums [12]. According to the study reported in [7], the majority of APT attacks were based on known exploits. Therefore, it is essential to apply security patches shortly after vulnerabilities have been disclosed.

B) Malware: According to Symantec’s 2017 Internet Security Threat Report, there were 357 million malware variants in the year 2016. The rate of malware received over email increased significantly, from 1 in 220 emails in 2015 to 1 in 131 emails in 2016. Symantec attributes this increase in malware to the botnets that deliver the spam campaigns. As shown in our APT Attack Trees in Figures 3, 4, and 5, malware can be sent via spear-phishing, USB devices, and/or web downloads.

C) Spear-Phishing: In the same threat report, Symantec also reported that targeted spear-phishing campaigns, specifically in the form of business email compromise scams, are being favored by attackers instead of the old mass-
[Figure: generic APT attack tree. Root node: Achieve Assigned Objective. Stage 1: Reconnaissance. Stage 2: Establish Foothold. Stage 3: Learn about the system/critical nodes. Stage 4: Perform Detrimental Activity. Stage 5: Post Detrimental Activity. Other node labels: Exploit Vulnerabilities, Communicate with C&C, Stay Updated with Environment, Move Laterally, Gear up for action, Execute Detrimental Activity, Continue Detrimental Activities, Remove Activity Traces.]
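The idea of assigning threat scores to the leaf actions of an attack tree and rolling them up to estimate how far an attacker has progressed can be sketched in a few lines of Python. This is only an illustrative sketch: the `Node` class, the `stage_risk` and `goal_progress` helpers, the aggregation rules (maximum within a stage, mean across stages), and every score value are assumptions made for the example and are not part of the surveyed model. The node names are taken from the attack trees in this section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of an attack tree: the goal, a stage, or a leaf action."""
    name: str
    children: List["Node"] = field(default_factory=list)
    score: float = 0.0  # hypothetical threat score in [0, 1], set on leaves from alerts


def stage_risk(stage: Node) -> float:
    """Assumed rule: any single action can advance a stage, so take the max leaf score."""
    return max((leaf.score for leaf in stage.children), default=0.0)


def goal_progress(root: Node) -> float:
    """Assumed rule: average the stage risks as a rough 'distance to goal' estimate."""
    stages = root.children
    if not stages:
        return 0.0
    return sum(stage_risk(stage) for stage in stages) / len(stages)


# Hypothetical alert-derived scores for the "Steal Organization Data" tree.
tree = Node("Steal Organization Data", [
    Node("Reconnaissance", [Node("External Network Scan", score=0.6),
                            Node("Reconnaissance at Site", score=0.2)]),
    Node("Establish Foothold", [Node("SpearPhishing", score=0.8),
                                Node("Web Download", score=0.0)]),
    Node("Lateral Movement", [Node("Privilege Escalations", score=0.0)]),
    Node("Exfiltration", [Node("Find Data", score=0.0)]),
    Node("Post Exfiltration", [Node("Cover Up", score=0.0)]),
])

progress = goal_progress(tree)
print(f"estimated progress toward goal: {progress:.2f}")
```

A defender could recompute `goal_progress` whenever correlated alerts update a leaf score, and choose a reactive or proactive response based on the stage whose risk changed. Other aggregation rules (for example, weighting later stages more heavily) would fit the same structure.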
mailing phishing campaigns. This starts with the attackers performing social engineering or other such techniques to gain information about the organization and then sending out emails with malware in them. These fraudulent emails are cleverly crafted, well enough to entice the targeted recipients to open the attachments. Employees unaware of the malware might put the organization’s network at risk by opening the attachment or link that leads to installation and execution of malware. This malware, when executed, might exploit either known or unknown vulnerabilities to establish a foothold in the organizational network. Figure 6 shows an illustration of a spear-phishing example.

In [1], an APT actor sent a spear-phishing email to a Mandiant employee, in which the email seemed to be sent by Mandiant’s CEO. The APT actor created a specific email account using a real name (the name of Mandiant’s CEO). The email contained a malicious ZIP file with the goal of installing an executable backdoor, "WEBC2-TABLE". The authors in [13] proposed a machine learning model that used a J48 decision tree classifier to detect phishing emails. Their model was trained on 23 features generated from an email’s header and body. These features include the message ID domain, sender domain, message type, number of links, and characteristics of URLs in links. The J48 classifier was evaluated on a combined dataset of 4559 phishing emails and 4559 legitimate emails using 10-fold cross validation. They achieved 98.11% accuracy and a 0.53% false positive rate.

D) Zero-day vulnerability: A zero-day vulnerability is a software bug that the software manufacturer either is unaware of, or is aware of but was not able to fix before the attackers could utilize it. APT attackers gather information about the organization’s system components, such as the operating system versions, patches applied, and software components installed on those systems, including anti-virus and anti-malware components. They then go about identifying any vulnerabilities in those versions that could be utilized to gain entry into the target’s network. However, according to the study reported in [7], only a few APT attacks were performed and achieved through zero-day vulnerabilities, and the majority of APT attacks were based on known exploits.

Lee and Lewis in [14] have focused on APT achieved through malware sent via emails. The authors have examined several emails with binaries and have come up with a solution involving constructing an undirected graph where nodes of
[Figure 3 node labels. Root node: Steal Organization Data. Stages: Stage 1 Reconnaissance, Establish Foothold, Lateral Movement, Stage 4 Exfiltration, Stage 5 Post Exfiltration. Other node labels: Initial Exploitation, Initial Communication with C&C, Communicate with C&C, Penetrate Deeper into Network, Find Data, Continue Exfiltration, Cover Up, Access Other Nodes, Policy Changes, Reconnaissance at Site, External Network Scan, Devices (Eg: USB sticks), SpearPhishing, Web Download, Spread Malware, Privilege Escalations.]
Fig. 3: APT Attack Tree - Case When Attackers’ Goal is to Steal Organization Data
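The kind of header and body features used by the phishing classifier in [13] (message ID domain, sender domain, message type, number of links, URL characteristics) can be illustrated with Python's standard `email` package. The sketch below is an assumption-laden illustration, not the feature extractor of [13]: the function names, the five features chosen, the "link points at a raw IP address" URL characteristic, and the sample message are all hypothetical, and the full set of 23 features is not reproduced here.

```python
import re
from email import message_from_string
from email.utils import parseaddr

URL_RE = re.compile(r"https?://[^\s\"'<>]+")
# Hypothetical URL characteristic: the link host is a raw IPv4 address.
IP_HOST_RE = re.compile(r"https?://\d{1,3}(?:\.\d{1,3}){3}(?:[:/]|$)")


def domain_of(address: str) -> str:
    """Return the lower-cased text after the last '@', or '' if there is none."""
    if "@" not in address:
        return ""
    return address.rsplit("@", 1)[1].strip("<> ").lower()


def body_text(msg) -> str:
    """Concatenate the decoded text parts of a (possibly multipart) message."""
    chunks = []
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            chunks.append(payload.decode(charset, errors="replace"))
    return "\n".join(chunks)


def extract_features(raw_email: str) -> dict:
    """Build a small, illustrative feature dictionary from one raw email."""
    msg = message_from_string(raw_email)
    links = URL_RE.findall(body_text(msg))
    return {
        "sender_domain": domain_of(parseaddr(msg.get("From", ""))[1]),
        "message_id_domain": domain_of(msg.get("Message-ID", "")),
        "message_type": msg.get_content_type(),
        "num_links": len(links),
        "num_ip_links": sum(1 for url in links if IP_HOST_RE.match(url)),
    }


# Hypothetical spear-phishing message using reserved .test domains.
sample = (
    "From: Chief Executive <ceo@corp.test>\n"
    "Message-ID: <abc123@mail.unrelated.test>\n"
    "Content-Type: text/plain\n"
    "\n"
    "Please review http://files.unrelated.test/report.zip "
    "and https://10.0.0.5/login today.\n"
)
features = extract_features(sample)
print(features)
```

A feature dictionary like this would then be turned into a numeric vector and passed to a decision tree learner; a mismatch between `sender_domain` and `message_id_domain`, as in the sample, is one signal such a tree could split on.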
the graph represent the email addresses and edges correspond to the exchange of email messages that connect the nodes, with the aim that this graph would give them further helpful information in analyzing the targeted malware. However, the problem with this solution is that there could be several attack nodes without any links to other attacks, which could mean that there was not enough visibility of the recipients of the attacks, or that those could be unique attacks that warrant further investigation before concluding the existence of an APT attack.

E) Web Download: As mentioned earlier, spear-phishing emails could have malicious files attached to them that need to be opened, or they could contain links to malicious websites from which employees unknowingly download malware when they visit. Alternatively, attackers can inject malicious code into one of the websites frequently visited by the targeted employees. This latter attack technique is called the watering-hole attack.

F) Watering-Hole Attack: Unlike phishing attacks, which involve luring employees to malicious websites, watering-hole attacks involve infecting one of the websites that the target organization’s employees frequently visit. As depicted in Figure 7, the attackers use the targeted employees’ information gathered in Stage 1 and find vulnerabilities in the websites visited by them, towards injecting malicious code into one or more of the vulnerable websites. Once the targeted employee or group of employees visits the infected website, the malicious code is downloaded onto the system, giving the attackers access to the system.

After sending out emails with malicious attachments or links to websites with malicious software, attackers patiently wait for the malware to run within the organization’s network, which would open the gates to the organization’s system. The challenge for an APT attacker here is to have the malware run
[Figure 4 node labels. Root node: Undermine Organization Critical Aspects. Stages: Stage 1 Reconnaissance, Establish Foothold, Lateral Movement, Stage 4, Stage 5. Other node labels: Initial Exploitation, Initial Communication with C&C, Communicate with C&C, Penetrate Deeper into Network.]
Fig. 4: APT Attack Tree - Case When Attackers’ Goal is to Undermine Organization’s Critical Components
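The email graph approach of Lee and Lewis [14], in which nodes are email addresses and undirected edges are message exchanges, can be sketched with a plain adjacency map and a breadth-first search for connected components. This is a minimal sketch under stated assumptions: the `(sender, recipient)` input format, the function names, and the sample addresses are hypothetical, and [14] may construct and analyze its graph differently.

```python
from collections import defaultdict, deque


def build_graph(messages):
    """Build an undirected graph: one node per address, one edge per exchange."""
    graph = defaultdict(set)
    for sender, recipient in messages:
        graph[sender].add(recipient)
        graph[recipient].add(sender)
    return graph


def connected_components(graph):
    """Group addresses that are linked through any chain of messages (BFS)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical malicious emails as (sender, recipient) pairs.
messages = [
    ("a@evil.test", "bob@corp.test"),
    ("a@evil.test", "carol@corp.test"),
    ("x@other.test", "dave@corp.test"),
]
components = connected_components(build_graph(messages))
# Components with a single sender/recipient pair have no link to other attacks.
lone = [c for c in components if len(c) == 2]
print(len(components), "components;", len(lone), "unlinked attack(s)")
```

The `lone` list corresponds to the limitation noted in the text: an attack that shares no address with any other attack may reflect limited visibility of recipients or a genuinely unique attack, so it needs further investigation rather than automatic dismissal.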
without being detected by the anti-virus tools and the intrusion detection and prevention systems. Once the attackers get control of the system through the malware execution that exploits vulnerabilities in the system, they keep a low profile so as to proceed undetected to the next phase. At this point, APT attackers aim to build a Command and Control (C&C) communication channel after infiltrating the targeted network, in order to deploy subsequent attacks. Most malware makes use of the Domain Name System (DNS) to locate its domain name servers and compromised devices, so APT attackers can establish a long-term connection to victims’ devices for stealing sensitive data.

Stage 3: Once the attacker has gained access to the targeted system, he/she can spread to other systems within the target’s internal environment. The attacker uses various techniques to access other hosts from a compromised system and get access to sensitive resources. Most often, stolen legitimate credentials are used during this stage. This includes putting malware and other tools on different machines inside the compromised system components and hiding them. Sometimes this phase involves privilege escalation, and at other times it involves getting passwords of the users through key loggers. Other times, it could be through pass-the-hash techniques and/or exploitation of vulnerabilities. The chosen method depends on the environment of the target system. The goal of the attackers in this phase is to expand their foothold to other systems in search of the data that they want to exfiltrate. Therefore, once the attacker has reached this advanced stage, it is very difficult to completely push such an attacker out of the environment [7]. Table II shows some techniques and methods used to accomplish lateral movement. Hash and password dumping (credential dumping) is the process of obtaining account login and password information from the operating system and software. Credentials can be used to perform lateral movement and access restricted information. APT attackers
[Figure 5 node labels. Root node: Position For Future. Stages: Stage 1 Reconnaissance, Stage 2 Establish Foothold, Stage 3 Stay Updated with environment. Other node labels: Initial Exploitation, Initial Communication with C&C, Monitor for Changes, Communicate with C&C, Understand Defense Methods, Social Engineering, Scan Applications, Exploit Web Application Vulnerability, Send and Exploit via Malware, Reconnaissance at Site, External Network Scan, Devices (Eg: USB sticks), SpearPhishing, Web Download.]
Fig. 5: APT Attack Tree - Case When Attackers’ Goal is to Position For Future
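The relationship between the three APT goals and the five stages of the attack model can be written down as a small lookup table. The sketch below is illustrative only: the dictionary keys, the `remaining_stages` helper, and the idea of passing in a set of already observed stage numbers are assumptions made for the example, while the stage names and the goal-to-stage relationship follow the model described in this section.

```python
# Stage numbers and names of the 5-stage APT attack model.
STAGES = {
    1: "Reconnaissance",
    2: "Establish Foothold",
    3: "Lateral Movement/Stay Undetected",
    4: "Exfiltration/Impediment",
    5: "Post-Exfiltration/Post-Impediment",
}

# Stages 1 and 2 apply to every goal; "position for future" stops at Stage 3
# unless the goal later changes (hypothetical key names).
GOAL_STAGES = {
    "steal_organization_data": (1, 2, 3, 4, 5),
    "undermine_critical_aspects": (1, 2, 3, 4, 5),
    "position_for_future": (1, 2, 3),
}


def remaining_stages(goal: str, observed: set) -> list:
    """Return the names of the stages for this goal that have not been observed yet."""
    return [STAGES[s] for s in GOAL_STAGES[goal] if s not in observed]


# Example: reconnaissance and foothold activity have already been detected.
print(remaining_stages("position_for_future", {1, 2}))
print(remaining_stages("steal_organization_data", {1, 2}))
```

Such a table gives a defender a quick view of which stages could still follow for each suspected goal once early-stage activity has been detected.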
able to dump this information from a system. Currently, mimikatz is the most widely used hash and password dumping tool because it is able to dump clear-text passwords and also offers further features. Windows Credential Editor (WCE) is another tool used by APT attackers to gather valid credentials. Although there are different techniques for dumping Windows credentials, the most common method is to extract and analyze parts of the Windows Local Security Authority (LSA) process [7]. These tools are in use by both professional pen-testers and adversaries. In APT1 [1], three methods were used during the lateral movement stage: installation of new backdoors on multiple systems, usage of legitimate VPN credentials, and signing into web portals.

Stage 4: When the goal is to steal organization data, the attackers export the data they collected to their command & control server. Since most intrusion detection and prevention systems perform ingress filtering and not egress filtering, the data exfiltration could go undetected. Depending upon the organization’s defense methodologies, the attackers might intelligently split the data exfiltration into batches and send them to servers with different IP addresses. Ullah et al. in [26] summarize the latest data exfiltration incidents in 2017 as shown in Table III.

Stage 5: The goal of an APT attack is not just to perform detrimental activity, but to keep doing so until the attack has been lifted by the attack sponsor. The sponsor could choose to lift the attack once the data retrieved is found to be what is wanted, or could keep the attack active to keep getting data for as long as the attackers can. In either case, the attackers have to cover their tracks so that they leave no clue about themselves or the sponsoring entity. If there is no need for further exfiltration or impediment, any tools installed during the attack, and any logs that could give strong evidence, are removed as part of this stage.

D. Command and Control (C&C) Communication
As stated earlier, APT attackers need to have an open communication channel between their servers and victims’ machines. This is known as C&C or C2, and it is an essential component during the lifetime of APT attacks. C&C communication uses mainstream network services such as Hypertext Transfer Protocol (HTTP), HTTP Secure (HTTPS), Internet Relay Chat (IRC), Peer-to-Peer (P2P), custom protocols, and others. HTTP-based connections are preferred over others for two reasons: first, HTTP-based C&C traffic is treated as legitimate in most enterprises; second, other C&C protocols such as P2P and IRC have distinct network features, such as ports and packet content, which are easily identifiable and can be blocked [7], [27].

III. APT ATTACKS CASE STUDY
APT attacks that have become prominent over the past decade had actually been reported even before the term APT was coined in the late 2000s. However, in those early times, nation entities were the targets of such advanced and persistent threats, which later started to include different non-nation and non-governmental organizations.

A. Titan Rain
In 2003, a series of coordinated cyber attacks, later code-named Titan Rain, emerged that infiltrated several computers and networks associated with U.S. defense contractors with the goal of stealing sensitive data. These were found to continue until the end of 2015, stealing unclassified information from their targets, though no reports of stolen classified information were made. The level of deception involved and the use of multiple attack vectors marked these attacks as the first of their kind.

B. Hydraq
One of the first APT attacks on commercial companies to draw great attention was Hydraq, the name used for the Trojan that establishes the backdoor; the attack is better known under its original name, ‘Operation Aurora’. This coordinated attack involved several malware components that were encrypted in multiple layers to stay undetected for as long as possible. The attack, found to have been launched in 2009, targeted different organization sectors, Google being one of them and the first one to announce it, followed by Adobe. The name ‘Aurora’ came from references in the malware that were injected during its compilation on the attackers’ machine. The malware was found to use a zero-day exploit in Internet Explorer (CVE-2010-0249 and MS10-002) [28] to establish a foothold on the system. When users visited the malicious site, Internet Explorer was exploited to download several malware components. One of the malware components established a backdoor to the machine, allowing attackers to get onto the organization’s network as and when needed. In some of the earlier cases, the malware exploited a vulnerability in the Adobe Reader and Acrobat applications (CVE-2009-1862) to establish a foothold in a few companies. Unlike the earlier instances, the later instances of this malware were found to no longer use the zero-day vulnerabilities. Nevertheless, the attacks continued for several months afterwards, in different countries across the globe, under different variants of the Trojan Hydraq. The common aspect of the Trojan is that the malware initially gathers system and network information, and then collects usernames and passwords into a file that is later sent to its command and control center, whose IP address or domain name is hard-coded within the malware.

C. Stuxnet
In 2009, a sophisticated worm that spreads itself to other components in the entity, with the goal of impeding Iran’s uranium nuclear project, was launched. At first, this malware was found to exploit a zero-day vulnerability found in the LNK file of Windows Explorer. Microsoft named this malware Stuxnet, from a combination of file names found in the malicious code (.stub and MrxNet.sys), after this zero-day vulnerability was reported to it. However, it was later found out that in addition to the LNK vulnerability, a vulnerability in printer
spooler of Windows computers was used to spread across machines that shared a printer. The malware then used 2 vulnerabilities in the Windows keyboard file and Task Scheduler file to gain full control of the machine by performing privilege escalation. In addition, it used a hard-coded password within the Siemens Step7 software to infect database servers running Step7 and, from there, other machines connected to them. After the malware first enters a system, it sends the internal IP and the public IP of that system, along with the computer name, the operating system, and whether Siemens Step7 software is installed on that machine, to one of its 2 command and control centers running in 2 different countries. Through these command and control centers, the attackers either let the malware infect the system or updated the malware with new functionality. It was soon found out that Stuxnet was way beyond control, with several computers in different countries being infected with this malware. Two of the zero-days used in Stuxnet were not new; they had been exploited earlier by other, smaller malware, though this was not discovered at that time. After security researchers across the globe had dug into Stuxnet for several months, it was found that this malware was way beyond what it looked like and that it actually sends commands to programmable logic controllers, targeted to impede Iran’s uranium nuclear project. Several reports were published by researchers and firms across the world, with more or less conflicting information on the detailed execution of Stuxnet, as in [29] and [30]. However, they all agree that Stuxnet was found to be like never before, a havoc that a digital code could create in the physical world. It was not just about 4 zero-day vulnerabilities, 2 stolen certificates, and 2 command and control centers; it was more than that: a cleverly crafted, layered piece of malware that could be tweaked by the attackers through the command and control centers using over 400 items in its configuration file. The end date of Stuxnet was found to be in 2012, 3 years after it was unleashed. Although Iran discovered this 500 KB malware in its Natanz plant in 2010, amidst all the havoc of Stuxnet, some of its centrifuges had already been damaged, slowing down its nuclear weapon generation process.

D. RSA SecurID Attack
In 2011, RSA, a security division of EMC Software, announced a sophisticated cyber-attack on its systems that involved the compromise of information associated with its SecurID, a 2-factor authentication token product. This is another attack that infiltrated an organization’s network through phishing emails sent to the organization’s employees. As part of this attack, the attackers sent 2 different phishing emails to different groups of employees with an Excel sheet attached. The phishing emails went into the junk folder on the employees’ end; however, they were crafted well enough that an employee opened the attached Excel sheet. When opened, this Excel sheet exploited a zero-day vulnerability (CVE-2011-0609) in Adobe Flash Player to install a backdoor. When the employee opened the aforementioned attachment, the backdoor got installed
onto the employee’s system. This installed backdoor was found to be a variant of a well-known remote administration tool that the attackers could now use to remotely access the employee’s machine. With this remote access in place, the attackers started harvesting the credentials of several employees in an effort to reach the target system, where they performed privilege escalations, stole data and files, and compressed and encrypted them before sending them to their remote command and control center via FTP. RSA detected this exfiltration, but not before some of the data had been exfiltrated.

APT attacks. Though these APTs started in nation-state sectors, it did not take much time for the attackers to extend their scope to non-governmental and commercial sectors with the goal of stealing corporate data, posing the biggest threat to any company with data as its biggest asset. However, the more recent advanced persistent threats show that organizations with assets other than data, such as finance organizations where money is the major asset, are also facing these threats. The Carbanak attack discussed in our case study is one such example.
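The Stage 4 discussion and the RSA case both point at the egress side: data leaves in batches, possibly to several destination addresses, and is noticed late. A minimal sketch of one way to look for this is given below, assuming outbound flow records are already available. The `Flow` schema, the function name `flag_egress_outliers`, and the fixed byte threshold are hypothetical choices made for illustration; they are not taken from any of the surveyed works.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class Flow(NamedTuple):
    """One outbound flow record (hypothetical schema for illustration)."""
    src_host: str   # internal host that sent the data
    dst_ip: str     # external destination address
    bytes_out: int  # bytes sent from src_host to dst_ip in this flow


def flag_egress_outliers(
    flows: Iterable[Flow], byte_threshold: int
) -> List[Tuple[str, int, int]]:
    """Return (host, total_bytes, distinct_destinations) for every internal
    host whose summed outbound volume exceeds byte_threshold.

    Summing per source host, rather than per destination, means that
    splitting an upload into batches across several external addresses
    does not keep the host under the threshold.
    """
    totals: Dict[str, int] = defaultdict(int)
    destinations: Dict[str, Set[str]] = defaultdict(set)
    for flow in flows:
        totals[flow.src_host] += flow.bytes_out
        destinations[flow.src_host].add(flow.dst_ip)

    flagged = [
        (host, total, len(destinations[host]))
        for host, total in totals.items()
        if total > byte_threshold
    ]
    # Largest senders first, so an analyst sees the heaviest hosts on top.
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

A caller would pass the flows of one observation window, for example `flag_egress_outliers(window_flows, 500_000_000)`. A fixed threshold is the simplest possible choice; a per-host baseline would be needed in practice to avoid flagging hosts that legitimately upload large volumes.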
and Mitigation Methods. Each category or class can be sub-categorized into different categories as shown in Figure 8. In the following subsections, each of these classes will be explained in detail.

A. Monitoring Methods
One of the first and most basic steps in defending against APT is to start monitoring the entire network system at multiple points and multiple levels, leaving no entry point unmonitored.

1) Disk Monitoring: Every end system that is part of the organization’s network needs to be monitored for any malicious behavior through anti-virus, firewall, or content-filtering as necessary. Applying patches as necessary to the software running on the system will help minimize the entry points for an attacker by removing known vulnerabilities that could otherwise spread malware to vulnerable systems within the network. In addition, monitoring CPU usage for each of these end systems within the network will help in identifying any suspicious behavior at the end system level.

2) Memory Monitoring: One of the ways malware can evade detection is by running within the memory of the end system rather than from a file. This so-called fileless malware uses a process that is already running in memory to execute itself. As there is no separate process running in the background, it leaves no trace except the unexpected memory usage by a process, which can be identified if monitored. Duqu 2.0, which infested Kaspersky Labs in 2015 [33], ran within the memory of an already running process, and thus bypassed the verification of the caller process that usually happens on their systems.
As each day passes, new and sophisticated malware comes into existence. The authors of the paper [34] have portrayed the characteristics of different types of malware, and proposed a solution, ‘Panorama’, that will detect these different types of malware. However, their proposed solution involves gathering malware and benign samples as training data and extracting taint graphs from them. They then transform each taint graph into a feature vector upon which standard classification algorithms are applied to determine a model. This model is then used to identify malicious behavior on a system. Their proposed solution is based on the common characteristics of the different types of malware such as key loggers, password thieves, stealth backdoors, etc. The characteristics commonly exhibited by these malware are anomalous information access and processing behavior. For instance, keyloggers and password thieves intercept the keystroke inputs. In order to stay undetected, stealth backdoors, as observed by the authors, either use an uncommon protocol such as ICMP, create a raw socket, or intercept the network stack to communicate with remote adversaries. ICMP-based stealth backdoors access ICMP traffic; raw-socket based stealth backdoors access all the packets with the same protocol number. For example, a TCP raw socket receives all TCP packets. The stealth backdoors intercepting the network stack behave like network sniffers, which eavesdrop on the network traffic to obtain valuable information.
In [35], Virvilis and Gritzalis focused on APT attacks through malware such as Stuxnet, Duqu, Flame, and Red October. They discussed the issues that enabled the malware authors to evade detection by a wide range of security solutions and proposed countermeasures for strengthening our defenses against similar threats. The paper goes over the evasion techniques that APT attackers would use, such as rootkit functionality, endpoint scanning with changed payload, encryption and obfuscation of network traffic, steganography, execution of malware in memory, and fake digital certificates. The authors recommend patch management, strong network access controls and monitoring, strict Internet policies, protocol-aware security solutions, monitoring DNS queries, monitoring for access to unusual domains, monitoring network connections, honeypots and honeynets, along with the standard host-based intrusion prevention systems, as countermeasures for APT.
Korkin and Nesterow [36] presented a novel approach that detects zero-day malware in a memory dump under deliberate countermeasures. The proposed method uses a modern graphics card or CUDA-enabled GPU hardware to detect malware in memory. In this paper, the authors discussed the stealthiest malware in existence and ways to detect this malware, which takes the form of hidden drivers, and finally proposed an architecture for a software tool that uses CUDA-enabled GPU hardware to speed up memory forensics.
In [37], Xu et al. proposed a hardware-assisted malware detection solution that uses machine learning to monitor and classify memory access patterns. Unlike one model that distinguishes all types of malicious activity in 25, this solution is based off one model for each application separating its
malware infected executions from legitimate executions. Their work is based on the fact that an infected application run will modify the control-flow/data structures compared to a benign run. This will be reflected in its memory access pattern. They achieve this through in-processor monitoring of the memory accesses that looks at the virtual addresses for a more consistent signature. They used epoch markers (system calls, function calls, and the complete program run) to detect the malicious behavior of an infected program. Their solution covers both user-level and kernel-level threats and demonstrated very high detection accuracy against kernel-level rootkits.
Vaas and Happa in [38] have proposed a solution to identify disguised processes. Their solution involves training a machine learning algorithm to identify anomalous behavior of a machine’s processes on a per-application basis. Their approach is structured into 3 phases: acquisition phase, learning phase, and production phase. Their approach identifies the anomalous behavior of a process of an application through its virtual memory consumption for two reasons. Firstly, they believe memory utilization is less volatile in comparison to network operations or CPU usage readings. Secondly, they use virtual rather than physical memory consumption, as the latter does not account for the amount of memory swapped to the hard drive. In the first phase, the acquisition phase, the memory fingerprint of a target machine is gathered using process and system utilities (psutil). In the second phase, the learning phase, the machine learning algorithm computes a model for every application based on the fingerprint, a threshold, and a threshold factor to detect anomalous behavior. If the distance of the current print with respect to the model print is 0, then the verification can be terminated. If not, the distance is checked to see if it exceeds the threshold modified by a multiplicative constant factor. If so, they increase a counter that is maintained on a per-application basis, and serves as a buffer that is regularly checked to raise an alarm in case the counter exceeds a given threshold.

3) Packet Monitoring: The most crucial part of an APT attack is the communication with the Command and Control Center (C&C). Communication with C&C happens not just once, but often multiple times, usually the first time the system is compromised and repeatedly later for data transfer. Monitoring at the end system level for any network packets with new destination IP addresses, packets with huge payloads, and large numbers of packets sent to the same IP address would help in identifying any suspicious behavior from within an end system.
Marchetti et al. in [39] proposed a framework that can detect, out of thousands of hosts, a few hosts that show suspicious activities. They do not claim to identify the hosts that are surely compromised. They argue that their solution will help analysts to focus on a limited number of hosts rather than thousands of hosts in removing APT from the system. Their solution provides a ranked list of top-k suspicious hosts generated by observing the key phases of APT across several hosts over time and comparing the analysis results of each host with its past and with other hosts of the observed network. Their solution works even for encrypted communications, as the payload is not inspected. In addition, the authors claim that this solution is scalable, as most analyses can be executed in parallel, thus giving an efficient solution. The proposed framework involves flow collection and storage, feature extraction, feature normalization, computation of suspiciousness scores, and ranking. Traffic going from internal hosts to the outside is monitored, as the framework assumes that an APT attacker would initiate communication from internal to external rather than from external to internal to evade detection by traditional intrusion detection systems. And then features are computed
for each internal host every time interval T, and then these computed features are extracted for further analysis.
McCuskr et al. [40] focus on the notion of tracking various network objects such as hosts, hostgroups, and networks, and determining if they are threats. The overall system layers the network flow activities into five layers, from network flow collection to threat analysis. Events and data are collected from a number of different network sensors such as network flow, NIDS, and honeypots, and then features can be extracted and aggregated over multiple periods to create a sample space. They designed three layers to focus on the use of discriminative supervised/semi-supervised models to identify behavior primitives.
In [41], Villeneuve and Bennett claim that monitoring and analyzing network traffic help in detecting APT activities. They analyze different APT campaigns such as Taidoor, IXESHE, Enfal, and Sykipot, which have been used to establish targeted attacks. These malware establish communication with a C&C server using known protocols such as HTTP, usually configured through three ports: 80, 443, and 8080. Attackers usually use these ports because they know that often only these ports are open at the firewall level. However, attackers may use these ports to pass a mismatched traffic type, such as sending non-HTTP traffic on port 80 or non-HTTPS traffic on port 443. This can trigger an alert for further investigation. Monitoring the timing and size of network traffic is another aspect to consider for APT detection. This is due to the fact that malware typically sends beacons, which are basically communication packets, to C&C servers at given intervals. Thus, monitoring for consistent intervals using DNS requests or URLs will help. Although such malware uses HTTP for C&C communication, it usually sends requests using Application Programming Interfaces (APIs). Therefore, analyzing HTTP headers can help to distinguish API calls from typical browsing activities.
Vance in [42] proposed a solution that utilizes flow-based analysis to detect targeted attacks by determining normal versus abnormal behavior. Unlike typical network-based detection, in flow-based analysis network traffic is aggregated, so the amount of data to be analyzed is reduced. Traffic is analyzed based on the volume of transferred data, timing, or packet size, and the result is a high detection rate with low false positives.
Fu et al. in [43] have discussed APT attacks coupled with insider threats as a 2-layer game, characterizing the joint threats from APT attackers and insiders as a defense/attack game between the defender and the APT attacker(s) and an information trading game among the insiders. The authors of this paper claim to identify the best response strategies for each player, and prove the existence of Nash Equilibrium for both games.

4) Code Monitoring: Creating software completely free of bugs is a mirage. No software that is developed and no code that is released can be guaranteed to be error-free. While making the code itself error-free is quite difficult, making sure that it is error-free when running in different environments is not possible. These bugs are the means for attackers to penetrate systems. While some of them could be known prior to the code release, there is always a possibility of unknown bugs. The possible vulnerabilities in the source code can be identified by static analysis techniques such as Taint Analysis and Data Flow analysis. In addition, monitoring the code during its execution for its performance and to make sure it runs within its scope, neither utilizing unexpected resources nor using memory regions that are otherwise not accessible, would lead to identifying a threat much earlier, before it can spread to other systems.

5) Log Monitoring: Logs are not only an important part of forensic analysis; when used appropriately, they can also help in detecting or even preventing attacks in their early stages. Correlating logs such as memory usage logs, CPU usage logs, application execution logs, and system logs yields a copious amount of information that makes sense as a whole and helps in defending systems or networks against unknown attacks, rather than just having individual logs that often end up in a pile to be searched afterward for evidence of an attack. One paper that correlates data collected from different types of logs is [44]. Bohara and others in this paper have proposed an intrusion detection approach that combines network and host logs to find any malicious activity. From these logs, they extract 4 kinds of features (identification, network traffic based, service based, and authentication based features), which are further refined to reduce redundancies through the use of the Pearson Correlation Coefficient, following which those that do not contribute to clustering are removed. The resulting data is clustered to identify the malicious activity. Their proposed solution takes the approach of unsupervised learning to detect anomalies without any profiling of the normal behavior of the system.
Shalaginov et al. in [45] analyzed DNS logs to identify the communication packets ("beacon" activities) between infected internal hosts and external malicious domain names. Basically, they believe that downloaded malware, as a foothold, will require opening an external communications channel to the Command and Control (C&C) server. This behaviour will leave a record of itself in network flow and DNS logs. The authors proposed a methodology for DNS log analysis and event correlation by considering low latency interval time, where they assumed the infected hosts will communicate with the C&C server several times per day. After identifying an infected host, they link other hosts that have communicated with the same suspicious domains. They pre-process the DNS logs to filter unwanted data, and only obtain IPv4 addresses from DNS logs. Then, they represent the meta-data as a graph where the graph’s vertices represent host IP addresses and domain names, while each edge corresponds to one query from an internal host to an external machine. The proposed methodology was evaluated using real DNS logs collected by Los Alamos National Laboratory published in 2013.
One of the challenges in log monitoring is that there is so much data to look at and analyze to detect an attack. In [46], the authors proposed an approach to address this problem by extracting information and knowledge from the dirty logs. The proposed approach involves 3 layers, first layer filters and
normalizes the log data using network configuration. Second layer processes this normalized data into different features. Third layer performs clustering over these extracted features to determine any suspicious activity. Beehive, the name of this proposed solution, uses logs from different sources such as web proxy logs, DHCP server logs, VPN server’s remote connection logs, authentication attempt logs, and anti-virus scan logs. The solution then proceeds with extracting features based on destination, host, policies, and traffic, following which the features are clustered through an adapted k-means clustering algorithm to identify hosts whose behavior differs significantly from normal.
Bhatt et al. in [47] discussed the kill chain attack model, and proposed a solution that works for a layered architecture, with the outer layer having the least valuable assets and the inner layer having the most valuable assets. Given this architecture, the attacker is assumed to perform, at least once, all the different stages of the attack model in order to get past a layer. Each layer can be accessed through the processes and applications running within the immediate outer layer. The solution requires that the probability of finding common vulnerabilities among different layers is very low, so that the possibility of reusing knowledge about the vulnerabilities of one layer to attack another layer is minimized. The framework suggested by the authors detects attacks only with appropriate sensors that detect different stages of an APT attack at each layer. These sensors would be triggered by rules created with respect to the patterns of malicious behavior. Alerts and logs collected by these sensors should be stored and correlated to identify stages and phases of attacks in progress.
Niu et al. in [48] proposed an approach to detect APT malware and C&C communication activities through DNS log analysis. They evaluate their approach using DNS logs of mobile devices. Their approach assigns scores to C&C domains and normal domains. To distinguish between normal and abnormal (C&C) domains, they select normal domains according to the number of DNS requests initiated by internal devices and extract fifteen features, which are categorized under four general categories: DNS request and answer-based features, domain-based features, time-based features, and whois-based features.

B. APT Detection Methods
We classify the techniques for detecting APT into the following groups: anomaly based detection and detection by pattern matching.

1) Anomaly Detection: One of the key characteristics of an advanced persistent threat is that it adapts to the defender’s efforts to resist it. To defend against such a threat, the defense methods employed need to learn about and adapt to the offenders’ attempts. These methods should consist of collecting data from several sources, learning from the collected data, and making predictions on the collected data to estimate and respond to the next possible attack.
Table V shows the attack methods and the corresponding existing defense techniques or countermeasures for each APT stage. However, these techniques are static, and thus APT attackers will find ways to evade these defense methods. For instance, they would create new malware or change existing malware to have new signatures for the purpose of evading detection by the organization’s anti-malware tool. Furthermore, they could have this malware behave as normally as possible without raising any alarm at its behavior on a system. Here arises the need to employ different categories of anomalies, such as point, contextual, and collective anomalies, and different anomaly detection methods that could detect these close-to-normal behaviors through these anomalies.
Extensive research has been done in anomaly detection techniques and methodologies over the past decade. Some of the early works that give a good understanding of anomaly detection in general are discussed in this section, followed by works that have used those anomaly detection techniques to detect different stages of APT attacks.

a) Approaches and Methods: Hodge and Austin in [49] have surveyed different outlier (anomaly) detection approaches and methodologies. They classified outlier detection approaches into three types. The first is unsupervised clustering, an approach that processes the data as a static distribution, pinpoints the most remote points, and flags them as potential outliers. The second is supervised classification, which requires pre-labeled data, tagged as normal or abnormal. The third type is semi-supervised recognition or detection, which takes pre-classified data but models only normality (or, very rarely, abnormality). As new data arrives, the model is tuned to improve the outlier detection rate by defining a boundary of normality. Unlike supervised classification, this does not require any training data for abnormality and yet learns to recognize abnormality. The authors further classified the anomaly detection methodologies into statistical anomaly detection, neural network based anomaly detection, and machine learning based anomaly detection, explained how each of these methodologies handles outliers, and made recommendations as to when they are appropriate for the previously defined approaches.
The authors of [50] have provided an overview of the extensive research done in network intrusion detection systems, specifically in network anomaly based detection approaches. They have given a qualitative survey of the different methods, systems, tools, and analysis pertaining to network anomaly detection. In addition, they have covered a wide variety of attacks, focusing on their sources and characteristics while comparing and giving performance metrics for various detection approaches.
In [51], Chandola et al. gave a broad overview of anomaly detection techniques and how they are applicable to different research and application domains, along with the challenges associated with each of those techniques. The authors discussed different aspects of anomaly detection such as the nature of the input data, type of anomaly, available data labels, and output of anomaly detection. They point out that the nature of the attributes of the input data determines the applicability of anomaly detection techniques. In addition to the anomaly detection techniques classified and discussed by Hodge and
Austin in [49], Chandola et al. discussed two other anomaly detection techniques: information theoretic anomaly detection techniques and spectral anomaly detection techniques. Both can operate in unsupervised settings; the former makes no assumptions about the underlying statistical distribution of the data, while the latter automatically performs dimensionality reduction, making it suitable for handling high-dimensional data sets.

Later, Chandola et al. [52] discussed anomaly detection from a different perspective. They discussed the problems associated with detecting anomalies in discrete sequences and various techniques that address these problems. They classified sequence anomaly detection into sequence-based, contiguous sub-sequence-based, and pattern frequency-based anomaly detection. The sequence-based anomaly detection approach is the basis for machine learning based anomaly detection, where sequences of training and/or test data are used to identify anomalies. The contiguous sub-sequence anomaly detection approach can be closely related to the detection of an end system’s behavior in case of a malware download. Lastly, they explained the pattern frequency-based anomaly detection approach as one in which the frequency of the sequences is higher than normal, as in the case of failed login attempts.

Garcia-Teodoro et al. [53] classified and reviewed several anomaly-based network intrusion detection techniques while presenting the challenges to be addressed. Their discussion included statistical-based, knowledge-based, and machine-learning-based network intrusion detection techniques. They go over the need for anomaly-based detection techniques, and why a common signature-based approach brings two major drawbacks: pre-defined rules are often insufficient to detect unique or tailored attacks, and there is a lack of rules that verify application-specific operation sequences.

Mehmood et al. [54] summarized anomaly detection based systems that can be implemented using machine learning techniques. They showed the most widely used techniques, such as: (1) Support Vector Machine (SVM), which classifies normalized data via appropriate kernels to divide data into two categories, resulting in anticipation between different datasets; (2) Fuzzy Logic (FL), which uses true or false to detect anomalous behavior; (3) Genetic Algorithm (GA), which builds mutation and crossover genomes, from existing or new genes, using heuristic search; (4) K-means, which classifies data into different clusters where each cluster presents the average of data based on provided means; (5) Artificial Neural Network (ANN), which accepts different inputs and transforms them until the required output is achieved; (6) Association Rule, which looks for correlations between different sources of data (datasets) and how they can be applicable to cloud computing.

Zhang et al. [55] designed an interactive system to bridge the gap between network management and anomaly detection. They designed a web-based visualization tool for analyzing network and system anomalies within system logs. Their tool offers different views such as network graph, treemap, area chart, and general view. It also provides search based on different options, such as searching by source/destination IP addresses. The dataset contains common network traffic logs such as network flow data and intrusion detection/prevention system (IDS/IPS) log files, as well as network health and status data for every workstation and server, such as CPU, memory, and disk usage. The tool essentially observes trends in system activities in the form of peaks that show a source or a destination receiving or generating a high volume of traffic.

b) Application to APT Detection: Anomaly detection will greatly benefit a defense system, particularly when detecting an APT attack that is spread over several years, making damage reversal quite difficult. With new malware variants being released every day, and existing technologies such as rule-based analysis requiring skilled analysts to analyze the behavior of the malware and design rule-based solutions to predict similar behaviors in the future, a gap opens up between discovery and protection, giving attackers enough time to penetrate an organization’s network. The earlier APT attacks are detected, the better the state of the organization. By automating this analysis and detection through the above anomaly detection approaches and methods that continuously monitor, learn, train, and update learning models, not only can this gap be closed, but even minor changes that are difficult for human analysts to observe can be detected. Learning techniques such as Perceptrons, Neural Networks, Centroids, Binary Decision Trees, Deep Learning, etc. can help process millions of data points every minute to establish normal behavior, compare data points to past behavior, and identify anomalous differences in values.

However, to detect APT, a single anomaly and a single anomaly detection technique or method will not suffice. For instance, to detect abnormal memory usage by a process on a system, the detection system needs to know the history of memory usage by the same process, requiring either a semi-supervised or a supervised approach to identify contextual anomalies. But to correlate and find similar or anomalous behaviors among several processes within the system and across the network of systems in the organization network, an
unsupervised clustering approach towards identifying collective anomalies would be needed. However, one of the problems with anomaly detection is the number of false positives and false negatives, specifically in the case of semi-supervised and unsupervised learning methods. The reason for this is often the lack of a clear distinction between normal and abnormal data. Further, the behavior of the users and the systems is not always the same, thus requiring continuous learning and an incremental model updating approach.

Nath et al. in [56] evaluated four machine learning methods that have been used to statically analyze malware and concluded that none of them suffices for defending against attacks involving multiple stages or multiple files executed in parallel or sequentially, such as APT attacks.

In Table VI we provide a high-level comparison between different anomaly-based APT attack defense methods along with their learning methods (supervised, unsupervised, and semi-supervised) and detection techniques (statistical, and machine learning based techniques which include rule-based and neural network approaches).

In [57], Kim et al. used rule-based anomaly detection to detect APT attacks. Their proposed approach involves two stages. In the first stage, behavior rule generation, they used machine learning and decision trees based on statistical data to generate behavior rules. In the second stage, abnormal behavior detection, they generate the feature description using MapReduce based on big data and compare it against the behavior rules obtained in the first stage to determine if the behavior of a host is abnormal.

In [58], Zhao et al. proposed a system to detect APT malware infections. Their solution relies on the fact that APT malware uses Dynamic DNS to locate the C&C server in order to communicate its success in establishing a foothold. Their solution involves two phases: detection of malicious C&C domains, followed by analysis of the associated IPs for any suspicious and malicious traffic. The authors used the J48 decision tree algorithm as a malicious DNS detector along with signature-based detection and anomaly-based detection components. The results from these three components are passed to their reputation engine to obtain high accuracy in identifying APT malware infections.

Friedberg et al. [59] reviewed several works on anomaly-based detection techniques and proposed a novel approach that learns the normal behavior of a system over time and reports all actions that differ from the created system model. This, they claim, is in contrast to several other solutions that use a blacklist kind of approach to detect an intrusion. Their proposed scheme uses log data produced by various systems and components in ICT networks, from which their solution extracts a system model that is used to detect and distinguish meaningful logs through event classes that contain implications between the events. The rules thus obtained describe the relations among different components in the network. This model is automatically and continuously generated to detect anomalies that are a consequence of realistic APT attacks.

Cappers and Wijk [60] proposed an approach to find the presence of APTs in the network by using machine learning techniques for contextually analyzing network traffic alerts by splitting them into messages and attributes. However, this approach has some limitations in terms of scalability when considering attributes with many different values. The interaction is also sensitive to the number of attributes: many attributes will break the interaction, whereas too few attributes will increase the risk of missing potential alert correlations.

Yuan in [61] presented a preliminary study on using a deep learning-based technique for malware detection. The author believes that conventional machine learning algorithms such as SVM, decision trees, and K-NN cannot help efficiently due to the high false positive rates these algorithms generate. He gives two reasons: first, current malware and software are complex and diverse, which means conventional ML models cannot capture enough features during the learning phase; second, available datasets can be limited or outdated. The preliminary results in this paper show that the deep learning model outperforms conventional ML models such as random forest, isolation forest, AdaBoost, and eXtreme Gradient Boosting in terms of accuracy. However, the model runs much slower than the conventional models.

Siddiqui et al. in [62] also point out the high false positives obtained with traditional machine learning algorithms and proposed a fractal-based anomaly classification algorithm to reduce both false positives and false negatives. They use the K-Nearest Neighbor (K-NN) machine learning algorithm and a data set that combines two different data sources to cover both APT traffic and non-malicious traffic. They collected APT data sets from the Contagio malware database [67] and normal, non-malicious data from [68], and tested the combined dataset using traditional supervised learning (i.e., K-NN) and a correlation-based fractal dimension approach. They reported better performance for their approach in terms of reduced false positives and false negatives, based on the fact that correlation using fractal dimension can extract multiscale hidden information.

To detect security breaches that are designed for a specific target, Cappers and Wijk in [63] claim that deep packet inspection (DPI) and anomaly detection are indispensable. The authors proposed an approach for network traffic analysis that combines visualization and machine learning techniques, allowing system administrators to inspect and compare specific parts of the network traffic while preserving context. The proposed approach supports iterative refinement of classifier parameters based on new findings inside alert messages (payload inspection). It uses pixel visualization to display the full structure of a network message as a horizontal line of pixels and to reduce false positives. Unfortunately, this approach focuses on monitoring traffic; thus, it can only inspect small fractions of traffic at the same time, which makes it hard to detect threats and malicious traffic over larger periods of time.

Dewan et al. in [64] proposed an approach to distinguish between spear phishing and non-spear phishing emails. They extracted features from spear phishing emails that have been sent to employees of 14 international organizations, by
using social features extracted from LinkedIn. The authors performed their study on a dataset collected by Symantec’s enterprise email scanning service. The authors defined nine features extracted from the LinkedIn profile of the phishing email’s recipient, as well as other features extracted from the emails. However, they found that the classifiers performed slightly worse with the feature set that includes social features. This is due to the limited amount of information that can be gathered from LinkedIn.

Hsieh et al. in [65] proposed a framework for detecting APTs through monitoring Active Directory log data. The proposed framework takes Active Directory logs as time-series input and mines the sequential contexts from the collected logs. It then builds a probabilistic Markov model to detect different behaviors as they occur (anomaly detection). In general, the proposed framework looks for changes in a user’s behavior over time by analyzing his/her accounts’ log data. However, their Markov model achieves a best performance of only about 66% recall rate or accuracy. This suggests that anomaly detection based on analyzing Active Directory logs may be limited by the information those logs contain. The authors suggest that Active Directory logs can be combined with other logs or context to enhance the accuracy of anomaly detection.

Marchetti et al. in [66] argued that traditional defensive solutions such as signature-based detection systems and anti-viruses can only detect standard malware and are ineffective against APTs. To address APT-related threats, they propose a new framework called AUSPEX to support human analysts in detecting and prioritizing weak signals related to APT activities. The proposed framework combines different techniques based on big data analytics and security intelligence. It gathers and combines information from different sources: internal information from network probes located in an organization, and external information from the web, social networks, and blacklists. Using network flow logs and access information, they focus on three major stages of APTs (foothold, lateral movement, and data exfiltration) and prioritize all internal clients that show suspicious activities.

Email spam has been known as a major method attackers use to launch APT. However, machine learning can help to learn valuable features from previous spam. Emails contain text fingerprints, URLs, phone numbers, images, attachments, etc., and these can be used to train a classifier to identify similar spam.

In APT, malware can hide in a multiple-layered proxy network. For example, attackers can keep changing malicious URLs every couple of minutes, and thus blacklisting or whitelisting does not help in preventing users from visiting malicious URLs. However, machine learning can help in extracting the features of those URLs and classifying future URLs as normal or malicious.

Table VII summarizes the role of AI/ML techniques in defeating APTs at different APT stages. It also shows the challenges faced by each applicable AI/ML technique. The following examples can be matched to different APT stages and show how they can be detected using AI/ML techniques:

A) Spear phishing: Supervised machine learning can help to learn valuable features from previous spam. Since emails usually contain text fingerprints, URLs, phone numbers, images, attachments, etc., it is possible to train a classifier on those contents and their features to predict similar spear phishing emails.

B) Malicious DNS domains: An example is continuously changing the IP address of a URL. This can be detected by checking the DNS log file to find whether the URL has been linked to previous IP addresses. Further information can then be gathered to detect the number of domains that share the same IP address or addresses. In APT, malware can hide in a multiple-layered proxy network. For example, attackers can keep changing malicious URLs every couple of minutes, so blacklisting and whitelisting will not suffice to prevent users from visiting malicious URLs. Neural networks have the ability to solve this issue through back-propagation and continuous learning. Moreover, using unsupervised machine learning to learn the features
of URLs and then classify new URLs into either good or bad classes. Here, we can see that deploying both supervised and unsupervised ML techniques can increase the chance of detecting APTs within the second stage (accomplishing a foothold) much better than applying only blacklisting methods.

In addition, clustering URLs or domains can identify DGAs (domain generation algorithms), which have been used by malware creators to generate domains that act as rendezvous points with the command and control servers. This can contribute to detecting command and control communications.

C) User Profiling: An example is profiling the set of machines that each user logs into in order to find anomalous access patterns. Here we can use clustering techniques to profile different users and their expansions. From this, the system administrator can identify whether there was a privilege elevation. Therefore, deploying unsupervised ML techniques (clustering) can result in detecting one of the common APT stages, which is the attack expansion.

D) Moving Data Monitoring: An example is applying deep data analysis to moved data content, such as the size of a moved file. For example, if a user moves more than 1GB of traffic within a limited time, but he/she usually does not move more than 100MB in that time, an alert should be triggered. Using supervised ML techniques, we can detect this type of abnormal activity. Thus, the role of supervised ML techniques here is to stop one of the major stages of APT, which is data exfiltration.

E) Anomalous behavior: An example is detecting whether the CFO’s computer makes an unexpected financial transaction based on the transaction’s time, destination, etc. This can be achieved by deploying a supervised ML model to learn the normal behavior within an organization. Thus, if a transaction does not match a known pattern, an alert should be triggered.

2) Pattern Matching: Pattern matching is an old technique that regular intrusion detection and prevention systems employ. However, this technique has its own advantages. By observing patterns in the behavior of a process and/or application, malicious behavior can be detected. Yan et al. in [69] proposed an approach to detect APT using structured intrusion detection. Their approach is based on high-level structured information captured in time series of network traffic. The Helix model [70], which was originally introduced as a Natural Language Processing (NLP) model for behavior recognition in mobile sensing problems, was utilized in their approach.

Giura and Wang [6] proposed a model of the APT detection problem as well as a methodology to implement it on a generic organization network. Their solution considers three types of events: candidate events, all events that are recorded by an organization’s logging mechanisms in any form; suspicious events, events reported by security mechanisms as suspicious, or events associated with abnormal or unexpected activity; and attack events, events that traditional security systems aim to detect with regard to a specific attack activity. Events are correlated using context and correlation rules, and are then filtered through detection rules to obtain a set of possible threats. They then used risk level and confidence indicators to evaluate the threats to the attack goal. An APT incident is detected when the confidence indicator and the risk level of observed events go beyond specific thresholds, which are parameters specific to an organization’s environment.

C. Mitigation Methods

The mitigation techniques can be broadly classified into Reactive Methods and Proactive Methods.

1) Reactive Methods: Reactive methods identify possible attack scenarios based on vulnerabilities currently present in the system and perform an analysis of possible paths the attacker is likely to take to perform a multi-hop attack (one of the characteristics of APT).

A) Graph Analysis: Graph analysis is a field noted for its ability to support the analysis of complex networks and the identification of sophisticated attacks. Johnson et al. in [71] proposed a novel approach to measure the vulnerability of a cyber network through graph analysis. Their solution specifically detects those attacks that involve lateral movement and privilege escalation using pass-the-hash (PTH) techniques to achieve the attack goal. The attack is detected by the use of a simple metric that measures, with a graph, how likely a node is to be reached from another arbitrary node, potentially making the network vulnerable. This metric is dynamically calculated from the authentication layers during the network security authorization phase and will enable predictable deterrence against attacks such as PTH.

Attack graphs have been used as a modeling tool for the study of multi-hop attacks in a network. An attack graph can be represented as G = {N, E}.

• The nodes can be expressed as N = {N_f ∪ N_c ∪ N_d ∪ N_r}. N_f represents the fact node, e.g., access control list information hacl(VM_1, 80, VM_2, 5000). This means VM_1 and VM_2 can communicate via ports 80 and 5000. N_c represents the exploit node, e.g., execCode(VM_1, apache, user), which means on apache
web server an attacker can execute code with user privilege. N_d denotes the privilege level, e.g., (root, VM_1), and N_r depicts the goal node, e.g., (root, DatabaseServer), i.e., gaining root privilege on the database server.

• Edges can be represented by the union of edges with pre-conditions and post-conditions of the exploit, E = {E_pre ∪ E_post}. Here E_pre ⊆ (N_f ∪ N_c) × (N_d ∪ N_r), which means N_c and N_f must be met in order to achieve N_d. E_post ⊆ (N_d ∪ N_r) × (N_f ∪ N_c). This means that condition N_d is achieved on satisfaction of N_f and N_c.

An attack graph can be used to study the attack path taken in APT scenarios, as an ordered sequence of events that leads to compromise of the system. Another advantage of using an attack graph is the ease of estimating attack cost and return on investment (ROI) for a particular countermeasure on the chosen path.

The attack graph-based security analysis can help in identifying the most critical regions of the system and the severity of a particular attack that can contribute to the APT scenarios. Based on the type of attack, attack goals, and input data, the attack graph methods discussed in Table VIII can be applied to the security assessment.

2) Proactive Methods: Proactive mitigation methods are based on techniques which can deceive the attacker or change the attack surface to increase the difficulty of the attack. We classify proactive methods into a) Honeypot & Honeynet; and b) Moving Target Defense.

A) Honeypot and Honeynet Strategies: One of the characteristics of APT attacks is the level of sophistication employed to perform the attacks. The evolving malware and attack forms are quite difficult for defenders to keep up with. Often, a proactive approach such as deception technology can help them battle against the unknown and unexpected. In this defense methodology, defenders deceive the attackers by creating baits in the form of decoy documents or by creating systems and/or networks that are similar to the production environments but are not really part of the organization’s production environment. Monitoring access to such honeypots and honeynets can help organizations detect the presence of APT attackers moving across the network of systems in search of the organization’s data after establishing a foothold.

Bowen et al. in [80] addressed the insider threat problem with a defense by deception approach. The paper discusses how internal misuse has been one of the most damaging malicious activities within an organization. The authors’ proposed method attempts to trap the attackers who intend to exfiltrate data and use sensitive information. The solution is intended to confuse and confound the attackers with decoy data that makes it difficult for them to differentiate between original and decoy data, thus requiring more effort from the attackers in order to get into a system. These decoy documents are automatically created and are placed on decoy systems so as to entice the attackers with bogus credentials that, when used, would trigger an alert and thus give away a malicious insider. Their proposition involves embedding a watermark in binary format into the decoy documents that could be detected when it is loaded into memory or egressed over a network. Additionally, a beacon embedded in the decoy documents signals a remote website upon being opened for reading. If these two fail to detect a malicious insider, the contents of the decoy document are monitored as well. Bogus logins at multiple organizations as well as bogus and realistic bank information are monitored by external means. The authors classify the attacker’s sophistication level as low, medium, high, and highly privileged, and then address the number of ways an attacker at the above-mentioned levels can be deceived, with the exception of highly privileged attackers, which they specify as beyond the scope of their paper. They then explain the ways a decoy document can be designed, for instance, with embedded honeytokens, computer login accounts, a network-level egress monitor that detects when the decoy document is transmitted, a host-based monitor that detects when a document is touched, and embedded beacon alerts that alert a remote server.

Anagnotakis et al. in [81] proposed a deception framework that leverages virtualization and software defined networking to create an unpredictable and adaptable deception environment. In this paper they evaluated the current state of the art of deception networks, pointing to the lack of contemporary technology (existing approaches do not utilize SDN or cloud technology to deploy high-fidelity environments), the lack of centralized management, and the lack of operational realism that gives away the emulation to adversaries. They then discuss their proposed framework, supporting its ability to give better insights into an adversary’s actions by correlating the network and endpoint behavior data and allowing them to dynamically modify the environment as needed.

Anagnostakis et al. in [82] proposed a novel hybrid architecture that combines the best of honeypots and anomaly detection. Their system has several monitors that monitor the traffic to a protected network or service, and the traffic that is considered anomalous is processed by a shadow honeypot to determine the accuracy of the anomaly prediction. This shadow honeypot is an instance of the protected application that has the same state as the normal instance of the application, but is instrumented to detect potential attacks. Attacks against the shadow honeypot are caught and any incurred state changes are discarded. Legitimate traffic that was misclassified by the anomaly detector will be validated by the shadow honeypot and will be transparently handled correctly by the system. They claim that their system has many advantages over using just an anomaly detector or a honeypot: 1) it lowers the false positives, as the shadow honeypot needs to confirm the anomaly; 2) since the protected application is a mirror image of the actual application, the system can defend against attacks tailored to a specific site; 3) it protects the application against client-side attacks; and 4) it allows easy integration of additional detection mechanisms. HoneyStat [13] runs sacrificial services inside a virtual machine, and monitors memory, disk, and network events to detect abnormal behavior. With relatively few false positives it could detect zero-day worms.

B) Moving Target Defense: Crouse et al. [83] discussed
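The shadow honeypot flow described above for [82] can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors’ implementation: the toy `App` key-value store, the `anomaly_score` heuristic, the 0.5 threshold, and the `<script>` check standing in for real instrumentation are all hypothetical. Requests that look anomalous are replayed on a shadow copy that starts from the same state; if the shadow flags an attack, its state is discarded, otherwise the request is treated as a false positive and served normally.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class App:
    """Toy protected application: a key-value store (hypothetical)."""
    state: dict = field(default_factory=dict)

    def handle(self, request: dict) -> str:
        self.state[request["key"]] = request["value"]
        return "ok"


class AttackDetected(Exception):
    """Raised by the instrumented shadow copy when a request misbehaves."""


class ShadowInstance(App):
    """Same logic as App plus instrumentation (here a trivial payload check)."""

    def handle(self, request: dict) -> str:
        if "<script>" in str(request["value"]):  # stand-in for real instrumentation
            raise AttackDetected(request["key"])
        return super().handle(request)


def anomaly_score(request: dict) -> float:
    """Hypothetical anomaly detector: longer values look more suspicious."""
    return min(len(str(request["value"])) / 40.0, 1.0)


def dispatch(app: App, request: dict, threshold: float = 0.5) -> str:
    """Route a request through the anomaly detector and, if needed, the shadow."""
    if anomaly_score(request) < threshold:
        return app.handle(request)  # normal path, no shadow involved
    # Suspicious: replay on a shadow that starts from a copy of the live state.
    shadow = ShadowInstance(state=copy.deepcopy(app.state))
    try:
        shadow.handle(request)
    except AttackDetected:
        return "blocked"  # the shadow's state changes are simply dropped
    return app.handle(request)  # false positive: serve it on the real instance
```

Because the shadow works on a deep copy, a blocked request never touches the live state, which mirrors the claim that incurred state changes are discarded.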
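As a rough illustration of the attack graph G = {N, E} defined under Graph Analysis earlier in this section, the sketch below encodes fact, exploit, privilege, and goal nodes and enumerates paths to the goal. Only the `hacl`, `execCode`, and `(root, ...)` labels come from the text; the second exploit on the database server, the variable names, and the `attack_paths` helper are assumptions made for this example. The traversal treats the edges as a plain directed graph and ignores the AND semantics of pre-conditions, so it is a simplification rather than a full attack graph analysis.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]

# Node sets; the database exploit is a hypothetical addition for illustration.
N_f: Set[str] = {"hacl(VM1,80,VM2,5000)"}                  # fact nodes
N_c: Set[str] = {"execCode(VM1,apache,user)",
                 "execCode(DB,sql,root)"}                  # exploit nodes
N_d: Set[str] = {"(root,VM1)"}                             # privilege nodes
N_r: Set[str] = {"(root,DatabaseServer)"}                  # goal node

# E_pre: facts/exploits -> privileges/goals; E_post: privileges -> facts/exploits.
E_pre: Set[Edge] = {
    ("hacl(VM1,80,VM2,5000)", "(root,VM1)"),
    ("execCode(VM1,apache,user)", "(root,VM1)"),
    ("execCode(DB,sql,root)", "(root,DatabaseServer)"),
}
E_post: Set[Edge] = {("(root,VM1)", "execCode(DB,sql,root)")}


def successors(edges: Set[Edge]) -> Dict[str, List[str]]:
    """Build an adjacency list from a set of directed edges."""
    adj: Dict[str, List[str]] = {}
    for src, dst in sorted(edges):
        adj.setdefault(src, []).append(dst)
    return adj


def attack_paths(start: str, goal: str, edges: Set[Edge]) -> List[List[str]]:
    """Enumerate all simple paths from start to goal with a depth-first walk."""
    adj = successors(edges)
    paths: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        if node == goal:
            paths.append(path)
            return
        for nxt in adj.get(node, []):
            if nxt not in path:  # keep paths simple (no revisits)
                walk(nxt, path + [nxt])

    walk(start, [start])
    return paths
```

Calling `attack_paths("execCode(VM1,apache,user)", "(root,DatabaseServer)", E_pre | E_post)` lists the ordered sequence of events leading to the goal node, which is the kind of path a countermeasure cost estimate would be attached to.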
TABLE X: Mining techniques against common features and targeted APT stages
to particular environments, e.g. to meet performance criteria." The evaluation of APT attack detection methodologies lacks data sets from realistic attack scenarios, and an easy performance evaluation and comparison is much harder than in other computer science domains, e.g., image categorization or semantic text analysis [101].

It has been noticed that most current APT detection solutions evaluate their proposed methodologies using machine learning models, which usually involve three major components: data collection, feature extraction, and testing. The data can be collected either from a real network scenario or from a virtually manufactured one (synthetic model). The real network scenario has advantages such as a realistic test basis; however, it has disadvantages such as poor scalability in terms of user input, varying scenarios, privacy issues, and the need to attack one’s own system. Using a synthetic model for creating data allows full control of the amount of data gathered and how the network is set up. Synthetic models create a model with the desired properties, no regular noise, and no unknown properties. The lack of noise can be considered an advantage when the goal is to create a model that allows simple reproducibility. However, the drawback of using a synthetic model is that APT attacks are based on simplified attack scenarios and are deployed in controlled environments where no realistic noise is involved in the collected data, and noise is one of the major factors APT attackers rely on to stay undetected and move low and slow. Other APT studies use semi-synthetic data that is more realistic than synthetic data, and easier to produce than real data. However, it is simplified and biased if an insufficient synthetic user model is applied [101].

The second important component is feature selection, which is a major aspect that affects the results when using machine learning to solve a problem. Usually, the collected data are raw data and cannot be directly used for evaluating machine learning models. Therefore, it is necessary to pre-process the raw data and then select the needed features. Features selected in one APT detection solution are not necessarily usable in another. Usually, the problem formalization has an influence on this task and determines which features can be selected. For instance, when mining and investigating log data, the available features are not similar to those that can be selected from network traffic or malware behavior. A feature is information associated with a characteristic and/or behavior of the object, where the feature may be static (e.g., derived from metadata associated with the object) and/or dynamic (e.g., based on actions performed by the object after virtual
processing of the object such as detonation) [102]. Table X presents a list of common features against mining techniques.

VI. CHALLENGES

The nature of APT attacks is itself a challenge for defenders, and other factors, such as the source of the attack (inside or outside the organization) and the infrastructure of the defended environment, make defending against these attacks even harder. In this section, we discuss several challenges in defending against APT attacks.

A. Determined and Powerful Attackers

The biggest challenge in defending against APT attacks is the determination and strength of the attackers. A strong defense system might be in place, but for persistent attackers it all comes down to time and to building more advanced and complex tools that can bypass this defense system. These resources are plentiful for such attackers, enabling them to develop new malware and custom tools to help them achieve their goal.

B. Long Duration of Attacks

APT attacks are often performed over a long period of time, and while detecting the individual events is one challenge, correlating the events over several months is another. The state of a machine that showed suspicious behavior needs to be tracked for any further incidents that could be correlated with that earlier behavior. For a large network, this is quite a challenge, not only because of the number of connected systems, but also because of the false positives and incorrect leads about a possible APT attack in progress that the alerts triggered by those systems can cause.

C. Internal Employees

As mentioned above, APT attacks involve gathering useful information about the targeted organization, such as employees’ names, emails, addresses, etc. This is usually done using social engineering techniques, which rely on the naiveté and/or gullibility of an organization’s employees. People are known to be the weakest point in the APT kill chain. They can help the attackers achieve their goals in two ways: 1) internal users intentionally disclose secret information to an outside entity; or 2) internal users mistakenly provide useful information to APT attackers.

To stop or at least reduce the effectiveness of the first point, it is important to establish clear security policies that outline with whom employees may share information and how that information should be transmitted. Official channels through which security and IT personnel can contact staff, and vice versa, must be created. To stop or at least reduce the effectiveness of the second point, it is important to limit information access by, for example, shredding company records or any documentation that includes names or other employee information. Staff should be educated not to provide any information to outside people except under known and approved procedures, and on how they should handle phone calls, emails, and other inquiries. In addition, staff should receive regular security awareness training that outlines the strategies and tactics APT attackers can use. Educating staff is therefore an important step toward increasing awareness and reducing the chance of APT attacks.

D. Infrastructure-based Challenges

One of the major challenges in detecting and preventing APT attacks arises when the environment uses cloud computing resources. Not only are detrimental activities such as data exfiltration difficult to monitor, considering the many ways in which the data can be broken up and sent out, but the large number of resources also adds to the difficulty of monitoring and correlating events across the entire network system.

VII. RESEARCH OPPORTUNITIES

Advanced Persistent Threats are not threats that go away when strong security is in place; instead, they become more and more complex as defense systems become stronger. Such is the persistent nature of these attacks. As new defense techniques are developed, attackers will be required to build advanced tools that find them a way into the system and achieve their goals, but how far they get depends on the defense techniques. In this section, we identify several areas that are yet to be researched toward defending against APT attacks.

As mentioned earlier, a successful attack requires attackers to spend enough time in each of the attack stages. Though some solutions exist for each of these stages, several problems have yet to be explored. In Table XI, we map some open research opportunities to the APT attack stages.

Spear-phishing emails often have a huge role and impact in an APT attack. Many times, it is through these emails that the attackers gain a foothold in the system. Automatic detection and correlation of these emails, and their removal before the target employee opens them, could prevent APT attacks in the earlier stages.

In addition, zero-day vulnerabilities and the exploits that use them are yet to be researched. One other research opportunity in Stage 2 is detecting the presence of fileless malware, often referred to as in-memory attacks. Techniques that rely on behavior analysis to detect these in-memory attacks are being developed. However, behavior analysis has two problems. First, it is tied to a time frame, such as keeping track of, say, 30 days of process behavior and raising an alert when the behavior is found not to be the process’s own. Second, some processes are not easy to profile, such as browsers, whose memory usage can go up and down. Either way, such behavior analysis can be reverse engineered, and it will not be long before attackers can manipulate it. The same applies to the machine learning methods that are used for anomaly detection. With enough time spent by the attackers, and with alerts thrown from different systems across the network, it is quite possible that the attackers could decipher the rules and workings of the defense system and evade detection by it when they are ready to move ahead. Recent research by Carlini et al. [103]
explains how secrets can be extracted from any deep learning model, and discusses model stealing and inversion attacks that can be used to extract parameters and statistics about training data, respectively.

Further, investigating hacker communities can help identify zero-day vulnerabilities before they are exploited. According to [104], some vulnerabilities have been discussed by the black-hat community before being publicly exposed by ethical organizations. Hackers interact and communicate with each other through forums, which are user-oriented platforms whose sole purpose is to enable communication among hackers worldwide. These so-called dark-web forums are usually very similar to other normal web forums; they feature discussions on programming, hacking, and cybersecurity [105], [106]. These forums give hackers worldwide an opportunity to exchange their discoveries, custom tools, malware, etc. Such hacker communities exist across various geopolitical regions, including the US, Russia, the Middle East, China, and other regions. This presents a growing problem of global significance. Research in this area has the potential for high social impact [104].

Another area that has an impact on APT defense systems is cloud computing. Cloud computing offers different types of services and resources that can be used to send, store, or process data. A defense system for an organization with no cloud resources monitors for data exfiltration to an unknown or external IP. For organizations that have cloud resources, however, detecting exfiltration can be quite challenging, and it remains an area to be explored because of the multiple cloud resources and services that can be used to exfiltrate the data. A proposed defense system should have a strong correlation model that can correlate the interlinked activities involved in exfiltrating the organization’s data. For instance, attackers can use the target organization’s cloud storage service to exfiltrate the data rather than send it directly over the organization’s network to their command and control center. Using the storage service requires the attackers to steal the credentials of a user account on the organization’s cloud that has permissions to place or retrieve objects from this storage service. Once the credentials are stolen, the attackers can upload the data to the storage resource and, using the same credentials, download the data onto their command and control center without being detected.

Recent developments in attack methods are leaving few traces of forensic evidence. One such method is in-memory attacks. These are not file based, and thus make it hard for forensic investigators to track their origin and their spread to other systems, as all that is left is the fact that an in-memory attack has been made and possibly the script that was run.

TABLE XI: APT Stage-based Research Opportunities
Stage | Open Research Opportunities
Stage 1 | Detecting and correlating reconnaissance activities
Stage 2 | Automatic detection of spear-phishing emails and their correlation to events in further stages; fileless malware; detecting exploitation of known and zero-day vulnerabilities
Stage 3 | Detecting movement of attackers that show no anomalous behavior; attackers reverse engineering the behavior-based detection systems; security risk assessment
Stage 4 | Detrimental activities with use of cloud computing resources; correlation of activities spread over a long time
Stage 5 | Digital forensics

VIII. DISCUSSION

Since the report of the first APT attack, some works have studied APT in terms of malware, spear-phishing attacks, or exfiltrating data. From detecting a possible APT attack through collecting reconnaissance information from social profile activity, through establishing footholds via malware and spear-phishing emails, to detecting extraction of huge volumes of data, several works have studied and proposed schemes for defending against only one of the stages of an APT attack. However, very few have studied and addressed defending against APT attacks in their entirety. In this work, we have explored several research works focusing on individual stages of APT as well as APT in its entirety. We reviewed different techniques and methods used in defending against APT attacks and clarified which threats are not APT, using several real-world APT attack scenarios. To ensure the novelty and new contribution of our survey, we thoroughly compare our work with existing surveys, as shown in Table XII.

In [107], the authors discussed APT attack stages and suggested educating users and system administrators about the attack vectors as a first step, followed by implementing stricter policies and static rules and using software tools such as SNORT to detect anomalies. APT attack detection is beyond a single tool’s capability or user awareness. Often, implementing stricter policies is not only difficult but also inadequate. All the attackers need to do is steal an account that has permissions from several entities that they can penetrate. Their work failed to recognize the challenges in detecting an APT attack and the possibility of new attack vectors that evolve each day.

In [7], the authors analyzed several published reports of APT attacks and found that spear-phishing is the most common approach chosen for initial compromise, and dumping credentials is the most commonly chosen method for lateral movement. In addition, their results reveal that the vulnerabilities exploited in the APT attacks studied were mostly known vulnerabilities, and that exploitation of zero-day vulnerabilities is rarely involved. Chen et al. [4] studied APT attacks in more depth than other contemporary works, from analysis of APT stages through case studies of APT attacks, countermeasures to be taken, and several detection methods to help detect APT attacks. Although their work gives an overall idea of APT attacks, it lacks a study of defending against APT attacks by collecting data from different sources.

In [108], Tankard studied APT attacks, explained the different stages, and discussed the detection techniques for defending against APT attacks. This was one of the earlier works on APT attacks and is a good foundation for understanding what APT attacks are. However, this work, like others, discusses the attack vectors, specifically in terms of the monitoring
methods that can help in collecting data and how machine learning and graph analysis can be utilized to detect APT over a huge network, overcoming several challenges involved in analyzing huge volumes of data.

TABLE XII
Survey | Comprehensive Analysis of APT stages | APT Attacks Case Study | Mapping of APT stages to attack vectors | Different Measures to take | APT monitoring approaches | APT detection methods | Recommended Approaches | Challenges | Research Opportunities
[7] ✦ ✦ ✦
[4] ✦ ✦ ✦ ✦
[26] ✦ ✦
[107] ✦ ✦
[108] ✦ ✦ ✦ ✦
Our Survey ✦ ✦ ✦ ✦ ✦ ✦ ✦ ✦ ✦

IX. CONCLUSION

Advanced Persistent Threats are threats that involve determined, persistent, and well-funded attackers whose goals are to gain crucial data or impede critical components of their target organization or government. Unlike targeted attacks, these attacks involve the use of sophisticated tools and/or techniques. In this survey, we presented a comprehensive introduction to what APT is, what is NOT APT, and a background on how APTs are performed. We presented APT attack trees and how they can be used in a defense system. We then provided a taxonomy for classifying APT defense methods that covers monitoring, detection, and mitigation methods. In addition, we provided technical background on current APT detection and mitigation approaches and on techniques for evaluating the effectiveness of APT defense approaches. Finally, we presented notable challenges in deploying APT defense methods and concluded our survey by identifying several research opportunities that are worthy of investigation.

X. ACKNOWLEDGMENT

The authors gratefully acknowledge research grants from Naval Research Lab N00173-15-G017, National Science Foundation – US DGE-1723440, OAC-1642031, SaTC-1528099, and National Science Foundation – China 61628201 and 61571375.

REFERENCES

[1] D. McWhorter, “Apt1: exposing one of china’s cyber espionage units,” Mandiant. com, vol. 18, 2013.
[2] R. S. Ross, “Managing information security risk: Organization, mission, and information system view,” Special Publication (NIST SP)-800-39, 2011.
[3] R. Kissel, Glossary of key information security terms. Diane Publishing, 2011.
[4] P. Chen, L. Desmet, and C. Huygens, “A study on advanced persistent threats,” in IFIP International Conference on Communications and Multimedia Security. Springer, 2014, pp. 63–72.
[5] B. Schneier, “Attack trees,” Dr. Dobb’s journal, vol. 24, no. 12, pp. 21–29, 1999.
[6] P. Giura and W. Wang, “A context-based detection framework for advanced persistent threats,” in Cyber Security (CyberSecurity), 2012 International Conference on. IEEE, 2012, pp. 69–74.
[7] M. Ussath, D. Jaeger, F. Cheng, and C. Meinel, “Advanced persistent threats: Behind the scenes,” in Information Science and Systems (CISS), 2016 Annual Conference on. IEEE, 2016, pp. 181–186.
[8] L. Daigle, “Whois protocol specification,” Tech. Rep., 2004.
[9] A. K. Sood and R. J. Enbody, “Targeted cyberattacks: a superset of advanced persistent threats,” IEEE security & privacy, vol. 11, no. 1, pp. 54–61, 2013.
[10] O. S. V. D. (OSVDB), “Open source vulnerability database (osvdb),” 2012.
[11] P. Mell, K. Scarfone, and S. Romanosky, “Common vulnerability scoring system,” IEEE Security & Privacy, vol. 4, no. 6, 2006.
[12] M. Motoyama, D. McCoy, K. Levchenko, S. Savage, and G. M. Voelker, “An analysis of underground forums,” in Proceedings of the 2011 ACM SIGCOMM conference on Internet measurement conference. ACM, 2011, pp. 71–80.
[13] S. Smadi, N. Aslam, L. Zhang, R. Alasem, and M. Hossain, “Detection of phishing emails using data mining algorithms,” in Software, Knowledge, Information Management and Applications (SKIMA), 2015 9th International Conference on. IEEE, 2015, pp. 1–8.
[14] M. Lee and D. Lewis, “Clustering disparate attacks: mapping the activities of the advanced persistent threat,” Last accessed June, vol. 26, 2013.
[15] K. Baumgartner and T. M. C. M. Golovkin, “The earliest naikon apt campaigns, kaspersky lab,” 2015.
[16] “Kaspersky labs - global research & analysis team, carbanak apt-the great bank robbery.”
[17] “The duqu 2.0,” Jun. 2015.
[18] K. Baumgartner and M. Golovkin, “The naikon apt,” https://securelist.com/analysis/publications/69953/the-naikon-apt/.
[19] “Kaspersky labs - global research & analysis team, equation group: questions and answers,” Feb. 2015, available online.
[20] Cylance, “Operation cleaver,” Dec. 2014, available online.
[21] R. I. Response, “Shell crew,” Jan. 2014.
[22] K. L. G. R. A. Team, “The ’icefog’ apt: A tale of cloak and three daggers,” Sep. 2013.
[23] “The regin platform - nation-state ownage of gsm networks,” Nov. 2014.
[24] GROUP-IB and FOX-IT, “Anunak: Apt against financial institutions,” Dec. 2014.
[25] D. Alperovitch, “Deep in thought: Chinese targeting of national security think tanks,” Jul. 2014, http://blog.crowdstrike.com/deep-thought-chinesetargeting-national-security-think-tanks/.
[26] F. Ullah, M. Edwards, R. Ramdhany, R. Chitchyan, M. A. Babar, and A. Rashid, “Data exfiltration: A review of external attack vectors and countermeasures,” Journal of Network and Computer Applications, 2018.
[27] X. Wang, K. Zheng, X. Niu, B. Wu, and C. Wu, “Detection of command and control in advanced persistent threat based on independent access,” in Communications (ICC), 2016 IEEE International Conference on. IEEE, 2016, pp. 1–6.
[28] Z. Ferrer and M. C. Ferrer, “In-depth analysis of hydraq,” The face of cyberwar enemies unfolds. ca isbu-isi white paper, vol. 37, 2010.
[29] R. Langner, “Stuxnet: Dissecting a cyberwarfare weapon,” IEEE Security & Privacy, vol. 9, no. 3, pp. 49–51, 2011.
[30] N. Falliere, L. O. Murchu, and E. Chien, “W32. stuxnet dossier,” White paper, Symantec Corp., Security Response, vol. 5, no. 6, p. 29, 2011.
[31] A. L. Johnson, “Cybersecurity for financial institutions: The integral role of information sharing in cyber attack mitigation,” NC Banking Inst., vol. 20, p. 277, 2016.
[32] L.-X. Yang, P. Li, X. Yang, and Y. Y. Tang, “Security evaluation of the cyber networks under advanced persistent threats,” IEEE Access, 2017.
[33] B. Bencsáth, G. Ács-Kurucz, G. Molnár, G. Vaspöri, L. Buttyán, and R. Kamarás, “Duqu 2.0: A comparison to duqu,” Budapest. Retrieved February, vol. 27, 2015.
[34] H. Yin, D. Song, M. Egele, C. Kruegel, and E. Kirda, “Panorama: capturing system-wide information flow for malware detection and analysis,” in Proceedings of the 14th ACM conference on Computer and communications security. ACM, 2007, pp. 116–127.
[35] N. Virvilis and D. Gritzalis, “The big four-what we did wrong in advanced persistent threat detection?” in Availability, Reliability and Security (ARES), 2013 Eighth International Conference on. IEEE, 2013, pp. 248–254.
[36] I. Korkin and I. Nesterow, “Acceleration of statistical detection of zero-day malware in the memory dump using cuda-enabled gpu hardware,” arXiv preprint arXiv:1606.04662, 2016.
[37] Z. Xu, S. Ray, P. Subramanyan, and S. Malik, “Malware detection using machine learning based analysis of virtual memory access patterns,” in 2017 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 2017, pp. 169–174.
[38] C. Vaas and J. Happa, “Detecting disguised processes using application-behavior profiling,” in Technologies for Homeland Security (HST), 2017 IEEE International Symposium on. IEEE, 2017, pp. 1–6.
[39] M. Marchetti, F. Pierazzi, M. Colajanni, and A. Guido, “Analysis of high volumes of network traffic for advanced persistent threat detection,” Computer Networks, vol. 109, pp. 127–141, 2016.
[40] O. McCusker, S. Brunza, and D. Dasgupta, “Deriving behavior primitives from aggregate network features using support vector machines,” in Cyber Conflict (CyCon), 2013 5th International Conference on. IEEE, 2013, pp. 1–18.
[41] N. Villeneuve and J. Bennett, “Detecting apt activity with network traffic analysis,” Trend Micro Incorporated Research Paper, 2012.
[42] A. Vance, “Flow based analysis of advanced persistent threats detecting targeted attacks in cloud computing,” in Problems of Infocommunications Science and Technology, 2014 First International Scientific-Practical Conference. IEEE, 2014, pp. 173–176.
[43] P. Hu, H. Li, H. Fu, D. Cansever, and P. Mohapatra, “Dynamic defense strategy against advanced persistent threat with insiders,” in Computer Communications (INFOCOM), 2015 IEEE Conference on. IEEE, 2015, pp. 747–755.
[44] A. Bohara, U. Thakore, and W. H. Sanders, “Intrusion detection in enterprise systems by combining and clustering diverse monitor data,” in Proceedings of the Symposium and Bootcamp on the Science of Security. ACM, 2016, pp. 7–16.
[45] A. Shalaginov, K. Franke, and X. Huang, “Malware beaconing detection by mining large-scale dns logs for targeted attack identification,” in 18th International Conference on Computational Intelligence in Security Information Systems. WASET, 2016.
[46] T.-F. Yen, A. Oprea, K. Onarlioglu, T. Leetham, W. Robertson, A. Juels, and E. Kirda, “Beehive: Large-scale log analysis for detecting suspicious activity in enterprise networks,” in Proceedings of the 29th Annual Computer Security Applications Conference. ACM, 2013, pp. 199–208.
[47] P. Bhatt, E. T. Yano, and P. Gustavsson, “Towards a framework to detect multi-stage advanced persistent threats attacks,” in Service Oriented System Engineering (SOSE), 2014 IEEE 8th International Symposium on. IEEE, 2014, pp. 390–395.
[48] W. Niu, X. Zhang, G. Yang, J. Zhu, and Z. Ren, “Identifying apt malware domain based on mobile dns logging,” Mathematical Problems in Engineering, vol. 2017, 2017.
[49] V. Hodge and J. Austin, “A survey of outlier detection methodologies,” Artificial intelligence review, vol. 22, no. 2, pp. 85–126, 2004.
[50] M. H. Bhuyan, D. K. Bhattacharyya, and J. K. Kalita, “Network anomaly detection: methods, systems and tools,” Ieee communications surveys & tutorials, vol. 16, no. 1, pp. 303–336, 2014.
[51] V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection: A survey,” ACM computing surveys (CSUR), vol. 41, no. 3, p. 15, 2009.
[52] V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection for discrete sequences: A survey,” IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 5, pp. 823–839, 2012.
[53] P. Garcia-Teodoro, J. Diaz-Verdejo, G. Maciá-Fernández, and E. Vázquez, “Anomaly-based network intrusion detection: Techniques, systems and challenges,” computers & security, vol. 28, no. 1, pp. 18–28, 2009.
[54] Y. Mehmood, U. Habiba, M. A. Shibli, and R. Masood, “Intrusion detection system in cloud computing: Challenges and opportunities,” in Information Assurance (NCIA), 2013 2nd National Conference on. IEEE, 2013, pp. 59–66.
[55] T. Zhang, Q. Liao, and L. Shi, “Bridging the gap of network management and anomaly detection through interactive visualization,” in Visualization Symposium (PacificVis), 2014 IEEE Pacific. IEEE, 2014, pp. 253–257.
[56] H. V. Nath and B. M. Mehtre, “Static malware analysis using machine learning methods.” in SNDS. Springer, 2014, pp. 440–450.
[57] H. Kim, J. Kim, I. Kim, and T.-m. Chung, “Behavior-based anomaly detection on big data,” 2015.
[58] G. Zhao, K. Xu, L. Xu, and B. Wu, “Detecting apt malware infections based on malicious dns and traffic analysis,” IEEE Access, vol. 3, pp. 1132–1142, 2015.
[59] I. Friedberg, F. Skopik, G. Settanni, and R. Fiedler, “Combating advanced persistent threats: From network event correlation to incident detection,” Computers & Security, vol. 48, pp. 35–57, 2015.
[60] B. C. Cappers and J. J. van Wijk, “Understanding the context of network traffic alerts,” in Visualization for Cyber Security (VizSec), 2016 IEEE Symposium on. IEEE, 2016, pp. 1–8.
[61] X. Yuan, “Phd forum: Deep learning-based real-time malware detection with multi-stage analysis,” in Smart Computing (SMARTCOMP), 2017 IEEE International Conference on. IEEE, 2017, pp. 1–2.
[62] S. Siddiqui, M. S. Khan, K. Ferens, and W. Kinsner, “Detecting advanced persistent threats using fractal dimension based machine learning classification,” in Proceedings of the 2016 ACM on International Workshop on Security And Privacy Analytics. ACM, 2016, pp. 64–69.
[63] B. C. Cappers and J. J. van Wijk, “Snaps: Semantic network traffic analysis through projection and selection,” in Visualization for Cyber Security (VizSec), 2015 IEEE Symposium on. IEEE, 2015, pp. 1–8.
[64] P. Dewan, A. Kashyap, and P. Kumaraguru, “Analyzing social and stylometric features to identify spear phishing emails,” in Electronic Crime Research (eCrime), 2014 APWG Symposium on. IEEE, 2014, pp. 1–13.
[65] C.-H. Hsieh, C.-M. Lai, C.-H. Mao, T.-C. Kao, and K.-C. Lee, “Ad2: Anomaly detection on active directory log data for insider threat monitoring,” in Security Technology (ICCST), 2015 International Carnahan Conference on. IEEE, 2015, pp. 287–292.
[66] M. Marchetti, F. Pierazzi, A. Guido, and M. Colajanni, “Countering advanced persistent threats through security intelligence and big data analytics,” in Cyber Conflict (CyCon), 2016 8th International Conference on. IEEE, 2016, pp. 243–261.
[67] M. Parkour, “Contagio malware database,” 2013.
[68] DARPA, “Darpa scalable network monitoring (snm) program traffic (11/03/2009 to 11/12/2009),” 2012.
[69] X. Yan and J. Zhang, “Early detection of cyber security threats using structured behavior modeling,” ACM Transactions on Information and System Security, vol. 5, 2013.
[70] H.-K. Peng, P. Wu, J. Zhu, and J. Y. Zhang, “Helix: Unsupervised grammar induction for structured activity recognition,” in Data Mining (ICDM), 2011 IEEE 11th International Conference on. IEEE, 2011, pp. 1194–1199.
[71] J. R. Johnson and E. A. Hogan, “A graph analytic metric for mitigating advanced persistent threat,” in Intelligence and Security Informatics (ISI), 2013 IEEE International Conference on. IEEE, 2013, pp. 129–133.
[72] K. Ingols, R. Lippmann, and K. Piwowarski, “Practical attack graph generation for network defense,” in Computer Security Applications Conference, 2006. ACSAC’06. 22nd Annual. IEEE, 2006, pp. 121–130.
[73] M. Albanese, S. Jajodia, and S. Noel, “Time-efficient and cost-effective network hardening using attack graphs,” in Dependable Systems and Networks (DSN), 2012 42nd Annual IEEE/IFIP International Conference on. IEEE, 2012, pp. 1–12.
[74] S. Jha, O. Sheyner, and J. Wing, “Two formal analyses of attack graphs,” in Computer Security Foundations Workshop, 2002. Proceedings. 15th IEEE. IEEE, 2002, pp. 49–63.
[75] X. Ou, W. F. Boyer, and M. A. McQueen, “A scalable approach to attack graph generation,” in Proceedings of the 13th ACM conference on Computer and communications security. ACM, 2006, pp. 336–345.
[76] J. Lee, H. Lee, and H. P. In, “Scalable attack graph for risk assessment,” in Information Networking, 2009. ICOIN 2009. International Conference on. IEEE, 2009, pp. 1–5.
[77] J. Homer, X. Ou, and M. A. McQueen, “From attack graphs to automated configuration management: an iterative approach,” Kansas State University Technical Report, 2008.
[78] R. E. Sawilla and X. Ou, “Identifying critical attack assets in dependency attack graphs,” in European Symposium on Research in Computer Security. Springer, 2008, pp. 18–34.
[79] H. Huang, S. Zhang, X. Ou, A. Prakash, and K. Sakallah, “Distilling critical attack graph surface iteratively through minimum-cost sat solving,” in Proceedings of the 27th Annual Computer Security Applications Conference. ACM, 2011, pp. 31–40.
[80] B. M. Bowen, S. Hershkop, A. D. Keromytis, and S. J. Stolfo, “Baiting inside attackers using decoy documents.” Springer.
[81] V. E. Urias, W. M. Stout, and H. W. Lin, “Gathering threat intelligence through computer network deception,” in Technologies for Homeland Security (HST), 2016 IEEE Symposium on. IEEE, 2016, pp. 1–6.
[82] K. G. Anagnostakis, S. Sidiroglou, P. Akritidis, K. Xinidis, E. P. Markatos, and A. D. Keromytis, “Detecting targeted attacks using shadow honeypots.” in Usenix Security Symposium, 2005.
[83] M. Crouse, B. Prosser, and E. W. Fulp, “Probabilistic performance analysis of moving target and deception reconnaissance defenses,” in Proceedings of the Second ACM Workshop on Moving Target Defense. ACM, 2015, pp. 21–29.
[84] J. B. Hong and D. S. Kim, “Assessing the effectiveness of moving target defenses using security models,” IEEE Transactions on Dependable and Secure Computing, vol. 13, no. 2, pp. 163–177, 2016.
[85] P. Kampanakis, H. Perros, and T. Beyene, “Sdn-based solutions for moving target defense network protection,” in World of Wireless, Mobile and Multimedia Networks (WoWMoM), 2014 IEEE 15th International Symposium on a. IEEE, 2014, pp. 1–6.
[86] S. Debroy, P. Calyam, M. Nguyen, A. Stage, and V. Georgiev, “Frequency-minimal moving target defense using software-defined networking,” in Computing, Networking and Communications (ICNC), 2016 International Conference on. IEEE, 2016, pp. 1–6.
[87] J. H. Jafarian, E. Al-Shaer, and Q. Duan, “Openflow random host mutation: transparent moving target defense using software defined networking,” in Proceedings of the first workshop on Hot topics in software defined networks. ACM, 2012, pp. 127–132.
[88] Z. Zhao, F. Liu, and D. Gong, “An sdn-based fingerprint hopping method to prevent fingerprinting attacks,” Security and Communication Networks, vol. 2017, 2017.
[89] A. Chowdhary, S. Pisharody, A. Alshamrani, and D. Huang, “Dynamic game based security framework in sdn-enabled cloud networking environments,” in Proceedings of the ACM International Workshop on Security in Software Defined Networks & Network Function Virtualization. ACM, 2017, pp. 53–58.
[90] J. Hong, “The state of phishing attacks,” Communications of the ACM, vol. 55, no. 1, pp. 74–81, 2012.
[91] M. Thompson, N. Evans, and V. Kisekka, “Multiple os rotational environment an implemented moving target defense,” in Resilient Control Systems (ISRCS), 2014 7th International Symposium on. IEEE, 2014, pp. 1–6.
[92] C. Lei, D.-H. Ma, and H.-Q. Zhang, “Optimal strategy selection for moving target defense based on markov game,” IEEE Access, vol. 5, pp. 156–169, 2017.
[93] S. Neti, A. Somayaji, and M. E. Locasto, “Software diversity: Security, entropy and game theory.”
[94] I. El Mir, A. Chowdhary, D. Huang, S. Pisharody, D. S. Kim, and A. Haqiq, “Software defined stochastic model for moving target defense,” in International Afro-European Conference for Industrial Advancement. Springer, 2016, pp. 188–197.
[95] A. Clark, K. Sun, L. Bushnell, and R. Poovendran, “A game-theoretic approach to ip address randomization in decoy-based cyber defense,” in International Conference on Decision and Game Theory for Security. Springer, 2015, pp. 3–21.
[96] S. Sengupta, A. Chowdhary, D. Huang, and S. Kambhampati, “Moving target defense for the placement of intrusion detection systems in the cloud,” Conference on Decision and Game Theory for Security, 2018.
[97] A. Chowdhary, S. Pisharody, and D. Huang, “Sdn based scalable mtd solution in cloud network,” in Proceedings of the 2016 ACM Workshop on Moving Target Defense. ACM, 2016, pp. 27–36.
[98] A. Chowdhary, A. Alshamrani, D. Huang, and H. Liang, “Mtd analysis and evaluation framework in software defined network (mason),” in Proceedings of the 2018 ACM International Workshop on Security in Software Defined Networks & Network Function Virtualization. ACM, 2018, pp. 43–48.
[99] G. Tang, J. Pei, and W.-S. Luk, “Email mining: tasks, common techniques, and tools,” Knowledge and Information Systems, vol. 41, no. 1, pp. 1–31, 2014.
[100] D. Ucci, L. Aniello, and R. Baldoni, “Survey on the usage of machine learning techniques for malware analysis,” arXiv preprint arXiv:1710.08189, 2017.
[101] F. Skopik, G. Settanni, R. Fiedler, and I. Friedberg, “Semi-synthetic data set generation for security software evaluation,” in Privacy, Security and Trust (PST), 2014 Twelfth Annual International Conference on. IEEE, 2014, pp. 156–163.
[102] T. Haq, J. Zhai, and V. K. Pidathala, “Advanced persistent threat (apt) detection center,” Apr. 18 2017, US Patent 9,628,507.
[103] N. Carlini, C. Liu, J. Kos, Ú. Erlingsson, and D. Song, “The secret sharer: Measuring unintended neural network memorization & extracting secrets,” arXiv preprint arXiv:1802.08232, 2018.
[104] V. Benjamin, W. Li, T. Holt, and H. Chen, “Exploring threats and vulnerabilities in hacker web: Forums, irc and carding shops,” in Intelligence and Security Informatics (ISI), 2015 IEEE International Conference on. IEEE, 2015, pp. 85–90.
[105] E. Nunes, A. Diab, A. Gunn, E. Marin, V. Mishra, V. Paliath, J. Robertson, J. Shakarian, A. Thart, and P. Shakarian, “Darknet and deepnet mining for proactive cybersecurity threat intelligence,” in Intelligence and Security Informatics (ISI), 2016 IEEE Conference on. IEEE, 2016, pp. 7–12.
[106] M. Almukaynizi, E. Nunes, K. Dharaiya, M. Senguttuvan, J. Shakarian, and P. Shakarian, “Proactive identification of exploits in the wild through vulnerability mentions online,” in Cyber Conflict (CyCon US), 2017 International Conference on. IEEE, 2017, pp. 82–88.
[107] J. Vukalović and D. Delija, “Advanced persistent threats-detection and defense,” in Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2015 38th International Convention on. IEEE, 2015, pp. 1324–1330.
[108] C. Tankard, “Advanced persistent threats and how to monitor and deter them,” Network security, vol. 2011, no. 8, pp. 16–19, 2011.

Adel Alshamrani received his B.S. degree in Computer Science from Umm Al-Qura University, Saudi Arabia, the M.S. degree in Computer Science from La Trobe University, Melbourne, Australia, and Ph.D degree in Computer Science from Arizona State University, Tempe, AZ, USA in 2007, 2010, and 2018 respectively. He is an Assistant Professor with the College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia. He has 8 years of combined work experience in information security, network engineering, and teaching while working in the Faculty of Computing and Information Technology, King Abdul Aziz University. His research interests include information security, intrusion detection, and software defined networking.

Sowmya Myneni is a Ph.D Student in Computer Science at Arizona State University, Tempe, AZ, USA. She received her M.S. in Computer Science at New Mexico State University in 2010. Her research interests include Network and Information Security, specifically Intrusion Detection & Prevention, Cryptography, Authentication and Authorization, and Wireless Network Security. Besides being a student, she is a certified Penetration Tester (GPEN) currently working as a Security Engineer.