SOC Playbooks
SOC Playbooks
Top Security
Playbooks for 2019
WHITE PAPER
WHITE PAPER
March 2019
INTRODUCTION
The security operations center (SOC), to borrow parlance from the legendary comedian Rodney Dangerfield, doesn’t
get the respect it deserves. But anyone who understands how the beating heart of your security program functions knows
otherwise. The SOC is the regulator of the business, responsible for ensuring nothing disrupts it and that its proverbial
kingdom keys and secret sauces stay protected.
But with that great responsibility comes great pressure for SOC inhabitants, as they must successfully follow security events
from inception to resolution, while in the process overcoming key stressors endemic to a modern-day infosec command
center: skills shortages, disparate detection tools and, of course, an abundance of threats amid an even greater number
of false alarms.
Security analysts, engineers, architects and managers in the SOC are engaged in a zero-sum game where there can be only
one winner. To give the SOC team the best chance to win, they must identify, investigate and respond to threats as quickly
and consistently as possible. The key to fast and effective response is having processes documented in what is commonly
referred to as playbooks (also known as runbooks).
Cybercriminals are sophisticated – but they’re also business savvy, meaning they know what works and what doesn’t and
aren’t keen on exerting unnecessary time and energy. In fact, security operations teams have seen many times before what
cybercriminals can throw at them, but where they’ve stumbled is because of things like human error, poor prioritization
or even burnout.
Playbooks help address each of these downsides by providing security teams with a single source of truth to turn to in
high-pressure situations, helping to ensure response processes are executed systematically and repeatably. The purpose
of this document is to provide security teams with a set of dependable playbooks targeted at the most common types of
investigations undertaken by SOCs to drive down mean time to resolution. Added benefits include documenting so-called
tribal knowledge, defined as unwritten information not commonly known by others within a business, and onboarding
new analysts.
Siemplify, a leading provider of security orchestration, automation and response technology, provides these purpose-built
playbook templates, and dozens more, in its Security Operations Platform with the goal of making analysts more efficient,
engineers and architects more effective, and managers more informed. Since every organization has different needs, the
Siemplify platform makes editing existing and creating new playbooks easy thanks to the drag-and-drop playbook editor.
Now let’s get into the playbooks!
Before driving into the details of these context-driven playbooks, we first must define context in terms of security operation.
Context is something that everyone deals with daily. The best example outside of security is text messaging. While texting
is an effective way to quickly communicate with friends and family, even co-workers, one of the main issues is lack of context.
Since text messages are just text, we miss a major piece of context vital to effective communication: tone and tenor.
Many disagreements or misunderstandings can come from this lack of context. The stakes, however, are quite low when
it comes to text message misperceptions. The same can’t be said when investigating a potential threat. Context in security
means understanding how a single alert or incident fits into the overall threat landscape at any given point in time.
For instance, if an analyst investigated an alert of a failed login by User A on Computer X, they may quickly dismiss this
as a false positive – to no fault of their own. After all, people make mistakes entering their credentials all the time. However,
if the analyst had proper context that showed user A also attempted to login to 10 other machines within seconds of each
other, they would realize that this single incident was part of a coordinated brute force attack against the organization.
With this proper context, the analyst now understands that a different response is required compared to the response that
would come for a likely false positive alert. This example, while quite simple, illustrates the ultimate power of context
in investigation and response. The remainder of this paper will keep context in mind as we discuss several common
investigation types and describe the best practices regarding playbook creation and flow.
As mentioned previously, a brute force attack occurs when an attacker attempts to break into an environment by repeatedly
attempting to login to a system or systems. Generally, the attacker has either reverse engineered or purchased on the dark
web legitimate usernames and applies a vast library of potential passwords to gain access to a system. While IT departments
can implement a policy that locks out users after a given number of failed attempts, many organizations do not take this
approach as they are concerned about remote and in-the-field employees being frozen out of their computers, causing
business disruption.
Investigation
Determine which data source was used to trigger Network based logs: Here we assume that the
the alert. Is the alert based on network logs or actual product triggering the alert cannot be sure about
on-host login information? the traffic it monitors, as it might be encrypted.
In this case, we only try to find out indicator of failure.
a) Find out which ports were used for the brute force.
b) Search the IP using IP reputation sites and act
4 accordingly (continue the next steps of this
playbook to gather information for Tier 2).
Host-based logs (or logs from the last step of the previous
c) If the attack port does not match any listening
section)
service or and previously running service, mark
a) Search the logs for events indicating a ‘failure to login’ or the case as a ‘false positive’ (since there is no
‘user does not exist’ (depends on the attacked service). If chance of success without a service listening on the
such logs exist, measure the time span during which they port). Escalate the source IP’s information to Tier 2
occurred. If the time span is very short, aka mere seconds for further hunting, as this is still considered
between attempts, escalate the case to Tier 2, an an indication of a malicious presence trying its ‘luck’
indication of a malicious presence trying its ‘luck’ around around the network.
the network.
b) Search the logs for a successful login log entry. If such
entry is found, escalate the case to Tier 2 for further
vestigation.
1. Find out all users that were used in the brute force attack. Look for any suspicious username from this list and search for
other hosts that these usernames were used on.
2. Notify the owners of the legitimate accounts and the owners of the targeted machines that a brute force attempt was made
on their assets.
3. Perform a more thorough investigation on the possibly affected hosts and act accordingly.
4. If the attacker is an external IP, block it and ensure that the firewall is configured to prevent remote login attempts
for the specific port in the case it was unnecessarily open to the public.
5. If the attacker is an internal IP, search for any malware infections and past malware alerts on the source host to see
if the host is vulnerable.
Phishing is one of the most prevalent attack types organizations experience across all industries and size and has been
the source of some of the most prolific breaches of all time. It’s also the type of attack that your CEO is familiar with and
comfortable enough asking you about in the hallway. In a typical scenario, an attacker attempts to trick an unsuspecting
employee to divulge sensitive information. In the early days of phishing, the attackers would send seemingly legitimate emails
to employees with the hopes the worker would click on a link in the message and subsequently provide the proverbial keys
to the kingdom.
Today, phishing attacks are more sophisticated, ranging from email, text message, and even company executive and cloud-
based file storage/sharing site impersonation. Given the large set of threat vectors associated with phishing attacks, many
SOCs cite phishing investigations as among their largest consumers of available resources.
Investigation
1 2
Resolve all URLs to IPs and check Analyze URLs in sandboxed environment
the reputation for each IP by following the URLs 3
a) Mark ‘true positive’ if bad a) Note any suspicious behavior such as: If any of these found,
and escalate the collected file downloaded, websites that seem to mark ‘true positive’
information, otherwise. be aimed at mimicking very well-known and escalate.
sites in their appearance, opens, and
unknown redirections.
1. Once phishing is confirmed, send a security alert email to entire organization, notifying them about the targeted
activity going on.
2. Block all the malicious URLs found in the alerts (and IP addresses) with firewall.
3. Run thorough anti-malware scans against the users who received the emails (found in alerts).
Ransomware
Ransomware is a popular attack vector that involves holding an organization’s data hostage and threatening destruction
unless a ransom is paid. The threat moved into the mainstream in 2016 with the WannaCry outbreak, which affected
companies around the world.
When the victim employee accidentally installs the malicious payload, the ransomware begins to encrypt all the data on the
drive and can only be decrypted with the attacker’s key. If the victim organization pays the ransom – as many companies have
been forced to do, fearing capitulation as their only option – the attacker will provide the key to decrypt the data. However,
once the ransom is paid, the attackers may decide to stick around and target the victim again through backdoor they created.
(Digital crooks aren’t exactly known for keeping their word.)
Investigation
If the case was escalated to Tier 2, supply Tier 2 with the following extra information:
a) Binary file (for static/dynamic analysis.
b) Any infected files to determine whether business critical data was affected.
c) Victim’s IP and source IP (if the source is known – in case the alert originated from
a network-based product).
d) Initial information gathered about the malware type/family.
e) Potential impact on victim.
Command-and-control (C2) traffic confirms your worst nightmare: that your environment has one, or more, compromised
systems. If an attacker can penetrate the network and establish a communication channel to a remote server, they can
exfiltrate data in seconds.
Unfortunately, detection of C2 traffic can be difficult, especially when the adversary understands how to remain covert.
For instance, advanced attackers will limit the bandwidth and duration of communications so not to alert network monitoring
systems. Additionally, attackers may encrypt their communications, making it virtually impossible to discern the type of data
moving across – and out of – the network.
List all the IPs against which the alarm was raised for later
escalation to Tier 2.
Investigation
1
2
Gather the following details:
Identify the CnC servers:
a) Attack pattern (many to one, one to one, etc.)
a) List the destination IPs found in infected
I) Large volume of traffic outbound alerts.
machines (Source IPs) NetFlow logs.
b) Any prior malware alerts about the victim IP
b) Capture the network traffic of infected
(compromised system).
machines (source IPs).
I) If found, determine the source of malware and repeat
I) List destination IPs for any malicious
your malware investigation steps.
communications found.
II) If found, mark ‘true positive’ and escalate all the
collected information are found. c) Check reputation for destination IPs found.
While conceiving that a trusted user could become a threat to the business may be difficult, many examples exist in which a
rogue insider got the better of their employer. While most of your security controls are likely geared toward halting external
threats, the insider threat can be equally, or even more, damaging. That’s because malicious insiders don’t need to coax
anyone into giving them access to the environment. They’re already in and have free run of the network under the guise of
routine behavior.
This means they have access to a treasure trove of sensitive data and with a simple USB device can move potentially millions
of dollars’ worth of data off the network in the snap of a finger – and without raising an eyebrow.
Note the host IP against which the alert was raised (victim machine, source).
Note the suspicious attacker IP (attacker, destination).
Study the rule against which the alarm was raised and note the signature/checks in the rule.
Note the outbound traffic volume of the host within the respective time
Calculate/extract average traffic per host before the time of alert (from NetFlow).
2
3
If the volume of data from local IP (victim machine) to remote
IP (attacker’s machine) is unusually high (more than average If the volume of data from local IP is
calculated), mark ‘true positive’ and escalate. around average, mark ‘false positive’.
While these gains are obviously desirable, the real added value to the organization comes from limiting an attacker’s ability to
remain hidden in the shadows because the SOC team is buried under an avalanche of alert data.
Context-driven investigation and response is the future of modern security operations, and these top playbooks will act as an
optimal first step for putting alerts and incidents into their proper context.
Playbooks are, of course, valuable even if they are just in flowcharts that you can manually apply to your activities, as they
provide a definitive and reliable sequence to shadow during security incidents and investigations, when time is of the
essence.
But there is a way to formalize and execute these playbooks using security automation, orchestration and response (SOAR)
technology, which can ensure consistency, save you time, track and measure your progress, and provide you with machine
learning-based recommendations for best courses of action.
Siemplify playbook capabilities offer the best of both worlds: a simple user interface that makes building and editing actions,
triggers, and flows flexible, while incorporating a powerful IDE that delivers virtually unlimited customizability.