Secure Cloud Architecture - OWASP Cheat Sheet Series
Introduction
This cheat sheet discusses common and necessary security patterns to follow when creating and
reviewing cloud architectures. Each section covers a specific security guideline or cloud design
decision to consider. This sheet is written for a medium to large scale enterprise system, so
additional overhead elements will be discussed, which may be unnecessary for smaller organizations.
This is all necessary to properly scope the security of an architecture. However, these are subjects
that can and should be discussed in greater detail. Use the resources linked below to investigate
further as part of a healthy secure architecture conversation.
Object storage usually has the following options for accessing data:
IAM Access
This method involves indirect access through tooling such as a managed or self-managed service
running on ephemeral or persistent infrastructure. This infrastructure contains a persistent control
plane IAM credential, which interacts with the object storage on the user's behalf. The method is
best used when the application has other user interfaces or data systems available, when it is
important to hide as much of the storage system as possible, or when the information
shouldn't/won't be seen by an end user (metadata). It can be used in combination with web
authentication and logging to better track and control access to resources. The key security concern
for this approach is relying on developed code or policies which could contain weaknesses.
Pros: No user visibility to object storage
Cons: Credential loss gives access to control plane APIs
This approach is acceptable for sensitive user data, but must follow rigorous coding and cloud best
practices, in order to properly secure data.
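The indirect access pattern above can be sketched as follows. This is a minimal illustration, not a provider implementation: `ObjectStore` is a hypothetical in-memory stand-in for a cloud storage client, and the rule that keys are namespaced as `<user_id>/<name>` is an assumption made for the example. The point is that the service holds the storage credential and decides, in code, which objects a caller may read.

```python
from dataclasses import dataclass, field


class AccessDenied(Exception):
    """Raised when the caller is not allowed to read the requested object."""


@dataclass
class ObjectStore:
    """Hypothetical stand-in for a cloud object storage client.

    In a real deployment this would wrap the provider SDK and use the
    service's own IAM credential, which is never exposed to end users.
    """
    _objects: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def read_user_object(store: ObjectStore, user_id: str, key: str) -> bytes:
    """Return an object only if it belongs to the authenticated user.

    Assumption for this sketch: objects are namespaced as "<user_id>/<name>".
    """
    # Reject path traversal and any key outside the caller's namespace.
    if ".." in key.split("/") or not key.startswith(f"{user_id}/"):
        raise AccessDenied(f"user {user_id!r} may not read {key!r}")
    return store.get(key)
```

Because the authorization check lives in application code, a mistake in `read_user_object` is exactly the kind of weakness the section warns about, so such checks deserve careful review and testing.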
Signed URLs
URL signing for object storage involves using some method of either statically or dynamically
generating URLs, which cryptographically guarantee that an entity can access a resource in storage.
This is best used when direct access to specific user files is necessary or preferred, as there is no
file transfer overhead. It is advisable to only use this method for user data which is not very
sensitive. This method can be secure, but has notable cons. Code injection may still be possible if
the method of signed URL generation is custom, dynamic and injectable, and anyone who is given the
URL can access the resource anonymously. Developers must also consider if and when the signed URL
should expire, adding to the complexity of the approach.
Pros: Minimal user visibility to object storage; efficient file transfer
Cons: Anyone can access with URL; possibility of injection with custom code
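To make the signing and expiry trade-offs concrete, here is a minimal sketch of a generic HMAC-based scheme using only the Python standard library. It is not any cloud provider's signed URL format; `SIGNING_KEY`, `sign_url`, and `verify_url` are hypothetical names, and the hard-coded key is for illustration only.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: a server-side secret. In practice it would come from a
# secrets manager, never from source code.
SIGNING_KEY = b"example-only-secret"


def sign_url(path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return `path` with an expiry time and an HMAC-SHA256 signature appended."""
    expires = int((time.time() if now is None else now) + ttl_seconds)
    message = f"{path}?expires={expires}".encode()
    signature = hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires, 'signature': signature})}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it has not expired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError, IndexError):
        return False
    if (time.time() if now is None else now) > expires:
        return False
    message = f"{parsed.path}?expires={expires}".encode()
    expected = hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Note that `verify_url` never asks who the caller is: anyone holding a valid, unexpired URL is accepted, which is the anonymous access drawback described above. A short `ttl_seconds` limits that exposure.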
This is not an advisable method for resource storage and distribution, and should only be used for
public, non-sensitive, generic resources. This storage approach will provide threat actors additional
reconnaissance into a cloud environment, and any data which is stored in this configuration for any
period of time must be considered publicly accessed (leaked to the public).
Pros Cons
Virtual Private Clouds (VPC) and public/private network subnets allow an application and its
network to be segmented into distinct chunks, adding layers of security within a cloud system.
Unlike other private vs public trade-offs, an application will likely incorporate most or all of these
components in a mature architecture. Each is explained below.
VPCs
VPCs are used to create network boundaries within an application, wherein components can talk to
each other, much like a physical network in a data center. The VPC will be made up of some number
of subnets, both public and private. VPCs can be used to:
Separate large components of an application into distinct VPCs with isolated networks.
Create separations between duplicate applications used for different customers or data sets.
Public Subnets
Public subnets house components which will have an internet facing presence. The subnet will
contain network routing elements to allow components within the subnet to connect directly to the
internet. Some use cases include:
Initial touch points for applications, like load balancers and routers.
Developer access points, like bastions (note, these can be very insecure if engineered/deployed
incorrectly).
Private Subnets
Private subnets house components which should not have direct internet access. The subnet will
likely contain network routing to connect it to public subnets, to receive internet traffic in a
structured and protected way. Private subnets are great for:
Consider the simple architecture diagram below. A VPC will house all of the components for the
application, but each element will be in a specific subnet depending on its role within the system.
The normal flow for interacting with this application might look like:
1. Accessing the application through some sort of internet gateway, API gateway or other internet
facing component.
2. This gateway connects to a load balancer or a web server in a public subnet. Both components
provide public facing functions and are secured accordingly.
3. These components then interact with their appropriate backend counterparts, a database or
backend server, contained in a private subnet. These connections are more limited, preventing
extraneous access to the possibly "soft" backend systems.
Note: This diagram intentionally skips routing and IAM elements for subnet interfacing, for simplicity
and to be service provider agnostic.
This architecture prevents less hardened backend components or higher risk services like
databases from being exposed to the internet directly. It also provides common, public functionality
access to the internet to avoid additional routing overhead. This architecture can be secured more
easily by focusing on security at the entry points and separating functionality, putting non-public or
sensitive information inside a private subnet where it will be harder to access by external parties.
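The placement rule described above (internet facing components in a public subnet, backend components in a private one) can be checked mechanically. The sketch below uses Python's standard `ipaddress` module; the CIDR ranges, the `COMPONENTS` inventory, and the `placement_errors` helper are all hypothetical and not tied to any cloud provider.

```python
import ipaddress

# Assumption: illustrative address ranges, not tied to any provider.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
PUBLIC_SUBNET = ipaddress.ip_network("10.0.0.0/24")
PRIVATE_SUBNET = ipaddress.ip_network("10.0.1.0/24")

# Hypothetical inventory: component name -> (private IP, internet facing?)
COMPONENTS = {
    "load-balancer": ("10.0.0.10", True),
    "web-server": ("10.0.0.20", True),
    "backend-server": ("10.0.1.10", False),
    "database": ("10.0.1.20", False),
}


def placement_errors(components: dict) -> list:
    """Return a message for every component that sits in the wrong subnet."""
    errors = []
    for name, (ip, internet_facing) in components.items():
        address = ipaddress.ip_address(ip)
        if address not in VPC_CIDR:
            errors.append(f"{name}: {ip} is outside the VPC")
        elif internet_facing and address not in PUBLIC_SUBNET:
            errors.append(f"{name}: internet-facing but not in the public subnet")
        elif not internet_facing and address not in PRIVATE_SUBNET:
            errors.append(f"{name}: internal component outside the private subnet")
    return errors
```

A check like this could run in a deployment pipeline against an exported inventory, failing the build when, for example, a database is given an address in the public subnet.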
Trust Boundaries
Trust boundaries are connections between components within a system where a trust decision has
to be made by the components. Put another way, a trust boundary is a point where two
components with potentially different trust levels meet. These boundaries can range in scale, from
the degrees of trust given to users interacting with an application, to trusting or verifying specific
claims between code functions or components within a cloud architecture. Generally speaking,
however, it suffices to trust each component to perform its function correctly and securely.
Therefore, trust boundaries will likely occur in the connections between cloud components, and
between the application and third party elements, like end users and other vendors.
As an example, consider the architecture below. An API gateway connects to a compute instance
(ephemeral or persistent), which then accesses a persistent storage resource. Separately, there
exists a server which can verify the authentication, authorization and/or identity of the caller. This is
a generic representation of an OAuth, IAM or directory system, which controls access to these
resources. Additionally, there exists an Ephemeral IAM server which controls access for the stored
resources (using an approach like the IAM Access section above). As shown by the dotted lines,
trust boundaries exist between each compute component, the API gateway and the auth/identity
server, even though many or all of the elements could be in the same application.
[Figure: example architecture. Legend: dotted line = trust boundary. Labeled components include
Auth/Identity Server and Ephemeral IAM.]
Architects have to select a trust configuration between components, using quantitative factors like
risk score/tolerance and project velocity, as well as subjective security goals. Each example below
details trust boundary relationships to better explain the implications of trusting a certain resource.
The threat level of a specific resource, shown as a color from green (safe) to red (dangerous),
indicates which resources shouldn't be trusted.
1. No trust example
As shown in the diagram below, this example outlines a model where no component trusts any other
component, regardless of criticality or threat level. This type of trust configuration would likely be
used for incredibly high risk applications, where either very personal data or important business data
is contained, or where the application as a whole has an extremely high business criticality.
Notice that both the API gateway and compute components call out to the auth/identity server. This
implies that no data passing between these components, even when right next to each other "inside"
the application, is considered trusted. The compute instance must then assume an ephemeral
identity to access the storage, as the compute instance isn't trusted to a specific resource even if the
user is trusted to the instance.
Also note the lack of trust between the auth/identity server and ephemeral IAM server and each
component. While not displayed in the diagram, this would have additional impacts, like more
rigorous checks before authentication, and possibly more overhead dedicated to cryptographic
operations.
[Figure: no trust example. Legend: no trust through boundary. Labeled components include
Auth/Identity Server and Ephemeral IAM.]
This could be a necessary approach for applications found in financial, military or critical
infrastructure systems. However, security teams must be careful when advocating for this model, as
it will have significant performance and maintenance drawbacks.
Pros Cons
[Figure: trust diagram. Legend: total trust across boundary. Labeled components include
Auth/Identity Server and Ephemeral IAM.]
This is an unlikely architecture for all but the simplest and lowest risk applications. Do not use this
trust boundary configuration unless there is no sensitive content to protect or efficiency is the only
metric for success. Trusting user input is never recommended, even in low risk applications.
Pros: Efficient
Cons: Insecure
Most applications will use a trust boundary configuration like this. Using knowledge from a risk and
attack surface analysis, security can reasonably assign trust to low risk components or processes,
and verify only when necessary. This prevents wasting valuable security resources, and also limits
the complexity and efficiency loss due to additional security overhead.
Notice in this example that the API gateway checks the auth/identity of a user, then immediately
passes the request on to the compute instance. The instance doesn't need to re-verify, and performs
its operation. However, as the compute instance is working with untrusted user inputs (designated
yellow for some trust), it is still necessary to assume an ephemeral identity to access the storage
system.
[Figure: some trust example. Legend: total trust across boundary; no trust through boundary.
Labeled components include Auth/Identity Server and Ephemeral IAM.]
By nature, this approach moderates both the pros and the cons of the two previous examples. This
model will likely be used for most applications, unless the benefits of the above examples are
necessary to meet business requirements.
Pros Cons
Note: This trust methodology diverges from Zero Trust. For a more in depth look at that topic, check
out CISA's Zero Trust Maturity Model.
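The partial trust flow described above (verify once at the API gateway, trust the hop to the compute instance, but still require a scoped ephemeral identity for storage) can be sketched as follows. Every name here is hypothetical: `VALID_TOKENS` stands in for the auth/identity server's state, and `ephemeral_iam_issue` stands in for an ephemeral IAM service, with an assumed rule that object keys are namespaced by user.

```python
from dataclasses import dataclass

# Hypothetical token table standing in for the auth/identity server.
VALID_TOKENS = {"token-alice": "alice"}


@dataclass(frozen=True)
class Request:
    token: str
    object_key: str


class Unauthorized(Exception):
    """Raised when a trust check at a boundary fails."""


def auth_server_verify(token: str) -> str:
    """Auth/identity server: map a token to a user or refuse."""
    try:
        return VALID_TOKENS[token]
    except KeyError:
        raise Unauthorized("unknown token") from None


def ephemeral_iam_issue(user: str, object_key: str) -> dict:
    """Ephemeral IAM: issue a credential scoped to one user and one object."""
    if not object_key.startswith(f"{user}/"):
        raise Unauthorized(f"{user} has no access to {object_key}")
    return {"user": user, "scope": object_key}


def compute_instance(user: str, object_key: str) -> str:
    # Trusted boundary: the gateway already authenticated `user`, so no re-check.
    # Untrusted input: `object_key` came from the user, so storage access
    # still goes through a narrowly scoped, short-lived credential.
    credential = ephemeral_iam_issue(user, object_key)
    return f"read {credential['scope']} as {credential['user']}"


def api_gateway(request: Request) -> str:
    user = auth_server_verify(request.token)  # verified once, at the edge
    return compute_instance(user, request.object_key)
```

In the no trust configuration, `compute_instance` would also call `auth_server_verify` itself instead of accepting `user` from the gateway; in the total trust configuration, neither the re-check nor the ephemeral credential would be present.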
Security Tooling
Web Application Firewall
Web application firewalls (WAF) are used to monitor or block common attack payloads (like XSS and
SQLi), or allow only specific request types and patterns. Applications should use them as a first line
of defense, attaching them to entry points like load balancers or API gateways, to handle potentially
malicious content before it reaches application code. Cloud providers curate base rule sets which
will block or monitor common malicious payloads:
By design these rule sets are generic and will not cover every attack type an application will face.
Consider creating custom rules which will fit the application's specific security needs, like:
Adding specific protections for chosen technologies and key application endpoints
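The logic of a custom rule can be illustrated in plain Python. This is only a sketch of the decision a WAF rule makes, not how any provider's WAF is configured: `ALLOWED_ROUTES`, `BLOCKED_PATTERN`, and `evaluate` are invented for the example, and the pattern is deliberately simplistic rather than a recommended rule set.

```python
import re
from dataclasses import dataclass

# Assumptions for this sketch: the allowed routes and the blocked pattern
# are invented examples, not a recommended or complete rule set.
ALLOWED_ROUTES = {("GET", "/api/orders"), ("POST", "/api/orders")}
BLOCKED_PATTERN = re.compile(r"<script\b|\bunion\s+select\b", re.IGNORECASE)


@dataclass(frozen=True)
class HttpRequest:
    method: str
    path: str
    body: str = ""


def evaluate(request: HttpRequest) -> str:
    """Return "block" or "allow" for a request, checking the allow-list first."""
    # Rule 1: only known method/endpoint pairs are allowed through.
    if (request.method.upper(), request.path) not in ALLOWED_ROUTES:
        return "block"
    # Rule 2: reject payloads matching an application-specific pattern.
    if BLOCKED_PATTERN.search(request.path) or BLOCKED_PATTERN.search(request.body):
        return "block"
    return "allow"
```

Allow-listing request types first keeps the pattern matching as a second layer, since signature-style patterns like the one above are easy to evade on their own.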
Logging and monitoring are required for a truly secure application. Developers should know exactly
what is going on in their environment, making use of alerting mechanisms to warn engineers when
systems are not working as expected. Additionally, in the event of a security incident, logging should
be verbose enough to track a threat actor through an entire application, and provide enough
knowledge for responders to understand what actions were taken against what resources. Note
that proper logging and monitoring can be expensive, and risk/cost trade-offs should be discussed
when putting logging in place.
Logging
Logging all layer 7 HTTP calls with headers, caller metadata, and responses
Payloads may not be logged depending on where logging occurs (before TLS termination)
and the sensitivity of data
Sending trace IDs through the entire request lifecycle to track errors or malicious actions
Legal and compliance representatives should weigh in on log retention times for the specific
application.
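Carrying a trace ID through the request lifecycle, as suggested above, can be sketched with the standard `logging` and `contextvars` modules. The names `trace_id_var`, `TraceIdFilter`, `build_logger`, and `handle_request` are hypothetical, and the log format is just one possible layout.

```python
import contextvars
import logging
import uuid

# Holds the trace ID for the request currently being handled.
trace_id_var = contextvars.ContextVar("trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Attach the current trace ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = trace_id_var.get()
        return True


def build_logger(name: str = "app") -> logging.Logger:
    """Create a logger whose output includes the trace ID on every line."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s trace=%(trace_id)s %(message)s")
        )
        handler.addFilter(TraceIdFilter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def handle_request(logger: logging.Logger, incoming_trace_id=None) -> str:
    """Reuse the caller's trace ID when present, otherwise mint one."""
    trace_id = incoming_trace_id or uuid.uuid4().hex
    token = trace_id_var.set(trace_id)
    try:
        logger.info("request received")
        logger.info("request completed")
        return trace_id  # would be passed downstream, e.g. in a request header
    finally:
        trace_id_var.reset(token)
```

Reusing an incoming ID rather than always minting a new one is what lets log lines from several services be joined into a single trail during an incident.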
Monitoring
Anomaly alerts:
Anomalies by count and type can vary wildly from app to app. A proper understanding of what
qualifies as an anomaly requires an environment specific baseline. Therefore, the percentages
mentioned above should be chosen based on that baseline, in addition to considerations like risk
and team response capacity.
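A baseline-relative alert threshold can be expressed in a few lines. In this sketch the function name `is_anomalous`, the use of a simple mean as the baseline, and the sample numbers are all assumptions; a real environment may need per-hour baselines or a more robust statistic.

```python
from statistics import mean


def is_anomalous(current_count: int, baseline_counts: list, threshold_pct: float) -> bool:
    """Return True when `current_count` exceeds the baseline mean by more
    than `threshold_pct` percent."""
    if not baseline_counts:
        raise ValueError("a baseline is required before alerting")
    baseline = mean(baseline_counts)
    limit = baseline * (1 + threshold_pct / 100)
    return current_count > limit
```

For example, with hypothetical hourly error counts of 100, 110, and 90 and a 50 percent threshold, the limit is 150, so `is_anomalous(151, [100, 110, 90], 50)` alerts while a count of 150 does not.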
WAFs can also have monitoring or alerting attached to them for counting malicious payloads or (in
some cases) anomalous activity detection.
DDoS Protection
Cloud service companies offer a range of simple and advanced DDoS protection products,
depending on application needs. Simple DDoS protection can often be employed using WAFs with
rate limits and route blocking rules, while more advanced protection may require specific managed
tooling offered by the cloud provider. Examples include:
AWS Shield
The decision to enable advanced DDoS protections for a specific application should be based on
the risk and business criticality of the application, taking into account mitigating factors and cost
(these services can be very inexpensive compared to large company budgets).
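The rate limits mentioned for simple DDoS protection usually reduce to counting requests per client over a time window. The following sliding window limiter is a minimal in-memory sketch; the class name and parameters are hypothetical, and a managed WAF would apply the equivalent logic at the edge rather than in application code.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client ID -> recent request times

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Passing `now` explicitly keeps the sketch deterministic; a caller would normally supply `time.monotonic()`. Keying on a client identifier such as source IP is itself an assumption, since distributed attacks spread requests across many sources.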
Self-managed tooling maintenance
Cloud providers generally offer tooling on a spectrum of management. Fully managed services leave
very little for the end developer to handle besides coding functionality, while self-managed systems
require much more overhead to maintain.
Self-managed tooling will require additional overhead by developers and support engineers.
Depending on the tool, basic version updates, upgrades to images like AMIs or Compute Images, or
other operating system level maintenance will be required. Use automation to regularly update minor
versions or images, and schedule time in development cycles for refreshing stale resources.
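As one small piece of that automation, a scheduled job could flag images that have passed an agreed age. The sketch below assumes a hypothetical inventory mapping image names to build times; `stale_images` and the 30 day limit in the usage note are illustrative choices, not provider defaults.

```python
from datetime import datetime, timedelta


def stale_images(images: dict, max_age_days: int, now: datetime) -> list:
    """Return names of images built more than `max_age_days` ago, oldest first."""
    cutoff = now - timedelta(days=max_age_days)
    stale = [(built, name) for name, built in images.items() if built < cutoff]
    return [name for built, name in sorted(stale)]
```

Calling `stale_images(inventory, 30, datetime.now(timezone.utc))` (with `timezone` imported from `datetime` and timezone-aware build times in `inventory`) would list the images due for a rebuild, which a pipeline could then use to open tickets or trigger a refresh.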
Managed services will offer some level of security, like updating and securing the underlying
hardware which runs application code. However, the development team is still responsible for
many aspects of security in the system. Ensure developers understand what security will be their
responsibility based on tool selection. The following will likely be partially or wholly the
responsibility of the developer:
Use documentation from the cloud provider to understand which party is responsible for which
aspects of security. Examples of this research for serverless functions:
AWS Lambda
Azure Functions
References
Secure Product Design