
AWS re/Start

Cloud Foundations
INTRODUCTION TO COMPUTING

Application: A set of instructions that runs on a computer to perform a specific task.


- Web: runs in a web server or application server accessed from a web browser.
- Mobile: accessed from a mobile device.
- Desktop: accessed from a desktop computer (Word processor etc.).
- Internet of Things (IoT): accessed from an appliance/specialised device connected to the
internet (Car SatNav).

Code: instructions written in a programming language (Java, Python, C++) that make up a computer program.

Computer Components (Hardware)

Motherboard: holds all of the core computer hardware components together on a printed circuit board (PCB).
Central Processing Unit (CPU): runs the instructions it receives from applications and the OS. Performs the
basic arithmetic, logical, control and input/output (I/O) operations. Can have multiple cores (to increase
performance).
Memory: holds program instructions and data for the CPU to run/use. Memory is temporary storage (data lost
when computer turned off). Also called random access memory (RAM).
Storage Drive: used to store and retrieve digital data (documents, programs, application preferences etc.).
Drive storage is persistent (data preserved when computer turned off).
Either hard disk drive (HDD) or solid state drive (SSD).
Network Interface Card/Adapter: connects a computer to a computer network (internet).

Computer Components (Software)

Operating System (OS)


Manages a computer's hardware and software and provides the user interface (Command Line
Interface/Graphical User Interface).
Examples include: Microsoft Windows, Amazon Linux 2, Google Android.

Computer Network
Connects multiple devices to share data and resources (the internet).
Wired: using an Ethernet cable to connect to a router
Wireless: connected to router using a Wi-Fi signal
BASIC COMPUTING CONCEPTS

Server
A computer that provides data or services to other computers over a network. It differs from desktop hardware:
servers typically support more memory, multiple CPUs, redundant power supplies, and a smaller form factor.

Types of server
Web: used by web applications to serve Hypertext Markup Language (HTML) pages to a requesting client.
Database: hosts database software that applications use to store/retrieve data.
Mail: used to send/receive email from and to clients.

Data Centre
Hosts all of an organisation's computer and networking equipment (servers, storage devices, network devices,
cooling equipment, uninterruptible power supplies) in a physical location.

Virtual Machines
Runs on a physical computer (host). A software layer (hypervisor) provides access to the resources of the
physical computer (CPU, RAM, disk, network) to the VM. Multiple VMs can be provisioned on a single host. A
fundamental unit of computing in the cloud.

Virtualisation: the ability to create multiple VMs, each with its own OS and applications, on a single physical
machine.

Benefits
- Cost savings
- Efficiency
- Reusability and portability (able to duplicate a VM image on one or more physical hosts)

Software Development Life Cycle (SDLC)


Plan - Goals are identified along with the resources required to implement them.
Analyse - Project requirements are defined and documented (software requirement specification) for the
customer to approve.
Design - User requirements are translated into a technical design and proposed in a design specification document.
Develop - The code for the application is written according to the organisation's development standards and guidelines.
Test - Application components are validated to confirm that they function as intended (testing also uncovers and
corrects defects before release).
Implement - The application is released and used in production.
Maintain - The application must be monitored to ensure correct operation.

Development Team Roles

Project Manager: Develops plan, recruits staff, leads and manages team.
Analyst: Defines purpose of project, gathers and organises requirements into tasks for developers.
Quality Assurance: Runs all tests and investigates any failures.
Software Developer: Writes the code that makes up the application according to the specifications.
Data Administrator: Maintains the data that is needed in the application.
WHAT IS CLOUD COMPUTING

Cloud Computing
The on-demand delivery of IT resources online with pay-as-you-go pricing. The cloud is a computer that is
located somewhere else, accessed through the internet, and used in some way. The cloud comprises server
computers in large data centres in different locations around the world. It enables organisations to not have to
build, operate, and improve infrastructure of their own.

Uses
- Backup and storage
- Content delivery with high speed worldwide
- Hosting static/dynamic websites

Cloud Service Models

Infrastructure as a Service (IaaS)


You manage the server (physical or virtual) and the OS. The data centre provider has no access to the server.

Platform as a Service (PaaS)


Someone else manages the underlying hardware and OS. You can run applications without managing the
underlying infrastructure (patching, updates, maintenance, hardware, OS).

Software as a Service (SaaS)


You manage your files while the service provider manages all data centres, servers, networks, storage,
maintenance and patching (you only handle the software).

Cloud Computing Deployment Models

Cloud
A cloud-based application that is fully deployed in the cloud (all parts of the application run in the cloud).

Hybrid
A way to connect infrastructure and applications between cloud-based resources and existing resources that
are not in the cloud. The existing infrastructure is often located on premises in an organisation's data centre, and
the hybrid model extends it into the cloud.
Private (on-premises)
Cloud infrastructure that runs from your own data centre, which can use application management and virtualisation
to increase resource utilisation.

What can you do in the cloud?

You can use a cloud computing platform for the following:


•Application hosting for an on-demand infrastructure that hosts internal or SaaS applications
•Backup and storage capability to store data and build dependable backup solutions
•Content delivery to distribute content worldwide with high data transfer speeds
•Hosting static and dynamic websites
•Enterprise IT to host internal-facing or external-facing IT applications in the AWS Cloud
•Various scalable database solutions from hosted enterprise database software to non-relational database
solutions
ADVANTAGES OF CLOUD COMPUTING

How does cloud computing benefit you?


Cloud computing gives you access to servers, storage, databases, and a broad set of application services over
the internet. Cloud storage is a good example of cloud computing. Cloud storage gives you the option to free
up memory (space) on your computer or mobile device. Imagine a situation where your mobile device runs out
of memory when you want to download and save a new song, photo, or video.

If you have a business, how can cloud computing benefit your business?
Cloud computing or cloud services providers like Amazon Web Services (AWS) provide rapid access to flexible
and low-cost IT resources. With cloud computing, you don’t need to make large upfront investments in
hardware. As a business owner, you don’t need to purchase a physical location, servers, storage, or databases.

Why are so many companies interested in moving to the cloud?


Companies are moving to the cloud because it presents many benefits including cost savings because you pay
only for the resources that you use.

Trading fixed expense for variable expense

Fixed expense: Funds that a company uses to acquire, upgrade, and maintain physical assets, such as property,
industrial buildings, or equipment.
Variable expense: An expense that the person who bears the cost can alter or avoid.

By using the cloud, businesses don’t need to invest money into data centres and servers. They can pay for only
what they use, and they pay only when they use these resources (which are also known as pay as you go).
Businesses save money on technology. They can adapt to new applications, with as much capacity as they need, in
minutes instead of days or weeks. Maintenance is reduced so that the business can focus on its core goals.

Massive economies of scale


By using cloud computing, you can achieve a lower variable cost than you can get on your own. Because usage
from hundreds of thousands of customers is aggregated in the cloud, providers such as AWS can achieve
higher economies of scale. These economies translate into lower, pay-as-you-go prices.

Reduced guessing about capacity


You can reduce guessing about your infrastructure capacity needs. When you make a capacity decision before
you deploy an application, you often have either expensive idle resources or insufficient capacity. Cloud
computing reduces these problems. You can access as many or as few resources as you need, and you can
scale up and down as required with only a few minutes’ notice.

Increased speed and agility

Rapid availability of new resources


•Provision resources in minutes, not weeks.

Increased innovation
•Perform quick, low-cost experimentation.
•Use prefabricated functionality without requiring in-house expertise (such as data warehousing and
analytics).
Increased experimentation
•Explore new avenues of business with minimal risk and expense.
•Test with different configurations.

No more expenses for running and maintaining data centres


Running and maintaining data centres is expensive and time consuming. Focus on projects that differentiate
your business instead of focusing on the infrastructure. With cloud computing, you can focus on your
customers instead of focusing on the tasks of racking, stacking, and powering servers.

Going global in minutes


You can deploy your application in multiple AWS Regions around the world with a few clicks. As a result, you
can provide lower latency and a better experience for your customers at minimal cost.
WHAT IS AWS?

AWS is a cloud services provider. AWS offers a broad set of global cloud-based products—which are also
known as services—that are designed to work together.
AWS offers three different models of cloud services: infrastructure as a service (IaaS), platform as a service
(PaaS), and software as a service (SaaS). All of these services are on the AWS Cloud.

With IaaS, you manage the server, which can be physical or virtual, and the operating system (Microsoft
Windows or Linux). In general, the data centre provider has no access to your server.

Basic building blocks for cloud IT include the following:
•Networking features
•Compute
•Data storage space

With PaaS, someone else manages the underlying hardware and operating systems. Thus, you can run
applications without managing underlying infrastructure (patching, updates, maintenance, hardware, and
operating systems). PaaS also provides a framework for developers that they can build on to create
customized applications.

With SaaS, you manage your files, and the service provider manages all data centres, servers, networks,
storage, maintenance, and patching. Your concern is only the software and how you want to use it. You are
provided with a complete product that the service provider runs and manages. Facebook and Dropbox are
examples of SaaS. You manage your Facebook contacts and Dropbox files, and the service providers manage
the systems.

Comparison: On-premises and AWS infrastructure

AWS services and features:
•Amazon Elastic Block Store (Amazon EBS)
•Amazon Elastic File System (Amazon EFS)
•Amazon Machine Image (AMI)
•Amazon Relational Database Service (Amazon RDS)
•Amazon Simple Storage Service (Amazon S3)
•Amazon Virtual Private Cloud (Amazon VPC)
•AWS Identity and Access Management (IAM)
•Network access control lists (network ACLs)

On-premises equivalents:
•Access control lists (ACLs)
•Direct-attached storage (DAS)
•Network-attached storage (NAS)
•Storage area network (SAN)
•Relational database management system (RDBMS)
What are web services?
A web service is any piece of software that makes itself available over the internet. It uses a standardized
format, either Extensible Markup Language (XML) or JavaScript Object Notation (JSON), for the request and
the response of an application programming interface (API) interaction.

AWS services

Commonly used services

Three ways to interact with AWS

AWS Management Console


The console provides an easier-to-use graphical interface. You can also access the console through a mobile app.

AWS Command Line Interface (AWS CLI)


With the AWS CLI, you have access to services by discrete commands or scripts.
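For example, a couple of read-only commands, assuming the AWS CLI is installed and credentials are already configured (the Region shown is only an illustration):

# Show the account and identity that the CLI is currently using
aws sts get-caller-identity

# List the EC2 instances in a chosen Region
aws ec2 describe-instances --region eu-west-1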

AWS Software Development Kits (SDKs)


Access services directly from your code (such as Java, Python, and others).
AWS Cloud Adoption Framework (AWS CAF)

AWS CAF provides the following:


•Guidelines for establishing, developing, and running AWS environments
•Perspectives in planning, creating, managing, and supporting a modern IT service
•Structure for business and IT teams to work together
FUNDAMENTALS OF AWS PRICING

AWS pricing model

Three fundamental drivers of cost with AWS

Compute
•Calculated either by the hour or the second.
•Varies by instance type

Storage
•Charged typically per GB

Data transfer
•Outbound is aggregated and charged
•Inbound has no charge (with some exceptions)
•Charged typically per GB

In most cases, you won’t be charged for inbound data transfer or for data transfer between other AWS
services in the same AWS Region.

Pay for what you use


Pay only for the services that you consume with no large upfront expenses.

Pay less when you reserve

Reserved Instances are available in three options:


1. An All Upfront Reserved Instance (AURI) provides the largest discount.
2. A Partial Upfront Reserved Instance (PURI) provides lower discounts.
3. A No Upfront Payments Reserved Instance (NURI) provides smaller discounts.

Pay less by using more

Realize volume-based discounts:


•Save more as usage increases.
•Services like Amazon Simple Storage Service (Amazon S3), Amazon Elastic Block Store (Amazon EBS), or
Amazon Elastic File System (Amazon EFS) have tiered pricing – The more you use, the less you pay per GB.
•Multiple storage services deliver lower storage costs based on needs.
Pay less as AWS grows

•As AWS grows, AWS focuses on lowering the cost of doing business.
•This practice results in AWS passing savings from economies of scale to you.
•Since 2006, AWS has lowered pricing more than 75 times and continues to do so.
•Future higher-performing resources replace current resources for no extra charge.

AWS Free Tier


You can use the AWS Free Tier to gain hands-on experience with the AWS Cloud, products, and services at no
charge. This tier is available for 1 year to new customers.

Services with no charge

Amazon Virtual Private Cloud (Amazon VPC)


AWS Elastic Beanstalk**
Amazon EC2 Auto Scaling**
AWS CloudFormation**
AWS Identity and Access Management (IAM)

**Note: Some charges might be associated with other AWS services that are used with these services.

AWS Pricing Calculator


You can use the AWS Pricing Calculator to do the following:
•Estimate monthly costs.
•Identify opportunities to reduce monthly costs.
•Use templates to compare services and deployment models.

Total cost of ownership


TCO is a financial estimate to help identify direct and indirect costs of a system.

Use TCO to do the following:


•Compare the costs of running an entire infrastructure environment or specific workload on premises as
compared to the AWS Cloud.
•Budget and build the business case for moving to the cloud.

TCO considerations
AWS INFRASTRUCTURE OVERVIEW

AWS Global Infrastructure


The AWS Global Infrastructure is designed and built to deliver a flexible, reliable, scalable, and secure cloud
computing environment with high-quality global network performance.

AWS Global Infrastructure elements

AWS data centres


A data centre is a location where the actual physical data resides and data processing occurs. AWS data
centres are built in clusters in various global Regions.

Data centres are securely designed with several factors in mind:


•Each location is carefully evaluated to mitigate environmental risk.
•Data centres have a redundant design that anticipates and tolerates failure while maintaining service levels.
•To help ensure availability, critical system components are backed up across multiple isolated locations that
are known as Availability Zones.
•To help ensure capacity, AWS continuously monitors service usage to deploy infrastructure to support
availability commitments and requirements.
•Data centre locations are not disclosed, and all access to them is restricted.
•In case of failure, automated processes move customer data traffic away from the affected area.

AWS Availability Zones

•Each Availability Zone is made up of one or more data centres.


•Availability Zones are designed for fault isolation.
•Availability Zones are interconnected with other Availability Zones by using high-speed private links.
•You choose your Availability Zones.
•AWS recommends replicating across Availability Zones for resiliency.

AWS Regions

•An AWS Region is a geographical area.


•Each Region is made up of two or more Availability Zones.
•AWS has 24 Regions worldwide.
•You activate and control data replication across Regions.
•Communication between Regions uses the AWS backbone network infrastructure.
Selecting a Region

Points of presence
AWS provides a global network of 216 PoP locations.
•The PoPs consist of 205 edge locations and 11 Regional edge caches.
•PoPs are used with Amazon CloudFront, a global content delivery network (CDN) that delivers content to end
users with reduced latency.
•Regional edge caches are used for content with infrequent access.

AWS infrastructure features

Elastic and scalable:


•Elastic infrastructure that dynamically adapts to capacity needs
•Scalable infrastructure that adjusts to accommodate growth

Fault-tolerant:
•Continues operating properly in the presence of a failure
•Includes built-in redundancy of components

Highly available:
•High level of operational performance with reduced downtime
AWS SERVICES AND SERVICE CATEGORIES

Storage service category

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers scalability,
data availability, security, and performance. Use it to store and protect any amount of data for
websites and mobile apps. It is also used for backup and restore, archive, enterprise
applications, Internet of Things (IoT) devices, and big data analytics.
Amazon Elastic Block Store (Amazon EBS) is high-performance block storage that is designed
for use with Amazon EC2 for both throughput-intensive and transaction-intensive workloads. It
is used for various workloads; such as relational and non-relational databases, enterprise
applications, containerized applications, big data analytics engines, file systems, and media
workflows.
Amazon Elastic File System (Amazon EFS) provides a scalable, fully managed elastic Network
File System (NFS) file system for AWS Cloud services and on-premises resources. It is built to
scale on demand to petabytes, growing and shrinking automatically as you add and remove
files. Using Amazon EFS reduces the need to provision and manage capacity to accommodate
growth.
Amazon Simple Storage Service Glacier is a secure, durable, and low-cost Amazon S3 cloud
storage class for data archiving and long-term backup. It is designed to deliver 11 9s
(99.999999999 percent) of durability and to provide comprehensive security and compliance
capabilities to meet stringent regulatory requirements.

Compute service category

Amazon Elastic Compute Cloud (Amazon EC2) provides resizable compute capacity as virtual
machines in the cloud.

Amazon EC2 Auto Scaling gives you the ability to automatically add or remove EC2 instances
according to conditions that you define.

AWS Elastic Beanstalk is a service for deploying and scaling web applications and services. It
deploys them on familiar servers such as Apache HTTP Server and Microsoft Internet

Information Services (IIS).

AWS Lambda gives you the ability to run code without provisioning or managing servers. You
pay for only the compute time that you consume, so you won’t be charged when your code
isn’t running.

Containers service category

Amazon Elastic Container Service (Amazon ECS) is a highly scalable, high-performance container orchestration
service that supports Docker containers.

Amazon Elastic Container Registry (Amazon ECR) is a fully managed Docker container registry
that facilitates storing, managing, and deploying Docker container images.
Amazon Elastic Kubernetes Service (Amazon EKS) facilitates deploying, managing, and scaling
containerized applications that use Kubernetes on AWS.

AWS Fargate is a compute engine for Amazon ECS that you can use to run containers without
managing servers or clusters.

Database service category

Amazon Relational Database Service (Amazon RDS) facilitates setting up, operating, and scaling
a relational database in the cloud. It provides resizable capacity while automating time-
consuming administration tasks, such as hardware provisioning, database setup, patching, and
backups.

Amazon Aurora is a relational database that is compatible with MySQL and PostgreSQL. It is up
to five times faster than standard MySQL databases and three times faster than standard
PostgreSQL databases.

Amazon Redshift gives you the ability to run analytic queries against petabytes of data that is
stored locally in Amazon Redshift. You can also run queries directly against exabytes of data that
are stored in Amazon S3. Amazon Redshift delivers fast performance at any scale.

Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond
performance at any scale with built-in security, backup and restore, and in-memory caching.

Networking and Content Delivery service category

Amazon Virtual Private Cloud (Amazon VPC) gives you the ability to provision logically isolated
sections of the AWS Cloud.

Elastic Load Balancing automatically distributes incoming application traffic across multiple
targets, such as Amazon EC2 instances, containers, IP addresses, and Lambda functions.

Amazon CloudFront is a fast content delivery network (CDN) service that securely delivers data,
videos, applications, and application programming interfaces (APIs) to customers globally. It has
low latency and high transfer speeds.

AWS Transit Gateway is a service that customers can use to connect their virtual private clouds
(VPCs) and their on-premises networks to a single gateway.

Amazon Route 53 is a scalable, cloud Domain Name System (DNS) web service. It is designed to
give you a reliable way to route end users to internet applications. Route 53 translates names
(like www.example.com) into the numeric IP addresses (like 192.0.2.1) that computers use to
connect to each other.
AWS Direct Connect provides a way to establish a dedicated private network connection from
your data centre or office to AWS. Using AWS Direct Connect can reduce network costs and
increase bandwidth throughput.

AWS Client VPN provides a secure private tunnel from your network or device to the AWS global
network.

Security, Identity, and Compliance service category

AWS Identity and Access Management (IAM) gives you the ability to manage access to AWS
services and resources securely. By using IAM, you can create and manage AWS users and groups.
You can use IAM permissions to allow and deny user and group access to AWS resources.

AWS Organizations permits you to restrict what services and actions are allowed in your accounts.

Amazon Cognito gives you the ability to add user sign-up, sign-in, and access control to your web
and mobile apps.

AWS Artifact provides on-demand access to AWS security and compliance reports and select
online agreements.

AWS Key Management Service (AWS KMS) provides the ability to create and manage keys. You
can use AWS KMS to control the use of encryption across a wide range of AWS services and in
your applications.

AWS Shield is a managed distributed denial of service (DDoS) protection service that safeguards
applications running on AWS.

AWS Cost Management service category

AWS Cost and Usage Report contains the most comprehensive set of AWS cost and usage data
available. It includes additional metadata about AWS services, pricing, and reservations.

AWS Budgets provides the ability to set custom budgets that alert you when your costs or usage
exceeds (or will likely exceed) your budgeted amount.

AWS Cost Explorer has an easy-to-use interface that you can use to visualize, understand, and
manage your AWS costs and usage over time.
Management and Governance service category

AWS Management Console provides a web-based user interface for accessing your AWS
account.

AWS Config provides a service that helps you track resource inventory and changes.

Amazon CloudWatch gives you the ability to monitor resources and applications.

AWS Auto Scaling provides features that you can use to scale multiple resources to meet
demand.

AWS Command Line Interface (AWS CLI) provides a unified tool to manage AWS services.

AWS Trusted Advisor helps you optimize performance and security.

AWS Well-Architected Tool provides help in reviewing and improving your workloads.

AWS CloudTrail tracks user activity and API usage.


AWS SHARED RESPONSIBILITY MODEL

AWS security responsibilities: Security OF the cloud

AWS handles the security of the physical infrastructure that hosts your resources. This infrastructure includes
the following:
•Physical security of data centres with controlled, need-based access, located in nondescript facilities. The
physical security includes 24/7 security guards, two-factor authentication, access logging and review, video
surveillance, and disk degaussing and destruction.
•Hardware infrastructure, which includes servers, storage devices, and other appliances that AWS services rely
on.
•Software infrastructure that hosts operating systems, service applications, and virtualization software.
•Network infrastructure, which includes routers, switches, load balancers, firewalls, and cabling. This facet of
security includes nearly continuous network monitoring at external boundaries, secure access points, and
redundant infrastructure with intrusion detection.
•Virtualization infrastructure, including instance isolation.

Customer security responsibilities: Security IN the cloud

When customers use AWS services, they maintain complete control over their content. Customers are
responsible for managing critical content security requirements, including the following:
•Which content they choose to store on AWS.
•Which AWS services are used with the content.
•Which country that content is stored in.
•The format and structure of that content and whether it is masked, anonymized, or encrypted.
•Who has access to that content and how those access rights are granted, managed, and revoked.

Customers retain control of the security that they choose to protect their data, environment, applications,
AWS Identity and Access Management (IAM) settings, and operating systems. Thus, the shared responsibility
model changes depending on the AWS services that the customer decides to implement.
Service characteristics and security responsibility

Infrastructure as a service (IaaS)


•The customer has more flexibility over configuring networking and storage settings.
•The customer is responsible for managing more aspects of the security.
•The customer configures the access controls.

Platform as a service (PaaS)


•The customer doesn’t need to manage the underlying infrastructure.
•AWS handles the operating system, database patching, firewall configuration, and disaster recovery (DR)
•The customer can focus on managing code or data.

Software as a service (SaaS)


•Software is centrally hosted.
•It is licensed on a subscription model or pay-as-you-go basis.
•Services are typically accessed through a web browser, mobile app, or application programming interface
(API)
•Customers don’t need to manage the infrastructure that supports the service.
AMAZON SIMPLE STORAGE SERVICE (AMAZON S3)

•Data is stored as objects in buckets.


•Storage is virtually unlimited – A single object can be up to 5 TB.
•Amazon S3 is designed for 11 9s (99.999999999 percent) of durability.
•Customers have granular access control over buckets and objects.
By default, you can create up to 100 buckets.

Amazon S3 features

•You can store virtually as many objects as you want.


•By default, your data is private, and you can optionally encrypt it.
•Data is stored redundantly.
•You can retrieve data anytime from anywhere over the internet.
•Bucket names must be unique across all existing bucket names in Amazon S3.

Amazon S3 storage classes


Amazon S3 offers a range of object-level storage classes that are designed for different use cases:

•Amazon S3 Standard
Designed to provide high-durability, high-availability, and high-performance object storage for frequently
accessed data. Because it delivers low latency and high throughput, Amazon S3 Standard is appropriate for
many use cases. These use cases include cloud applications, dynamic websites, content distribution, mobile
and gaming applications, and big data analytics.

•Amazon S3 Intelligent-Tiering
Designed to optimize costs. It automatically moves data to the most cost-effective access tier without affecting
performance or operational overhead. For a small monthly monitoring and automation fee per object, Amazon
S3 monitors access patterns of the objects in Amazon S3 Intelligent-Tiering. It then moves the objects that
haven’t been accessed for 30 consecutive days to the Infrequent Access tier. If an object in the Infrequent
Access tier is accessed, it is automatically moved back to the Frequent Access tier. The Amazon S3 Intelligent-
Tiering storage class doesn’t charge retrieval fees when you use it. Also, it doesn’t charge additional fees when
objects are moved between access tiers. It works well for long-lived data with access patterns that are
unknown or unpredictable.

•Amazon S3 Standard-Infrequent Access (Amazon S3 Standard-IA)


Used for data that is accessed less frequently but requires rapid access when needed. Amazon S3 Standard-IA
is designed to provide the high durability, high throughput, and low latency of Amazon S3 Standard. With
these benefits, it also offers a low per-GB storage price and per-GB retrieval fee. This combination of low cost
and high performance makes Amazon S3 Standard-IA a good choice for long-term storage and backups. Thus,
it also works well as a data store for disaster recovery (DR) files.

•Amazon S3 One Zone-Infrequent Access (Amazon S3 One Zone-IA)


For data that is accessed less frequently but requires rapid access when needed. Unlike other Amazon S3
storage classes, which store data in at least three Availability Zones, Amazon S3 One Zone-IA stores data in one
Availability Zone. It costs less than Amazon S3 Standard-IA. Amazon S3 One Zone-IA works well for customers
who want a lower-cost option for infrequently accessed data, but don’t require the availability and resilience
of Amazon S3 Standard or Amazon S3 Standard-IA. It is a good choice for storing secondary backup copies of
on-premises data or easily re-creatable data. You can also use it as cost-effective storage for data that is
replicated from another AWS Region by using Amazon S3 Cross-Region Replication.
•Amazon Simple Storage Service Glacier
A secure, durable, and low-cost storage class for data archiving. You can reliably store virtually any amount of
data at costs that are competitive with or cheaper than on-premises solutions. To keep costs low but suitable
for varying needs, Amazon S3 Glacier provides three retrieval options that range from a few minutes to hours.
You can upload objects directly to Amazon S3 Glacier.

•Amazon S3 Glacier Deep Archive


The lowest-cost storage class for Amazon S3. It supports long-term retention and digital preservation for data
that might be accessed once or twice in a year. It is designed for customers, particularly in highly regulated
industries, such as financial services, healthcare, and public sectors. These customers typically retain datasets
for 7–10 years (or more) to meet regulatory compliance requirements. Amazon S3 Glacier Deep Archive is a
cost-effective and easy-to-manage alternative to magnetic tape systems, whether these tape systems are on-
premises libraries or off-premises services. All objects that are stored in Amazon S3 Glacier Deep Archive are
replicated and stored across at least three geographically dispersed Availability Zones. These objects can be
restored within 12 hours.

Access the data anywhere


You can access Amazon S3 through the AWS Management Console, AWS Command Line Interface (AWS CLI),
or AWS Software Development Kits (SDKs). Additionally, you can access the data in your bucket directly by
using REST-based endpoints, which support Hypertext Transfer Protocol (HTTP) or Secure HTTP (HTTPS) access.
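For example, a short sketch of copying an object with the AWS CLI (the bucket and file names are hypothetical, and the CLI is assumed to be installed and configured):

# Upload a local file as an object in a bucket
aws s3 cp report.csv s3://my-example-bucket/reports/report.csv

# Download the object again to a new local file
aws s3 cp s3://my-example-bucket/reports/report.csv ./report-copy.csv

# List the objects under the reports/ prefix
aws s3 ls s3://my-example-bucket/reports/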

Amazon S3 bucket and object URL structure
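As an illustration (the bucket name, Region, and key below are hypothetical), an object can typically be addressed in two ways:

Virtual-hosted-style URL: https://my-example-bucket.s3.us-east-1.amazonaws.com/media/photo.jpg
Path-style URL: https://s3.us-east-1.amazonaws.com/my-example-bucket/media/photo.jpg

In both forms, my-example-bucket is the bucket name, us-east-1 is the Region, and media/photo.jpg is the object key.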

Redundancy in Amazon S3
When you create a bucket in Amazon S3, it is associated with a specific AWS Region. Whenever you store data
in the bucket, it is redundantly stored across multiple AWS facilities in your selected Region.
Amazon S3 is designed to durably store your data even if two AWS facilities experience concurrent data loss.

Seamless scaling
Amazon S3 automatically manages the storage behind your bucket even when your data grows. Because of
this system, you can get started immediately and have your data storage grow with your application needs.
Amazon S3 also scales to handle a high volume of requests. You don’t need to provision the storage or
throughput, and you are billed for only what you use.
Common use cases for Amazon S3

•Storage for application assets


•Static web hosting
•Backup and disaster recovery (DR)
•Staging area for big data

Amazon S3 pricing

Pay for only what you use:


•GBs per month
•Transfer out to other Regions
•PUT, COPY, POST, LIST, and GET requests

You do not have to pay for the following:


•Transfer in to Amazon S3
•Transfer out from Amazon S3 to Amazon CloudFront or Amazon EC2 in the same Region

Amazon S3 cost estimation

To estimate Amazon S3 costs, consider the following factors:

 Storage class type
- Standard storage is designed for the following:
» 11 9s of durability
» 4 9s of availability
- Standard-Infrequent Access (S-IA) is designed for the following:
» 11 9s of durability
» 3 9s of availability

 The amount of storage
- The number and size of objects

 Requests
- The number and type of requests (GET, PUT, COPY)
- There are different rates for GET requests than for other requests.

 Data transfer
- Pricing is based on the amount of data that is transferred out of the S3 Region.
- Data transfer in is free, but you incur charges for data that is transferred out.
AMAZON ELASTIC COMPUTE CLOUD (AMAZON EC2)

AWS runtime compute choices

•Virtual machines (VMs) - Amazon EC2 & Amazon Lightsail


•Containers - Amazon Elastic Container Service (Amazon ECS)
•Platform as a service (also known as PaaS) - AWS Elastic Beanstalk
•Serverless - AWS Lambda & AWS Fargate

Amazon EC2

Common uses for EC2 instances include the following:


•Application servers
•Web servers
•Database servers
•Game servers
•Mail servers
•Media servers
•Catalogue servers
•File servers
•Computing servers
•Proxy servers

Amazon EC2 overview

•Amazon EC2 provides virtual machines—referred to as EC2 instances—in the cloud.


•With Amazon EC2, you have full control over the guest operating system (OS) —either Microsoft Windows or
Linux —on each instance.
•You can launch instances of any size into an Availability Zone anywhere in the world.
•Launch instances from Amazon Machine Images (AMIs).
•Launch instances with a few clicks or a line of code, and they are ready in minutes.
•You can control traffic to and from instances.

The EC2 in Amazon EC2 stands for Elastic Compute Cloud:


•Elastic refers to the fact that you can automatically increase or decrease the number of servers that you run
to support an application. You can also increase or decrease the size of existing servers.
•Compute refers to the reason why most users run servers: to host running applications or process data. These
actions require compute resources, including processing power (central processing unit, or CPU) and memory
(random access memory, or RAM).
•Cloud refers to the fact that the EC2 instances that you run are hosted in the cloud.

Launching an EC2 instance

1. AMI
An AMI:
•Is a template that is used to create an EC2 instance (which is a virtual machine, or VM, that runs in the AWS
Cloud)
•Contains a Windows or Linux operating system
•Often also has some software preinstalled

AMI choices:
•Quick Start –Linux and Windows AMIs that are provided by AWS
•My AMIs –Any AMIs that you created
•AWS Marketplace –Preconfigured templates from third parties
•Community AMIs –AMIs shared by others; use at your own risk

AMI Benefits

Repeatability
•Use an AMI to launch instances repeatedly with efficiency and precision.

Reusability
•Instances that are launched from the same AMI are identically configured.

Recoverability
•You can create an AMI from a configured instance as a restorable backup.
•You can replace a failed instance by launching a new instance from the same AMI

2. Instance type
The instance type that you choose determines the following:
•Memory (RAM)
•Processing power (CPU)
•Disk space and disk type (storage)
•Network performance

The following are instance type categories:


•General purpose
•Compute optimized
•Memory optimized
•Storage optimized
•Accelerated computing

Instance types offer family, generation, and size.

EC2 instance type naming and sizes

Naming

•Example: t3.large
•t is the family name.
•3 is the generation number.
•large is the size.

Sizes

Instance Name vCPU Memory (GB) Storage


t3.nano 2 0.5 EBS-only
t3.micro 2 1 EBS-only
t3.small 2 2 EBS-only
t3.medium 2 4 EBS-only
t3.large 2 8 EBS-only
t3.xlarge 4 16 EBS-only
t3.2xlarge 8 32 EBS-only
Instance type use cases

T3 - websites/applications, build servers, test and staging environments, microservices, code repositories.
C5 - scientific modelling, batch processing, ad serving, highly scalable multiplayer gaming, video encoding.
R5 - high-performance databases, data mining, applications that perform real-time processing of big data.

3. Network settings

•Where should the instance be deployed?


•Identify the virtual private cloud (VPC) and optionally the subnet.

•Should a public IP address be automatically assigned?


•Make it internet accessible.

4. IAM role (optional)

•Does software on the EC2 instance need to interact with other AWS services?
•If yes, attach an appropriate AWS Identity and Access Management (IAM) role.

•An IAM role that is attached to an EC2 instance is kept in an instance profile.

•You are not restricted to attaching a role only at instance launch.


•You can also attach a role to an instance that already exists.
5. User data script (optional)

•Optionally, specify a user data script at instance launch.

•Use user data scripts to customize the runtime environment of your instance.
•Script runs the first time the instance starts.

•User data scripts can be used strategically


•For example, reduce the number of custom AMIs that you build and maintain.
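A minimal sketch of a user data script, assuming an Amazon Linux 2 AMI; it installs and starts the Apache web server the first time the instance boots (the test page content is only an illustration):

#!/bin/bash
# Runs as root on first boot of the instance
yum update -y
yum install -y httpd
systemctl enable httpd
systemctl start httpd
# Hypothetical test page
echo "Hello from $(hostname -f)" > /var/www/html/index.html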

6. Storage options

•Configure the root volume where the guest operating system is installed.

•Attach additional storage volumes (optional).


•An AMI might already include more than one volume.

•For each volume, specify the following:


•The size of the disk (in GB)
•The volume type
•Different types of solid state drives (SSDs) and hard disk drives (HDDs) are available.

•If the volume will be deleted when the instance is terminated

•If encryption should be used

Amazon EC2 storage options

Amazon Elastic Block Store (Amazon EBS):


•Amazon EBS is a service that provides durable, block-level storage volumes.
•You can stop the instance and start it again, and the data will still be there.

Amazon EC2 Instance Store:


•Ephemeral storage is provided on disks that are attached to the host computer where the EC2 instance is
running.
•If the instance stops, data that’s stored here is deleted.

Other options for storage (not for the root volume):


•Mount an Amazon Elastic File System (Amazon EFS) file system
•Connect to Amazon Simple Storage Service (Amazon S3)
7. Tags

•A tag is a label that you can assign to an AWS resource.


•It consists of a key and an optional value.

•Tagging is how you can attach metadata to an EC2 instance.

•Potential benefits of tagging include filtering, automation, cost allocation, and access control.

8. Security group

•A security group is a set of firewall rules that control traffic to the instance.
•It exists outside of the instance's guest OS.

•Create rules that specify the source and which ports that network communications can use.
•Specify the port number and the protocol, such as Transmission Control Protocol (TCP), User
Datagram Protocol (UDP), or Internet Control Message Protocol (ICMP).
•Specify the source (for example, an IP address or another security group) that is allowed to use the
rule.
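For example, a rule that allows inbound SSH (TCP port 22) from a single source address could be added with the AWS CLI; the security group ID and address below are hypothetical placeholders:

# Allow inbound SSH from one address only
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 22 \
    --cidr 203.0.113.10/32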

9. Key pair

•At instance launch, you specify an existing key pair or create a new key pair.

•A key pair consists of the following:


•A public key that AWS stores
•A private key file that you store

•It provides secure connections to the instance.

•For Windows AMIs, use the private key to obtain the administrator password that you need to log in to your
instance.
•For Linux AMIs, use the private key to use SSH to securely connect to your instance.
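The launch settings described in steps 1 to 9 can also be supplied in a single AWS CLI call; a minimal sketch in which every ID and name is a hypothetical placeholder:

# Launch one t3.micro instance from an AMI with a key pair, security group,
# subnet, user data script, and Name tag
aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --instance-type t3.micro \
    --key-name my-key-pair \
    --security-group-ids sg-0123456789abcdef0 \
    --subnet-id subnet-0123456789abcdef0 \
    --user-data file://setup.sh \
    --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=web-server}]'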
EC2 instance lifecycle

Amazon EC2 pricing models

On-Demand Instances
•Pay by the hour
•No long-term commitments
•Eligible for the AWS Free Tier

Dedicated Hosts
•A physical server with EC2 instance capacity that is fully dedicated to your use

Dedicated Instances
•Instances that run in a VPC on hardware that is dedicated to a single customer

Reserved Instances
•Full, partial, or no upfront payment for the instance that you reserve
•Discount on hourly charge for that instance
•1-year or 3-year term

Scheduled Reserved Instances


•Capacity reservations available on a recurring schedule that you specify
•1-year term

Spot Instances
•Run when they are available and your bid is above the market price
•Can be interrupted by AWS with a 2-minute notification
•Include the following interruption options: terminated, stopped, or hibernated
•Can be significantly less expensive than On-Demand Instances
•Are a good choice when you have flexibility in when your applications can run

Per-second billing is available for On-Demand Instances, Reserved Instances, and Spot Instances that run
Amazon Linux or Ubuntu.
Amazon EC2 pricing models: Benefits
AWS re/Start

Linux
INTRODUCTION TO LINUX

The Linux operating system

- Open source: code publicly available (users can modify/expand)


- Supports multiple users and multi-tasking
- Built to handle networking
- Provides system tools/utilities

Distributions
A Linux distribution is a packaged version of Linux that a group of individuals or a company develops. It
includes the core operating system functionality (kernel) and additional complementary tools and software
applications.
Typically downloaded and installed in various formats; for example, Amazon Linux 2 is installed by using an
Amazon Machine Image (AMI).
Examples: Fedora (sponsored by Red Hat and the source of RHEL, from which Amazon Linux 2 is derived), Debian, and OpenSUSE.

Linux Components

Kernel: refers to the core component of an operating system. Controls everything in the operating system,
including the allocation of CPU time and memory storage to running programs, and access to peripheral
devices.
Daemons: a computer program that runs in the background and is not under the control of an interactive user.
It typically provides a service to other running programs. Daemon process names traditionally end with the letter d.
Examples: syslogd (stores system messages in log files), sshd (handles secure shell communication between client and server).
Applications: software that provides a set of functions that help a user perform a type of task or activity (word
processor/web browser/email client/media player).
Data files: contain the information that programs use and can have different types of data (music file/text
file/image file). Typically grouped in directories.
Each file has a name that uniquely identifies it: [directoryName]fileName[.extension]
Configuration files: a special type of file that stores initial settings or important values for a system program.
These values configure the behaviour of the associated program or capture the data that the program uses.

Linux User Interface

Command Line Interface (CLI)

- Consumes fewer hardware resources than GUI


- Can be automated with scripts
- Provides more options than GUI
- Most Linux servers only use CLI

When you use the CLI, the shell that you select defines the list of commands and functions that you can run. A
shell interprets the command that you type and invokes the appropriate kernel component that runs the
command.
Bash Shell is the default Linux shell.

Graphical User Interface (GUI)

- Visual and intuitive to navigate


- Similar in Linux, Windows and MacOS
- Most user workstations use the GUI
Linux Documentation

Manual pages
Contain a description of the purpose, syntax, and options for a Linux command. You access the man pages by
using the man command.
The man command displays documentation information for the command that you specify as its argument.

This information includes the following:

- Name: The name and a brief description of the purpose of the command
- Synopsis: The syntax of the command
- Description: A detailed description of the command’s usage and functions
- Options: An explanation of the command’s options

You can also search a command’s man page by using the forward slash (/) character: /<searchString>
To exit the manual pages, enter q

Linux Distributions

A Linux distribution includes the Linux kernel and the tools, libraries, and other software applications that the
vendor has packaged. Most widely used distributions are derived from the following sources:
•Fedora: Red Hat, an IBM company, mainly sponsors this distribution. Fedora is used to develop, test, and
mature the Linux operating system. Fedora is the source of the commercially distributed RHEL, from which the
Amazon Linux 2 and CentOS distributions are derived.
•Debian: This Linux distribution adheres strongly to the free software principles of open source. The Ubuntu
distribution is derived from Debian, and the British company Canonical Ltd. maintains it.
•OpenSUSE: The German company SUSE sponsors this distribution, which is used as the basis for the
company’s commercially supported SUSE Enterprise Linux offering.

Amazon Linux 2

Latest Linux operating system that AWS offers. It is designed to provide a stable, secure, and high-performance
runtime environment for applications that run on Amazon Elastic Compute Cloud (Amazon EC2). It supports
the latest EC2 instance type features and includes packages that facilitate integration with AWS.
LINUX COMMAND LINE

Linux Login Workflow

The Linux login workflow consists of the following three main steps:

- The user is prompted to authenticate with a user name and password.


- The user’s session settings are loaded from the user’s profile files.
- The user is presented with a command prompt in the user’s home directory.

After a network connection is made, you can connect by using a program like Putty or by using the terminal on
Mac OS.
The name is checked against the /etc/passwd file, and the password is checked against the /etc/shadow file.

NOTE: To copy and paste into the CLI use the right mouse click

Linux Command Prompt

The Linux command prompt or command line is a text interface to your Linux computer. It is commonly
referred to as the shell, terminal, console, or prompt.

The best way to describe the command syntax is as follows:

- Command: What you want Linux to do (example: man)


- Option: Modifies the command (example: -i)
- Argument: What the command acts on (example: whoami)

Useful Commands

whoami: shows the current user's user name when the command is invoked
id: helps identify the user and group name and numeric IDs (UID or group ID) of the current user or any other
user on the server.
hostname: displays the TCP/IP hostname. The hostname is used to either set or display the system's current
host, domain, or node name.
uptime: indicates how long the system has been up since the last boot.
date: displays the current date and time in a given format. It can also set the system date.
cal: used to display a calendar. If no arguments are specified, the current month is displayed.
clear: used to clear the terminal screen (clears all text on the terminal screen and displays a new prompt).
echo: places specified text on the screen. It is useful in scripts to provide users with information as the script
runs. It directs selected text to standard output or in scripts to provide descriptions of displayed results.
history: views the history file.
- It displays the current user's history file
- Up and down arrow keys cycle through the output or the history file
- This command can be run by using an event number: for example, !143

Note: If you make a mistake when writing a command, don't re-enter it. Use the history command to call the
command back to the prompt, and then edit the command to correct the mistake.

You should use the history command in the following use cases:
- Accessing history between sessions
- Removing sensitive information from the history: for example, a password that is entered
into a command argument
touch: can be used to create, change, or modify timestamps on existing files. Also used to create a new empty
file in a directory.
To create a new file using the touch command, enter touch file_example_1
cat: used to show contents of files.
stdin: Standard Input is the device through which input is normally received: for example, a keyboard or a
scanner.
stdout: Standard Output is the device through which output is normally delivered: for example, the display
monitor or a mobile device screen.
stderr: Standard Error is used to print the error messages on the output screen or window terminal.
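A short sketch of redirecting these streams at the shell prompt (the file names are only examples):

# Send standard output to one file and error messages to another
ls /home /nosuchdir > listing.txt 2> errors.txt

# Append standard output to a file and discard standard error
ls /home >> listing.txt 2> /dev/null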

Bash Tab Completion

The tab key automatically completes commands and file or directory names. Bash tab completion saves time and
improves accuracy. To use this feature, you must enter enough of the command or file name for it to be
unique.
Pressing the tab key twice will show all matching options.
USERS AND GROUPS

Managing users

User accounts represent users on the system.

- User information can be stored locally or on another server accessible through a network.
- When information is stored locally, Linux stores it in the /etc/passwd file.
- Best practice is to assign one user per account (do not share accounts).

The useradd command

- Creates the user account


- Creates a home directory for the user in /home
- Defines account defaults (allows customisation at creation)
- The comment field is often used to hold the user’s full name

The useradd command options

• Options allow customization of the user account at the time of creation.


• The comment field is often used to hold the user's full name.

Option Description Example


-c Comment useradd -c "new employee" jdoe
-e Account expiration useradd -e 2025-01-01 jdoe
-d Home directory path useradd -d /users/jdoe jdoe

grep: command that searches for a string in a file.


usermod: used to modify or change parts of or a whole existing user account.
userdel: deletes a user account (use the -r option to also delete the user’s home directory).
passwd: used to set user passwords (root user can reset any user password).
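A typical sequence, run with root permissions (the user details are hypothetical):

# Create the account, set an initial password, then confirm the entry exists
sudo useradd -c "Jane Doe" jdoe
sudo passwd jdoe
grep jdoe /etc/passwd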

Managing Groups

A group is a set of accounts and a convenient way to associate user accounts with similar security needs
(easier to grant permissions to a group than to individual users).
The storage location for groups is the /etc/group file.

groupadd: Creates a new group.


groupmod: Modifies an existing group.
groupdel: Deletes an existing group.

Adding a user to a group


Modification of a user, not a group (added using usermod or gpasswd commands)
gpasswd: used to administer the /etc/group file.
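For example (the group and user names are hypothetical):

# Create a group and add an existing user to it as a supplementary group
sudo groupadd developers
sudo usermod -aG developers jdoe
grep developers /etc/group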

User Permissions

The root user has the permissions to access and modify anything on the system (administrator).
A standard user is a user that only has rights and file permissions that the user’s role requires.

Note: use caution with root. Do not log in to the system with administrative permissions. Log in as a standard
user, and then elevate permissions only when necessary (mistakes can make system inoperable).
Root user command prompt ends with #
Standard user command prompt ends with $

The su and sudo commands

su command: stands for substitute user and can be used to log in as any user (not just root user) to accomplish
administrative tasks (allows switch to root user’s environment).
Delegate specific commands to specific users by adding them to /etc/sudoers
sudo command: to run a command with one-time root permissions. sudo requires the password of the current
user whereas su requires the password of the substitute account. sudo is safer as it does not require password
exchange.

su command vs sudo command

The su command activates full administrative permissions.

- Used when all administrative permissions are needed


- Users are prompted for the root password

The sudo command activates only delegated permissions.

- Is used to delegate a specific administrative task to a specific standard user


- Users are prompted for their own password
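For example (the command shown is only an illustration; delegation entries in /etc/sudoers should be edited with the visudo command):

# Switch to the root user's environment (prompts for the root password)
su -

# Run a single command with one-time root permissions (prompts for your own password)
sudo systemctl restart sshd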

AWS Identity and Access Management (IAM)

IAM is an AWS service that is used to manage users and access to resources. It determines who can launch,
configure, manage, and remove resources. It provides control over access permissions for people and for
systems or other applications that might make programmatic calls to AWS resources.
EDITING FILES

Vim: Command-line text editor (the default text editor for virtually all Linux distributions)
nano: Command-line text editor
gedit: GUI text editor (requires GNOME, Xfce, or K Desktop Environment (KDE))

Vim Text Editor

Modes
- Command: Keystrokes issue commands to Vim
- Insert: Keystrokes enter content into the text file

Command Mode

Keystroke Effect
x Delete the character at the cursor
G Move the cursor to the bottom of the file
gg Move the cursor to the top of the file
42G Move the cursor to line 42 of the file
/keyword Search the file for keyword
y Yank text (copy)
p Put text (paste)
r Replace character under the cursor
i Move to insert mode
ZZ Save changes and exit Vim
dd Delete the line at the cursor
u Undo the last command
/g Global
:s/old/new/g Globally find old and replace with new
O Enter insert mode on a new line above the cursor
A Enter insert mode and append text at the end of the line
h, j, k, l Move cursor left, down, up, and right

Insert Mode (i)

Enters text into body of file


Press ESC to exit insert mode and return to command mode

Quitting and Saving

From Command mode, press : to get a command prompt for Ex mode

Keystroke Effect
:w Writes file (save)
:q Quits Vim
:wq Writes file then quits Vim
:wq! Writes file and forces quit
:q! Quits Vim without saving changes
Vim Help

Vimtutor: A tutorial of common Vim tasks


:help: Enter help
:help <keyword>: Enter help for that keyword
K: Open the man page for the word at the cursor

GNU nano Text Editor

Common nano Commands

Command Effect
CTRL+X Quit nano
CTRL+O Save the file
CTRL+K Cut text
CTRL+U Paste text
CTRL+G Get help
^G Display help text
^X Close current file buffer and exit nano
^O Write the current file to disk
^W Search for a string or a regular expression
^Y Move to previous screen
^V Move to next screen
^K Cut the current line and store it in cutbuffer
^U Uncut from cutbuffer into the current line
^C Display the position of the cursor
^_ Go to the line and column number
^\ Replace a string or a regular expression
M-W Repeat the last search
M-^ or M-6 Copy the current line and store it in the cutbuffer
^E Move to the end of the current line
M-] Move to the matching bracket
M-< or M-, Switch to the previous file buffer
M-> or M-. Switch to the next file buffer
WORKING WITH THE FILE SYSTEM

Everything in Linux is a file

File names
- They are case sensitive
- They must be unique within the directory
- They should not contain / or spaces

Extensions
Optional and not necessarily mapped to applications

File Systems
A way of naming, retrieving, and organising data on the storage disk. A file is located inside a directory.

The file system:
- Is case sensitive
- Has key directories, such as /, /home, and /mnt

File System Hierarchy Standard (FHS)

Directory Function
/ Root of the file system
/boot Boot files and kernel
/dev Devices
/etc Configuration files
/home Standard users' home directories
/media Removable media
/mnt Network drives
/root Root user home directory
/var Log files, print spool, network services

Commands for managing files and directories

ls command displays a list of files in a directory (ls is a lowercase L and S)

ls command options

Option Description
-l Long format (shows permissions)
-h File sizes reported in a human-friendly format
-a Shows all files, including hidden files
-R Lists subdirectories recursively
--sort=extension or -X Sorts alphabetically by file extension
--sort=size or -S Sorts by file size
--sort=time or -t Sorts by modification time
--sort=version or -v Sorts by version number

You can combine options: ls -al displays hidden files and file details.
Exploring files and directories

Command Description
cat Shows contents of a file
cp Copies a file
rm Removes a file
mkdir Creates a directory (several with one command: mkdir dir1 dir2 dir3)
mv Moves a file from one directory to another: mv source destination
rmdir Deletes existing empty directories: rmdir <DirectoryName>
If directory not empty, use rm -r <DirectoryName>
pwd Current location in the file system
less Views a file one screen at a time; scrolls both backward and forward
more Views a file one screen at a time; scrolls forward only

cp command

Option Description
cp -a Archive files
cp -f Force copy by overwriting the destination file if needed
cp -i Interactive - Ask before overwrite
cp -l Link files instead of copy
cp -n No file overwrite
cp -u Update - copy when source is newer than destination

rm command

Option Description
rm -d Removes a directory; the directory must be empty: rm -d dir
rm -r Allows you to remove a non-empty directory: rm -r dir
rm -f Never prompt user (useful when deleting a directory with many files)
rm -i Prompts user for confirmation for each file
rm -v Display the names of deleted files

Absolute versus Relative Paths

Paths define directories to be traversed to get to a particular resource


GUI: navigate by opening directories
CLI: specify directories by name

Absolute: The complete path to the resource from the root of the file system (shows entire folder structure)
Relative: The path to the resource from the current directory

cd command
Used to move from one directory to another
Use ../ to go up a single directory at a time
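
For example (the directory names are illustrative only):

cd /home/user1/projects      # absolute path: starts from the root of the file system
cd projects/reports          # relative path: starts from the current directory
cd ../                       # move up one directory level
pwd                          # print the current location in the file system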
WORKING WITH FILES

Important Commands

hash: Used to see a history of programs and commands that have been run from the command line. The
information is maintained in a hash table.
Syntax: hash [-lr] [-p pathName] [-dt] [commandName ...]

cksum: Verifies that a file has not changed.


Generates a checksum value for a file or stream of data. If the CRC (cyclic redundancy check) value is the same
before and after transfer, the file was not corrupted.
Syntax: cksum [FileName]

find: Searches for files by using criteria such as the file name, the size, and the owner.
Syntax: find [directory to start from][options][what to find]

grep: Searches a file's contents for a given string or text pattern.


Syntax: grep <text pattern or string> <where to search>

diff: Is used to quickly see the difference between two files.


Syntax: diff [options] File1 File2

ln: Creates pointers to a given file.

tar: Bundles multiple files into one file (created bundle is a tarball): tar -cvf tarball.tar <file1 file2>

gzip: Compresses or decompresses a file's size (including tarballs).

zip: Compress the contents of a file.

unzip: Extracts the contents of a file.
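
The following sketch shows a few of these commands together (file and directory names are illustrative only):

find /var/log -name "*.log" -size +1M      # find .log files larger than 1 MB
grep "error" /var/log/messages             # search a file's contents for the string "error"
diff config.old config.new                 # show the differences between two files
tar -cvf backup.tar file1 file2            # bundle files into a tarball
gzip backup.tar                            # compress the tarball (creates backup.tar.gz)
cksum backup.tar.gz                        # print a checksum to verify the file later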

Links
Can use links to refer to the same file by using different names and to access the same file from more than one
location in the file system.

Every file has an inode object that uniquely identifies its data location and attributes. Identified with a unique
number.

Two types of links:

Hard Link - Points to the original file's inode. If the file is deleted, the data still exists until the hard link is
deleted.
Symbolic - Points to the original file name or a hard link. If the file is deleted, the soft link is broken until you
create a new file with the original name.
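
A minimal sketch of creating both link types (file names are illustrative only):

ln original.txt hardlink.txt                  # hard link: points to the same inode as original.txt
ln -s original.txt symlink.txt                # symbolic (soft) link: points to the file name
ls -li original.txt hardlink.txt symlink.txt  # -i shows the inode numbers for comparison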
MANAGING FILE PERMISSIONS

Permission Types

Read (r): Gives the user control to open and read a file
Write (w): Gives the user the control to change the file's content
Execute (x): Gives the user the ability to run a program
(-): In a long listing, a leading dash in the file type position indicates a regular file

Permission Modes

Absolute: Use numbers to represent file permissions (most commonly used)

Symbolic: A combination of letters and symbols to add permissions or to remove any set permissions.

Note: with the chmod (change mode) command, the user can set and change permissions on files and
directories.

Using the ls -l command to view permissions

ls command is used to list files and directories. Option -l (lowercase L) shows the file or directory, size,
modified date and time, file or folder name, and owner of the file and its permissions.

Ownership
A user is the owner of the file. Every file in the system belongs to one user name/file owner.

User: can create a new file or directory. Ownership is set to the user ID of the user who created the file.
Group: can contain multiple users. Users who belong to that group will have the same permissions to access
the file.
Other: means the user did not create the file and does not belong to a user group that could own the file.

Default Permissions
The root account has superuser privileges and can run any command. Non-root users can run similar
administrative commands by elevating their privileges with the sudo command.

chown command
The user (owner) and associated group for a file or directory can be changed using the chown command to
change ownership.
[options] - The chown command can be used with or without options
[user] - indicates the user name/ID of the new owner of the file or directory being altered
[:] - use when changing a group of the file or directory
[group] - changing the ownership of the associated group is optional
[file(s)] - the source file or directory that will be altered
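
For example (the user, group, and file names are illustrative only):

sudo chown ec2-user report.txt               # change the owner of a file
sudo chown ec2-user:developers report.txt    # change the owner and the associated group
sudo chown -R ec2-user:developers /data      # -R applies the change recursively to a directory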

chmod command
The command that is used to change permissions is the chmod command. The change mode or chmod
command is used to set permissions on files and directories.

chmod in symbolic mode

Identity Permission Operator


u (user) r (read) + Grants a permission
g (group) w (write) - Removes a permission
o (other) x (execute) = Sets exactly the specified permissions (removes any others)

chmod in absolute mode

Permission Value
Read 4
Write 2
Execute 1
All permissions 7
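
For example (file names are illustrative only):

chmod u+x script.sh      # symbolic mode: grant the execute permission to the user (owner)
chmod go-w notes.txt     # symbolic mode: remove the write permission from group and other
chmod 755 script.sh      # absolute mode: user=7 (rwx), group=5 (r-x), other=5 (r-x)
chmod 644 notes.txt      # absolute mode: user=6 (rw-), group=4 (r--), other=4 (r--)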

Best Practices for Managing File Permissions


- Do not use chmod 777 (grants read, write, and execute permissions to every user).
- Follow the principle of least privilege (grant users the least amount of access they need)
- Limit file names to alphanumeric characters, dots, and dashes
WORKING WITH COMMANDS

Special Characters, Wildcards, and Redirection


Quotation marks (" ") override the usual Bash interpretation of an argument that contains a space as two separate
arguments (without quotation marks, Bash treats each space-separated word as a separate parameter).

Bash Metacharacters
Special characters that have meaning to the shell; users can use them for faster and more powerful
interactions with Bash (especially useful when writing scripts).
Control output, wildcards, and chaining commands

Metacharacter Description
* (star) Any number of any character (wildcard)
? (hook) Any one character (wildcard)
[characters] Any matching characters between brackets (wildcard)
`cmd` or $cmd Command substitution - uses backticks (`), not single quotation marks (' ')
; Chain commands together, all written on a single line
~ Represents the home directory of the user
- Represents the previous working directory

Redirection operators

Operator Description

> Sends the output of a command to a file (by default will overwrite existing file
content)
< Receives the input for a command from a file
| Runs a command and redirects its output as input to another command. You can
chain several commands by using pipes (multi-stage piping), which is referred to as a pipeline.
>> Appends the output of a command to the existing contents of a file
2> Redirects errors that are generated by a command to a file

2>> Appends errors that are generated by a command to the existing contents of a file
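
A short sketch of the redirection operators (file names are illustrative only):

ls -l /etc > listing.txt              # overwrite listing.txt with the command output
ls -l /tmp >> listing.txt             # append to the existing contents of listing.txt
sort < names.txt                      # take the command input from a file
find / -name "*.conf" 2> errors.txt   # send error messages to a separate file
ps -ef | grep sshd | wc -l            # a pipeline: chain commands with pipes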

Command Substitution (`)


Allows a command to be nested in a command line or within another command. Result of that command is
displayed or used by the rest of the command.

Using | grep
grep is commonly used after another command, along with a pipe ( | ). Used to search the piped output of a
previous command.

cut command
Cuts sections from lines of text by character, byte position, or delimiter.
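
For example (assuming a standard /etc/passwd file):

history | grep "yum"       # search previous commands for the string "yum"
cut -d: -f1 /etc/passwd    # cut the first field (user name) using : as the delimiter
cut -c1-8 /etc/passwd      # cut the first eight characters of each line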

Text Manipulation and Searching

sed command
Non-interactive text editor and edits data based on the rules that are provided (can insert, delete, search, and
replace).
sort command
Sorts file contents in a specified order: alphabetical, reverse order, number, or month.
awk command
Used to write small programs to transform data
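
A minimal sketch of each command (file names and field numbers are illustrative only):

sed 's/oldword/newword/g' notes.txt   # replace every occurrence of oldword with newword
sort -r names.txt                     # sort file contents in reverse order
sort -n sizes.txt                     # sort numerically
awk '{print $1, $3}' data.txt         # print the first and third fields of each line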
MANAGING PROCESSES

Programs
A series of instructions given to a computer that tells which actions the computer should take.

How a program is found

System: - Primary computer functions


- Operating system commands
- Usually do not interface with the computer user
- Native computer function

Application: - Comprehensive program that performs a specific function


- Can be used by a user or another program
- Word processors, and games are examples
- Added to the computer

Process
Is a running program and identified by process ID number (PID).

States of a process

Child Processes (sub processes)


Some services and applications are complex and require more than one process to provide their functionality.
They spawn child processes, which inherit most of the attributes of the parent process.

Basic Commands for Process Management

ps command
The ps (process status) command gives an overview of the current processes that are running in the OS.
Displays: - Process ID (PID)
- Terminal type (TTY)
- Time process has been running
- Command (CMD) which launched the process

Use ps [options] command to filter the information of the active processes.

ps -ef | grep <process_name>


Common ps command options

Option Description
-e List all current processes
-b Use batch mode
-fp <number> List processes by PID

pidof command
Shows the PID (process ID) of the current running program (pidof sshd will show the PID of sshd).

pstree command
Displays the current running processes in a tree format; identical branches are merged and denoted by [ ] square
brackets, and child threads appear under their parent processes denoted by { } curly brackets.
Example: pstree [options] [pid, user]

top command
You can use the top command to examine the processes that run on a server and to examine resource
utilization. Displays a real-time system summary and information of a running system. It displays information
and a list of processes or threads being managed.

Options Description
-h and -v Displays usage and version information
-b Starts top in Batch mode

Task status in top


Running: a process that is running on the CPU or present in the run queue
Sleep: a process that is waiting for an I/O operation to complete
Stopped: a process that has been stopped by a job control signal or that is being traced
Zombie: a child process that has ended but whose entry remains in the process table because the parent has not
yet collected its exit status

kill command
Explicitly ends processes usually when the process fails to quit on its own.

The following are kill command signals that you can use:
• -9 SIGKILL – Stops any process immediately with no graceful exit
• -15 SIGTERM – Asks the process to terminate gracefully (the default signal)
• -19 SIGSTOP – Pauses the process without ending it and returns control to the command line
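
For example (the process name and PID are illustrative only):

ps -ef | grep httpd   # find the process and note its PID
pidof httpd           # or print the PID directly
kill -15 1234         # ask process 1234 to exit gracefully (SIGTERM)
kill -9 1234          # force the process to stop immediately (SIGKILL)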

nice and renice commands


nice: starts a process with a chosen scheduling priority (manages schedule priority).
renice: adjusts the priority of a process that is already running (used when you need to modify the scheduling
priority of a running process).

at and cron commands


at: runs a task once at a specified time (one-off task)
cron: runs a task on a regular basis at a specified time (repetitive tasks)

crontab command
Stands for cron table, is made up of a list of commands, and is also used to manage this table (minute, hour,
day of the month, month of the year, day of the week, command).
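
For example, a crontab entry follows the order minute, hour, day of month, month, day of week, command (the
script path is illustrative only):

crontab -e                            # edit the current user's cron table
30 2 * * * /home/ec2-user/backup.sh   # run a backup script at 02:30 every day
crontab -l                            # list the current user's scheduled jobs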
MANAGING SERVICES

systemctl command
Has many subcommands, including status, start, stop, restart, enable, and disable. Services provide
functionality such as networking, remote administration, and security.

Troubleshooting tasks: - Restart after any configuration change


- Restart when troubleshooting

Restarting the entire server would mean that the reboot would also stop all the properly running services on
the server. Restarting only the failing service means that the healthy services can continue to run.

systemctl <subcommand> <service name>

Show running services: systemctl
List all services (active/exited/failed): systemctl list-units --type=service
List all active services: systemctl list-units --type=service --state=active
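
For example (using the sshd service as an illustration):

sudo systemctl status sshd    # check whether the service is running
sudo systemctl restart sshd   # restart the service after a configuration change
sudo systemctl enable sshd    # start the service automatically at boot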

Monitoring Services

Command Description
lscpu List CPU information
lshw List hardware
du Check file and directory sizes
df Display disk size and free space (df -h displays in human-readable format)
fdisk List and modify partitions on the hard drive
vmstat Indicate use of virtual memory
free Indicate use of physical memory
top Display system’s processes and resource usage (use to determine what
process is responsible for high CPU usage)
uptime Indicate the amount of time that the system has been up, number of users, and central
processing unit (CPU) wait time

Amazon CloudWatch

Amazon CloudWatch monitors the health and performance of your AWS resources and applications.
• It offers monitoring of Amazon Elastic Compute Cloud (Amazon EC2) instances, such as CPU usage, disk reads,
and writes.
• You can create alarms. For example, when CPU usage exceeds a certain threshold, a notification is sent
through Amazon Simple Notification Service (Amazon SNS).
THE BASH SHELL

What is a shell?

- The primary purpose of a shell is to allow the user to interact with the computer operating system
- A shell accepts and interprets commands
- A shell is an environment in which commands, programs, and shell scripts are run
- Bash is the default Linux shell

Shell variables

A variable is used to store values. Variables can be a name, a number, or special characters; by default,
variables are strings. Scripts or other commands can call shell variables. The values that these shell variables
represent are then substituted into the script or command.

Syntax rules: Variable syntax structure

By convention and as a good practice, the name of a variable that a user has created is in lowercase.
Environment (system) variable names are capitalized. There is no space before or after the equal (=) sign, and
the variable name must contain no spaces or special characters. A variable name can contain only letters (a to z
or A to Z), numbers (0 to 9), or the underscore character (_).

Assigning a value to a variable

A value can be assigned as a number, text, file name, device, or other data type. Variables are assigned by
using the = operator. The value of the variable is located to the right of the = operator.

Displaying shell variables

To display the value of a variable, use the echo $VARIABLE_NAME. Also use the echo command to view the
output from environment variables or system-wide variables.
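
For example (the variable name and value are illustrative only):

region="eu-west-2"   # assign a value (no spaces around the = sign)
echo $region         # display the value of the variable
echo $HOME $USER     # display the values of environment variables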

Environment variables

Environment variables are structurally the same as shell variables; they are no different from each other. Both
use the key-value pair, and they are separated by the equal (=) sign.
Common Environment Variables

Environment Variables Description


$HOME Defines the user's home directory
$PATH Indicates the location of the commands
$SHELL Defines the login shell type
$USER Contains the user's system name

env command

The env command is a shell command for Linux that displays the environment variables. Use it to display your
current environment, which can be useful in testing.

alias command

By using aliases, you can define new commands by substituting a long command with a short one. Aliases can
be set temporarily in the current shell, but it is more common to set them in the user's .bashrc file so that they
are permanent.

How it works: Enter the command alias, the desired alias name, and then the command to run. Ensure the value
of the command is in single quotation marks.
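
For example (the alias name is illustrative only):

alias ll='ls -al'   # define a short command for a longer one
ll                  # now runs ls -al
alias               # with no arguments, lists the aliases that are currently defined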

unalias command

The unalias command removes the configured alias if it is not configured in the .bashrc file.
BASH SHELL SCRIPTS

What are scripts

- Scripts are text files of commands and related data


- When the text file is processed, the commands are run
- Scripts can be set as scheduled tasks by using cron
- Automation allows scripts to run more quickly than if they are run manually
- Scripts are consistent due to automation removing the potential for manual errors
- Common script tasks: creating backup jobs, archiving log files, configuring systems and services,
simplifying repetitive tasks, and automating tasks

Shell scripts

Basic scripting syntax

The # character

Bash ignores lines that are preceded with #. The # character is used to define comments or notes to the user
that might provide instructions or options.

#!/bin/bash and #comments

- The first line defines the interpreter to use (it gives the path and name of the interpreter)
- Scripts must begin with the directive for which shell will run them
- The location and shell can be different
- Each shell has its own syntax, which tells the system what syntax to expect
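
A minimal script sketch (the file name, greeting, and use of the first argument $1 are illustrative only):

#!/bin/bash
# greet.sh - prints a greeting for the name passed as the first argument
echo "Hello, $1"

# Make the script executable and run it with one argument:
# chmod +x greet.sh
# ./greet.sh Maria     -> Hello, Maria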

Useful commands

Command Description
echo Displays information on the console
read Reads a user input
subStr Gets the substring of a string
+ Adds two numbers or combines strings
file Determines the type of a file
mkdir Creates a directory
cp Copies files
mv Moves or renames files
chmod Sets permissions on files
rm Deletes files, folders, etc.
ls Lists directories

You can use all the shell commands that you saw earlier in the course, such as grep, touch, and redirectors ( >,
>>, <, << )
Arguments

Arguments are values that you want the script to act on and are passed to the script by including them in a
script invocation command separated by spaces.

Expressions

Expressions are evaluated and usually assigned to a variable for reference later in the script.

Conditional statements

Conditional statements allow for different courses of action for a script depending on the success or failure of
a test.
Shell scripts with conditional statements can be used when:

- The script asks the user a question that has only a few answer choices
- Deciding whether the script must be run
- Ensuring that a command ran correctly and taking action if it failed

Logical control statements

if statement

If the first command succeeds with an exit code of 0 (success), then the subsequent command runs. An if
statement must end with the fi keyword.

if <condition>; then <command>; fi

if - else statement

Defines two courses of action:

- If the condition is true (then)


- If the condition is false (else)

if <condition>; then <command>; else <other command>;fi

if - elif - else statement

Defines three courses of action:

- If the condition is true (then)


- Else if the condition is false but another condition is true (then)
- If all of the previous conditions are false (else)

if <condition>; then <command>; elif <other condition>; then <other command>; else <default command>;fi
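
A short sketch that combines if, elif, and else with the test bracket syntax (the path tested is illustrative only):

#!/bin/bash
if [ -f /etc/passwd ]; then
    echo "Regular file"    # runs when the test succeeds (exit code 0)
elif [ -d /etc/passwd ]; then
    echo "Directory"       # runs when the first test fails but this one succeeds
else
    echo "Not found"       # runs when all previous conditions are false
fi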

test command

- Checks file types and compare values


- Conditions are tested, and then the test exits with a 0 for true and a 1 for false
- Syntax: test <EXPRESSION>

Loop statements

Loops provide you the ability to repeat sections of a script.


Loops can end:

- After a specific number of repeats (for statement) - bracketed by do and done


- Until a condition is met (until statement) - bracketed by until and done
- While a condition is true (while statement) - bracketed by while and done
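
Minimal loop sketches (the values are illustrative only):

# for loop: repeats a specific number of times
for i in 1 2 3; do
    echo "Pass $i"
done

# while loop: repeats while a condition is true
count=1
while [ $count -le 3 ]; do
    echo "Count is $count"
    count=$((count + 1))
done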

Loop control statement: Break

Stop running the entire loop (exits before the condition of the loop is met because the counter reaches the
value passed as a parameter).

Loop control statement: Continue

Terminates the current loop iteration and returns control back to the top of the loop.

The read command

Read incorporates the user’s input as a variable into a script.

The true and false commands

Are used with loops to manage their conditions. These commands return predetermined exit status (either a
status of true or a status of false).

The exit command

Causes script to stop running and exit to the shell

- Useful in testing
- Can return code status. Each code can be associated to a specific error
o For example: exit 0: The program has completed without any error
exit 1: The program has an error
exit n: The program has a specific error.
- $? is a variable that holds the exit status of the last command that ran.

Security! Check that the script contains only the required functionality
Test! Test all scripts to confirm that they function as expected
SOFTWARE MANAGEMENT

Managing software

The approach for managing software varies depending on the Linux distribution type. Features such as the
software package format and the utility tools used to install, update, and delete packages are different
depending on the source of the distribution.

Red Hat Method

- Red Hat Package Manager (RPM) is used to manage software


- Software packages have an .rpm file extension
- The YUM utility is a commonly used front-end interface to RPM
- Amazon Linux 2, Red Hat Linux, and CentOS use this method

Debian Method

- The dpkg package manager is used to manage software


- Software packages have a .deb file extension
- Advanced Package Tool (APT) is often used as a front end
- Debian and Ubuntu use this method

Install from source code

- A GNU Compiler Collection (GCC) compiler is used to compile the code


- Compiler turns human-readable code into machine-readable code
- Compiled package can then be installed

Package managers and packages

A package manager installs, updates, and deletes software that is bundled in a package. A package contains
everything that is needed to install the software, including the precompiled code, documentation, and
installation instructions.

Repositories

Are servers that contain software packages. Software packages are retrieved from a repository that can be
hosted in an online or local system. When you use a package manager, you define the location of the
repositories that contain the software packages that the manager can access. This repository information is
typically defined in a package manager configuration file. For example, for the YUM package manager, the
repository information is stored in the /etc/yum.conf file.

Can be: - Online at a vendor site, which the vendor manages


- On an internal server, which your administrators manage
- On the local hard disk drive of the system

The following are examples of Amazon Linux 2 repositories managed by AWS:

- amzn2-core
- amzn2extra-docker
Using the YUM package manager

Install software: yum -y install <package name>


-y = Assume that yes is the answer to any confirmation prompt
Update software: yum update <package name>
Inventory installed software: yum list installed
Uninstall software: yum remove <package name>
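
For example (using the httpd package as an illustration):

sudo yum -y install httpd           # install the package, answering yes to prompts
sudo yum update httpd               # update the package
yum list installed | grep httpd     # confirm that it is installed
sudo yum remove httpd               # uninstall the package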

Install software from source code

The following are the typical steps involved in installing software from source code:
1. Download the source code package: Software source code packages are typically compressed archive files
called a tarball.
2. Unarchive the package file: Tarballs usually have the .tar.gz file extension and can be unarchived and
decompressed using the tar command.
3. Compile the source code: A GCC compiler can be used to compile the source code into binary code.
4. Install the software: Once the source code has been compiled, install the software by following the
instructions that are typically included in the package.
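
A sketch of the typical sequence (the package name and build steps are illustrative; always follow the
instructions included with the package):

curl -O https://www.example.com/mySoftware-1.0.tar.gz   # 1. download the source tarball
tar -xzf mySoftware-1.0.tar.gz                          # 2. unarchive and decompress it
cd mySoftware-1.0
./configure && make                                     # 3. compile the source code (commonly with GCC)
sudo make install                                       # 4. install the compiled software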

File retrieval utilities

wget and curl are commonly used to download files to a server. Both support the HTTP, HTTPS, and File
Transfer Protocol (FTP) protocols and provide additional capabilities of their own.

wget: - Can do a recursive download


- Supports HTTP, HTTPS, and FTP protocols
- Performs retries over an unreliable connection

wget https://www.example.com/mySoftware.zip

curl: - Downloads a single resource only


- Supports HTTP, HTTPS, FTP, and other additional protocols (for example: FTPS and
FILE)
- Runs on more platforms than wget

curl https://www.example.com/mySoftware.zip
CURL example: Installing the AWS CLI

1. Download the AWS CLI installation file using the curl command. The -o option specifies the file name
that the downloaded package is written to (in this case, awscliv2.zip).
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
2. Unzip the installation file. When the file is unzipped, a directory named aws is created under the
current directory.
unzip awscliv2.zip
3. Run the installation program. The installation command uses a file named install in the newly
unzipped aws directory.
sudo ./aws/install
MANAGING LOG FILES

What is logging?

• Logs keep records of events on the system, which helps with auditing.

• The following are types of logs:


- System logs (system startup information and system shutdown times)
- Event logs (user login and logout events)
- Application logs (startup time, actions, and errors)
- Services logs

The importance of logging

Logging can help troubleshoot issues: What or who caused an error? Did anyone wrongfully access a file, a
database, or a server?
Logs are a key to security audits (gathering information about the system) and service-level agreements
(troubleshooting must start within x hours after an issue occurs).

The /var/log/yum.log file (view it with sudo cat /var/log/yum.log) lists programs that were installed or updated.
(YUM is a package management utility to install, update, and remove software.)

Logging levels

Severity Level Identification Description
0 EMERGENCY Logs messages when the system becomes unstable
1 ALERT Logs when immediate action is needed
2 CRITICAL Logs only messages for critical errors; the system may become unusable
3 ERROR Logs only messages that indicate non-critical error conditions or
more serious messages
4 WARN Logs only messages that are warnings or more serious messages
(usually the default log level on Linux distributions)
5 NOTICE Logs messages for normal events but of significant importance
6 INFO Logs all informational messages and more serious messages
7 DEBUG Logs all debug-level and INFO messages

System Logs

cat, less, more, tail, and head are all commands that are useful for reading logs. Using the pipe redirector | with
grep is an efficient way to look for a specific pattern in a log.
You can also open the files using editors such as vi or gedit.

Using grep to search log files

The grep command searches the given file for lines that contain a match to the specified strings or words.
Add the grep command when you look for a specific string of text in log files.
cat yourlog.log | grep ERROR
Important Log files

Log File Description


/var/log/syslog Stores system information
/var/log/secure Stores authentication information for Red Hat-derived distributions
/var/log/kern Stores Linux kernel information
/var/log/boot.log Stores startup messages
/var/log/maillog Stores mail messages
/var/log/daemon.log Stores information about running background services
/var/log/auth.log Stores authentication information for Debian-derived distributions
/var/log/cron.log Stores cron messages for scheduled tasks
/var/log/httpd Stores Apache information for Red Hat-derived distributions

Linux usually stores log files in the /var/log directory

The lastlog command

The lastlog command retrieves user information from the /var/log/lastlog file and outputs it in the console

Reports recent login information for the system


Can report all logins or login information for a specific user

Log rotation

Servers typically run large applications.

- Servers often log every request.


- This logging leads to bulky log files.

Log rotation can help with the following in regard to bulky logs:

- It is a way to limit the total size of the logs that are retained.
- It still helps analysis of recent events.

Log rotation is an automated process that is used in system administration, where dated log files are archived.
AWS re/start

Networking
INTRODUCTION TO NETWORKING

What is computer networking?

A computer network is a collection of computing devices that are logically connected together to communicate
and share resources.

How does it work at a basic level?


A network has nodes. A node is a device such as a computer, router, switch, modem, or printer. Nodes are
connected through links (a way for data to be transmitted, such as cables) and follow rules to send and receive
data.

A network also has hosts. A host is a node that has a unique function. Other devices connect to hosts so they can
access data or other services. An example of a host is a server, because a server can provide access to data, run
an application, or provide a service.

Data and the OSI model

What is Data?
In computing, data is bits and bytes, which equal the value of zero or one. Data can be sent over a network and
saved.

There are many types of data:
• Character
• Text
• Number
• Media

The OSI model


The Open Systems Interconnection (OSI) model defines a standard for how computers can share information
over a network.

1. Data starts from the source computer. The source computer sends data to the target computer. As the data
leaves the source computer, it is processed and transformed by the different functions in the OSI layers, from
layer 7 down to layer 1. During transmission of data, it can also be encrypted, and additional transmission-
related information, which are called headers, are added to it.

2. Next, the data travels to the application layer (layer 7). This layer provides the interface that enables
applications to access the network services.

3. Next is the presentation layer (layer 6). This layer ensures that the data is in a usable format and handles
encryption and decryption.

4. The data moves to the session layer (layer 5). This layer maintains the distinction between the data of
separate applications.

5. Next is the transport layer (layer 4). This establishes a logical connection between the source and
destination. It also specifies the transmission protocol to use, such as Transmission Control Protocol (TCP).

6. The network layer (layer 3) decides which physical path the data will take. At layer 3 (the network layer), a
message or data is called a packet. Packets are associated with Internet Protocol (IP) addresses.

7. The data link layer (layer 2) defines the format of the data on the network. At layer 2 (the data link layer), a
message or data is called a frame. Frames are associated with a Media Access Control (MAC) address, which is
known as a physical address.

8. Finally, the data travels to the physical layer (layer 1), which transmits the raw bitstream over the physical
network.

9. After the data has been transformed through the OSI layers, it is in a package format that is ready to be
transmitted over the physical network.

10. Once the target computer receives the data package, it unpacks the data through the OSI layers, but in
reverse, from layer 1 (physical) through 7 (application).

OSI Model AWS infrastructure


Layer 7 Application (how the end user sees it) Application
Layer 6 Presentation (translator between layers) Web Servers, application servers
Layer 5 Session (session establishment, security) EC2 instances
Layer 4 Transport (TCP, flow control) Security group, NACL
Layer 3 Network (Packets which contain IP addresses) Route Tables, IGW, Subnets
Layer 2 Data Link (Frames which contain physical MAC addresses) Route Tables, IGW, Subnets

Layer 1 Physical (cables, physical transmission bits and volts) Regions, Availability Zones
Networking components

Client

A client is a computer hardware device that allows users to access data and a network.
The client makes the request to the server.

Server

A server provides a response to a request from a client computer over a network.


The server responds to the client's request with the requested content.

Network interface card (NIC)

Connects a computer to a computer network (sometimes referred to as a network adapter).


It uses a cable that is connected to a hub or a switch.
Each NIC has its own media access control (MAC) address. The MAC address is a unique physical (hardware)
identifier that is assigned by the manufacturer. It’s used to identify the sender and receiver of data.
NIC works in layer 2 since it has a MAC address even though it has physical components.
Network cables

Are used to physically connect networks together. Most network nodes are linked together by using some type
of cabling. There are three cables:

Fiber-Optic: Transmits light instead of electricity; increasingly common for high-speed connections.


Coaxial: This is being replaced by fiber-optic, now mainly used to connect cable TV modems to an internet
service provider (ISP).
Twisted-Pair: This is the most common type of computer, telephone, and network cable. Also known as an
Ethernet cable.

Switch

A switch is a device that connects all the nodes of a network together.


•Every hardwired device in the network uses a network adapter or NIC to connect directly to a port on the hub
or switch through a single cable.
•It’s a device that transmits data to only the receiving device using the MAC address
•This device operates at layer 2 of the OSI.

Router

A router is a network device that connects multiple network segments into one network.
•It connects multiple switches and their respective networks to form a larger network (that is, it acts as a
switch between networks).
•It can also filter the data that goes through it, which enables data to be routed differently.
•This device operates at layer 2 and 3 of the OSI.
• A router can filter traffic, while a switch can only switch traffic.

Modem

A modem connects your home to the internet.


There are usually two ports on a modem: one that connects your modem to the outside internet, and the other
that connects to your router if you have one.
For example, a coaxial cable connects from the internet service provider (ISP) to your modem. Depending on the
modem, you will then have wireless internet or connect it to a router.
NETWORKING CONCEPTS

Types of computer networks

Local-area Network (LAN)

From the standpoint of geographical span, two of the most common types of computer networks are local
area networks (LANs) and wide area networks (WANs).
A LAN connects devices in a limited geographical area, such as a floor, building, or campus.
LANs commonly use the Ethernet standard for connecting devices, and they usually have high data-transfer
rates.

Wide-area Network (WAN)

A WAN connects devices in a large geographical area, such as multiple cities or countries.
WANs use technologies such as fiber-optic cables and satellites to transmit data which are used to connect
LANs.

Network topologies

• A topology is a pattern (or diagram) that shows how nodes connect to each other.
• Computer networks use different topologies to share information.
• The two topologies are: • Physical topology–Refers to the physical layout of wires in the network
• Logical topology–Refers to how data moves through the network.

Physical topologies

Bus topology –positions all the devices along a single cable.


Star topology –every node in the network is directly connected to one central switch.
Mesh topology - complex structure of connections that are similar to peer-to-peer, where the nodes are
interconnected. Mesh networks can be full mesh or partial mesh. A full-mesh topology provides full
redundancy for the network.
Hybrid topology –combines two or more topology structures. The star-bus topology is the most common
hybrid topology.
Logical topologies

Logical topology refers to how data moves through a network.


Amazon Virtual Private Cloud (VPC) is an example of a logical topology:
• A VPC is a virtual network that allows you to launch AWS resources that you define. This VPC looks and works
just like a normal network within a data center with the benefits of using AWS services for scalability.
• Bus, Star, Mesh, and Hybrid topologies all have logical portions as well.

Bus topology: The logical topology and data flow on the network also follows the route of the cable, it moves
in one direction.
Star topology: The logical topology works with the central switch managing data transmission. Data that is
sent from any node on the network must pass through the central switch to reach its destination. The central
switch can also function as a repeater to prevent data loss.
Mesh topology: A mesh topology is a complex structure of connections that are similar to peer-to-peer, where
the nodes are interconnected. Mesh networks can be full mesh or partial mesh.
Hybrid topology: A hybrid topology combines two or more different topology structures.
VPC topology: Is a virtual network that allows you to launch AWS resources that you define. It’s a logical
network.

Network Management models

A network management model is a representation of how data is managed, and how applications are hosted in
a network.
The two most common models for LAN are: •Client-server
•Peer-to-peer

Client-server model

The data management and application hosting are centralized at the server and distributed to the clients.
All clients on the network must use the designated server to access shared files and information that are
stored on the serving computer.

Peer-to-peer model

In this model, each node has its own data and applications and is responsible for its own management and
security.
The peer-to-peer model is a distributed architecture that shares tasks or workloads among peers.

Network Protocols

A network protocol defines the rules for formatting and transmitting data between devices on a network.
It typically operates at layer 3 (Network) and layer 4 (Transport) of the OSI model.
Connection-oriented protocol

• It is a protocol that establishes a connection.


• It waits for a response.
• It creates a session between the sender and the receiver.
• It uses synchronous communication.

Connectionless protocol

• It sends a message from one endpoint to the other, without ensuring that the destination is available and
ready to receive the data.
• It does not require a session between the sender and receiver.
• It uses asynchronous communication.

Examples of network protocols

Internet Protocol (IP)


IP establishes the rules for relaying and routing data in the internet.

Transmission control protocol (TCP)


TCP provides a reliable, connection-oriented, and ordered delivery of bitstreams over an IP network.

TCP/IP
When TCP and IP are combined they form the TCP/IP protocol suite. TCP/IP implements the set of protocols
that the internet runs on.

User Datagram Protocol (UDP)


UDP uses a simple connectionless communication model to deliver data over an IP network. It is unreliable
because it does not guarantee the delivery or ordering of data. It has lower overhead and is faster than TCP.
UDP data flows from one computer to another; there is no SYN or ACK. It cares about speed more than
ensuring data gets to the receiver.

Transmission control protocol (TCP) handshake

The TCP handshake comprises three messages between sender and receiver:


• Synchronize (SYN)
• Synchronize/Acknowledge (SYN/ACK)
• Acknowledge (ACK)

During this handshake, the protocol establishes parameters that support the
data transfer between two hosts.

There is also a process that gracefully closes the communication between
sender and receiver (similar to saying good-bye to someone) with three
messages:
• Finish (FIN)
• Finish/Acknowledge (FIN/ACK)
• Acknowledge (ACK)

There is also a reset (RST) flag, which is sent when a connection closes abruptly and causes an error.
INTERNET PROTOCOLS (IP)

IP is a network protocol that establishes the rules for relaying and routing data in the internet. It uses IP
addresses to identify devices and port numbers to identify endpoints.
Supports subnetting to subdivide a network.

IP is a critical standard within the larger TCP/IP protocol suite when it is combined with the connection-
oriented Transmission Control Protocol (TCP). TCP/IP implements the set of protocols that provides a crucial
service for the internet because it enables the successful routing of network traffic among devices on a
network.

IP addresses

An IP address uniquely identifies a device on a network. Each device on a network has an IP address, and it
serves two main functions: • It identifies a host and a network.
• It is also used for location addressing.

Private and public IP addresses –OSI model

An IP address works at layer 3 (networking) of the OSI model. IP addresses can be assigned to devices in a
dynamic or static way. IP addresses can also be made public or private.

Example issues that can happen with IP addresses (layer 3):


• Latency (where a site or application is taking a long time to load, possibly to the point that it times out).
• Unresponsive server
• Dynamic assigned IP addresses that should be statically assigned

Example troubleshooting commands that can be used for layer 3 troubleshooting:


• Ping
• traceroute

Private and public IP addresses

There are certain ranges for private IP addresses located in a guide called RFC 1918.
• 10.0.0.0 –10.255.255.255
• 172.16.0.0 –172.31.255.255
• 192.168.0.0 –192.168.255.255

A private IP address, such as 10.0.0.0, can only be accessed within a logically isolated private network.

With a public IP address such as 54.239.28.85 [amazon.com], anyone can publicly access this over the internet.

EC2 instances have both private and public IP addresses.


• Private IP addresses are used to route traffic within the VPC
• Public IP addresses (when enabled) can be used to interact with the internet.
IP addresses - IPv4

An IPv4 address uniquely identifies a device within a network. This address is made of a 32-bit number, in
decimal digits, separated by periods.

There are two parts to an IPv4 address: • The network portion


• The host portion

IP addresses - IPv6

The IPv6 standard extends the range of IPv4 addresses by a factor of approximately 10^28. An IPv6 address is
made up of eight groups of hexadecimal numbers that are separated by colons (:).

• Increases security
• Handles packets more efficiently
• Improves performance
• The numbers identify both the network and device on the network

Dynamic and static IP addresses

Dynamic IP addresses can change - Useful for devices that leave and come back to a network.
Static IP addresses cannot change - Useful for devices that are connected to often, like printers.

EC2 instances can be assigned a static or dynamic IP address depending on the use case. If the instance is used
as a server, the best address to assign it is a static IP address, also known as an Elastic IP address (EIP).
Otherwise, it will be assigned a dynamic IP address; when the instance is stopped and restarted, the IP address
will change.

IP addresses –special purpose

When a network is assigned a range of IP addresses, such as 10.0.0.0-10.255.255.255, a few addresses have a
special purpose. They are not assigned as host addresses.
• The default router address is typically the second address in the range: 10.0.0.1.
• The broadcast address is the last address in the range: 10.255.255.255
Converting an IP address into binary

To understand IP addressing, you can convert the number into binary. A binary number is expressed in the
base-2 numeral system, and it consists of only zeroes and ones:
• The value of 0 or 1 is known as a binary digit, or bit.
• In an IPv4 address, each of the four numbers between the dots is an 8-bit binary number. This means the
entire address is a 32-bit binary number.
• The following table can be used to convert an 8-bit binary number to a decimal, or a decimal to an 8-bit
binary number:
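
For example, the place values of the eight bits, from left to right, are 128, 64, 32, 16, 8, 4, 2, and 1. To convert
the decimal number 192 to binary, mark the positions whose values add up to 192:

128 64 32 16 8 4 2 1
  1  1  0  0 0 0 0 0   -> 128 + 64 = 192, so 192 = 11000000

So an address such as 192.168.0.1 begins with the octet 11000000.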

Port numbers

A port number allows a device in a network to further identify the other devices or applications that
communicate with it. It is also known as an endpoint.

Common Port number examples:

Port 22: SSH (Secure Shell)


Port 53: DNS (Domain Name System)
Port 80: HTTP (Hypertext Transfer Protocol)
Port 443: HTTPS (Hypertext Transfer Protocol Secure)
Port 3389: RDP (Remote Desktop Protocol)

• When a port is blocked by a firewall, or if using a VPC it can be blocked by an AWS service like a Security
Group or Network Access Control List, the source will not be able to send or receive traffic depending on the
rules. Essentially ports can be blocked or allowed to certain traffic for security reasons.
• When troubleshooting issues, you can use commands such as netstat, ss, and telnet. These commands are
used at layer 4 of the OSI, but some can be used at layer 7. These will be covered more into detail in later
sections.
• The command netstat confirms established connections, so if a port is blocked, it will not show as an
established connection.
• The command telnet confirms TCP connections to a web server, note, that this can also be used at layer 7 in
the OSI model.
• The command ss is very similar to netstat and displays socket statistics; it can also be used to confirm
connections.
NETWORKING IN THE AWS CLOUD

Traditional topology AWS service


Data center Amazon VPC
Router Route tables
Switches (subnets) Subnets
Firewall Security groups and network access control lists (network ACLs)
Servers and operating systems Amazon Elastic Compute Cloud (Amazon EC2) instances
Modem Internet gateway

Amazon VPC

Amazon VPC is a service that you can use to provision a logically isolated section of the AWS Cloud.
This service is called a virtual private cloud, or Amazon VPC. With an Amazon VPC, you can launch your AWS
resources in a virtual network that you define.

What does it do?

• Gives you control over your virtual networking resources, including:


– Selecting an IP address range
– Creating subnets
– Configuring route tables and network gateways
• Gives you the ability to customize its network configuration
• Gives you the ability to use multiple layers of security

Why use an Amazon VPC?

• You can spin up a logical environment of what was previously in a data center within minutes in the cloud.
• It is more cost-effective than maintaining equipment in a company data center; you pay for only the
resources that you use.
• It is designed so that companies can migrate and use AWS Cloud services easily.
• It’s secure, scalable, and reliable.
• It works with many innovative AWS and third-party services.
• You can create multiple Amazon VPCs and create test environments before they go live.

Amazon VPC features

• It is dedicated to an AWS account.


• It belongs to a single AWS Region.
• It can span multiple Availability Zones.
• It is logically isolated from other Amazon VPCs.
• You can create multiple Amazon VPCs in an AWS account to separate networking environments.
• You can create subnets in a VPC; however, fewer subnets are recommended to reduce the complexity of the
network topology.
IP addressing in Amazon VPC

When you create a VPC, you must specify the IPv4 address range by choosing a CIDR block, such as
10.0.0.0/16.
• An Amazon VPC address range could be as large as /16 (65,536 addresses) or as small as /28 (16 addresses).
• Private IP ranges should be used according to RFC 1918.

Within each subnet CIDR block, AWS reserves the first four IP addresses and the last IP address:
• 10.0.0.0 – Network address
• 10.0.0.1 – VPC router
• 10.0.0.2 – Domain Name System (DNS) server
• 10.0.0.3 – Reserved for future use
• 10.0.0.255 – Network broadcast, not supported in the VPC, but still reserved

Important concepts within the VPC

CIDR block: A private range should be given from /16–/28.


Subnets: Allocate a range of IP addresses within your VPC.
Route table: Rules (also known as routes) that the VPC uses to route traffic.
Internet gateway: Attaches to your VPC and permits communication from your VPC to the internet.
VPC endpoint: A private connection between AWS services without the need of the internet.
NAT gateway: Permits instances in the private subnet to connect outside the VPC.
Security group: It is a firewall at the EC2 instance level that controls inbound and outbound traffic.

Common ways to access Amazon VPC:


•The AWS Management Console
•AWS Command Line Interface (AWS CLI)

To determine the CIDR range, you can use the following third-party calculator: https://www.subnet-calculator.com/
To determine the recommended range of private IP addresses that you can use, you can refer to the following
guide: https://datatracker.ietf.org/doc/html/rfc1918.

A virtual private cloud (VPC) is like a data center but in the cloud. It’s logically isolated from other virtual
networks, and you can spin up and launch your AWS resources within it in minutes.

Private Internet Protocol (IP) addresses are how resources within the VPC communicate with each other. An
instance needs a public IP address for it to communicate outside the VPC. The VPC will need networking
resources such as an Internet Gateway (IGW) and a route table in order for the instance to reach the internet.

An Internet Gateway (IGW) is what makes it possible for the VPC to have internet connectivity. It has two jobs:
perform network address translation (NAT) and be the target to route traffic to the internet for the VPC. An
IGW's route on a route table is always 0.0.0.0/0.

A subnet is a range of IP addresses within your VPC.

A route table contains routes for your subnet and directs traffic using the rules defined within the route table.
You associate the route table to a subnet. If an IGW was on a route table, the destination would be 0.0.0.0/0
and the target would be IGW.

Security groups and Network Access Control Lists (NACLs) work as the firewall within your VPC. Security
groups work at the instance level and are stateful, which means that return traffic is automatically allowed (by
default, they deny all inbound traffic). NACLs work at the subnet level and are stateless, which means that return
traffic must be explicitly allowed by rules (the default NACL allows all traffic).
INTRODUCTION TO IP SUBNETTING

IP address

Networking is how you connect computers and other devices around the world so that they can communicate
with one another. Each one has an IP address so that traffic (data packets) can be directed to and from each
device. The internet uses these IP addresses to deliver a packet of information to each address it is routed to.

Subnetting

Subnetting is the technique for logically partitioning a single physical network into multiple smaller
subnetworks or subnets.
Organizations use subnets to divide large networks into smaller, more interconnected networks to increase
speed, minimize security threats, and reduce network traffic.

Organizations use subnets to:


• Minimize traffic. Using subnets, traffic takes the most efficient routes, increasing network speeds.
• Maximize the efficiency of IP addressing.
• Reduce network traffic by eliminating collision and broadcast traffic.
• Provide the efficient application of network security policies at the interconnection between subnets.
• Facilitate spanning of large geographical distances (especially for the needs of AWS).
• Prevent the allocation of large numbers of unused IP network addresses.

IP subnetting is a method for dividing a single, physical network into smaller subnetworks (subnets).
Subnetting in an IPv4 address gives you 32 bits to divide into two parts: a network ID and a host ID. Depending
on the number of bits you assign to the network ID, subnetting provides either a greater number of total
subnetworks or more hosts. (Hosts are devices that can be part of each subnet.)

Classes that you can or cannot use in a subnet

Each IP address belongs to a class of IP addresses depending on the number in the first octet.

Standard IPv4 address classes have three network ID sizes: 8 bits for Class A (which allows for more hosts), 16
bits for Class B, and 24 bits for Class C (which can have more subnetworks). However, in many cases, standard
sizes do not fit all. With subnetting, you can have more control over the length of the network ID portion of an
IP address. Your options go beyond the bounds of the standard 8-bit, 16-bit, or 24-bit lengths. Therefore, you
can create more Host IDs for host devices per subnetwork.

The opposite of subnetting is supernetting, where you combine two or more subnets to create a single
supernet. You can refer to this supernet by using a CIDR prefix.
First octet value Class Example IP address IPv4 bits for network ID size
0–126 Class A 34.126.35.125 8
128–191 Class B 134.23.45.123 16
192–223 Class C 212.11.123.3 24
224–239 Class D 225.2.3.40 Used for multicast and cannot be used for regular internet traffic
240–255 Class E 245.192.1.123 Reserved and cannot be used on the public internet

What are the parts of a subnet?

A 32-bit IP address uniquely identifies a single device on an IP network. The subnet mask divides the 32 binary
bits into the host and network sections, but they are also broken into four 8-bit octets. Each subnet is a
network inside a network and contains the following parts:
• Network ID: This portion of the IP address identifies the network and makes it unique.
• Subnet mask: A subnet mask defines the range of IP addresses that can be used within a network or subnet.
It also separates an IP address into two parts: network bits and host bits.
• Host ID range: This range consists of all of the IP addresses between the subnet address and the broadcast
address. To calculate, take the number of usable host IP addresses within the subnet minus the first and last.
• Number of usable host IDs: This number depends on the class and prefix of the subnet. Depending on the CIDR,
it can run between 30 and 254. It is always the total number of addresses minus the network address and the
broadcast address (minus 2).
• Broadcast ID: This IP address is used to target all systems on a specific subnet instead of a single host. It
permits traffic to be sent to all devices on a specific subnet rather than a specific host.

Subnet Masks

Subnet masks split IP addresses into host and network sections based on four 8-bit octets. In this way, it
defines which part of the IP address belongs to the device and which part belongs to the network. It also
covers up the subnet so that it isn’t seen outside of allowed traffic.

A subnet mask is a 32-bit number created by setting host bits to all 0s and setting network bits to all 1s. In this
way, the subnet mask separates the IP address into the network and host addresses.
The 255 address is always assigned to a broadcast address, and the 0 address is always assigned to a network
address. Neither one can be assigned to hosts because they are reserved for these special purposes.

Why are subnet masks used?

A subnet mask uses its own 32-bit number to mask the IP address and further enable the subnetting process.

Subnet masks:
• Determine which hosts are on the local network and which hosts are outside the network. Hosts can talk
directly to hosts on the same network, but they must communicate with a router to talk to hosts on external
networks.
• Hide network size information for IPv4 addresses.
• Are used for special purposes:
• Class D IPv4 addresses are used for multicast addressing.
• In computer networking, multicast refers to group communication where information is addressed
to a group of destination computers simultaneously. For example, multicast addressing is used in
internet television and multipoint video conferences.
• Class E IPv4 addresses cannot be used in real applications because they are used only in
experimental ways.

CIDR

CIDR is an IP addressing scheme that improves the allocation of IP addresses. Using CIDR, you can create not
only subnets but also supernets.

The general rule is that subnets are used at the organizational level, and CIDR is used at the internet service
provider (ISP) level and higher.

• Subnets: When you place a mask over the subnet, you instantly create an entire subnetwork that is a
subordinate network of the internet. The subnet mask signals to the router which part of the IP address is
assigned to the hosts (individual participants of the network). It also signals which address determines the
network.
• CIDRs: This scheme adds suffixes and then integrates them directly into the IP address. Using CIDRs, you can
create not only subnets but also supernets. In addition, you can use CIDR to subdivide a network into several
networks.
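
As a minimal illustration of the difference (using Python's ipaddress module and arbitrary example ranges), a
/24 network can be subdivided into smaller subnets, and contiguous networks can be summarized into a supernet:

    import ipaddress

    # Subnetting: split one /24 into four /26 subnetworks.
    network = ipaddress.ip_network("10.0.0.0/24")
    for subnet in network.subnets(new_prefix=26):
        print(subnet)          # 10.0.0.0/26, 10.0.0.64/26, 10.0.0.128/26, 10.0.0.192/26

    # Supernetting: two contiguous /24 networks can be summarized as one /23.
    a = ipaddress.ip_network("10.0.0.0/24")
    b = ipaddress.ip_network("10.0.1.0/24")
    print(list(ipaddress.collapse_addresses([a, b])))      # [IPv4Network('10.0.0.0/23')]
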
ADDITIONAL NETWORKING PROTOCOLS

Transport, application, management, and support protocols

• Transport protocols: Transmission Control Protocol (TCP), User Datagram Protocol (UDP)
• Application protocols: Hypertext Transfer Protocol (HTTP); Secure Sockets Layer (SSL) and Transport Layer
Security (TLS); mail protocols (Simple Mail Transfer Protocol (SMTP), Post Office Protocol (POP), Internet
Message Access Protocol (IMAP)); remote desktop protocols (Remote Desktop Protocol (RDP), Secure Shell (SSH))
• Management and support protocols: Domain Name System (DNS), File Transfer Protocol (FTP), Dynamic Host
Configuration Protocol (DHCP), Internet Control Message Protocol (ICMP)

A communication protocol is a system of rules. These rules permit two or more entities of a communications
system to transmit information through any variation of a physical quantity. The different types of
communication protocols include transport, application, management, and support protocols.

Transport protocols run over the best-effort IP layer to provide a mechanism for applications to communicate
with each other. The two general types of transport protocols are a connectionless protocol (User Datagram
Protocol) and a connection-oriented protocol (Transmission Control Protocol).

Application protocols govern various processes, from downloading a webpage to sending an email. Examples
include HTTP, SSL, TLS, mail protocols (SMTP, POP, and IMAP), and remote desktop protocols (RDP and SSH).

Management protocols are used to configure and maintain network equipment. Support protocols facilitate
and improve network communications.

Transport protocols

TCP

TCP is a connection-oriented protocol. It defines how to establish and maintain network communications
where application programs can exchange data. Data that is sent through this protocol is divided into smaller
chunks called packets.

In terms of the OSI model, TCP is a transport-layer protocol. It provides a reliable virtual-circuit connection
between applications; that is, a connection is established before data transmission begins. Data is sent without
errors or duplication and is received in the same order as it is sent. No boundaries are imposed on the data;
TCP treats the data as a stream of bytes.

UDP

UDP uses a simple, connectionless communication model to deliver data over an IP network. Compared to
TCP, UDP provides only a minimum set of functions. It is considered to be unreliable because it does not
guarantee the delivery or ordering of data. Its advantages are that it has a lower overhead, and it is faster than
TCP.
In terms of the OSI model, UDP is also a transport-layer protocol and is an alternative to TCP. It provides an
unreliable datagram connection between applications. Data is transmitted link by link; there is no end-to-end
connection. The service provides no guarantees. Data can be lost or duplicated, and datagrams can arrive out
of order.

Because TCP and UDP use ports for communication, most layer 4 transport problems revolve around ports
being blocked. When troubleshooting layer 4 communications issues, first make sure that no access lists or
firewalls are blocking TCP/UDP ports. Remember that the transport layer controls the reliability of any given
link through flow control, segmentation and desegmentation, and error control. Some protocols can keep
track of the segments and retransmit the ones that fail. The transport layer acknowledges successful data
transmission and sends the next data if no errors have occurred. The transport layer creates packets from the
data that it receives from the upper layers.

TCP vs. UDP

• Definition: TCP establishes a virtual circuit before transmitting the data; UDP transmits the data directly to
the destination computer without verifying that the receiver is ready.
• Connection type: TCP is a connection-oriented protocol; UDP is a connectionless protocol.
• Speed: TCP is slow; UDP is fast.
• Reliability: TCP is a reliable protocol; UDP is an unreliable protocol.
• Header size: TCP uses a 20-byte header; UDP uses an 8-byte header.
• Acknowledgement: TCP waits for acknowledgement of data and can resend lost packets; UDP neither takes
acknowledgement nor retransmits damaged frames.

Network Protocols

A network protocol defines the rules for formatting and transmitting data between devices on a network. It
typically operates at layer 3 (network) or layer 4 (transport) of the OSI model.

Connection-oriented protocol:
• Establishes a connection and waits for a response
• Creates a session between the sender and the receiver
• Uses synchronous communication

Connectionless protocol:
• Sends a message from one endpoint to the other without ensuring that the destination is available and ready
to receive the data
• Does not require a session between the sender and the receiver
• Uses asynchronous communication

TCP handshake

TCP is well suited to transferring important files because delivery is guaranteed, even though it has a larger
overhead (time). It is connection oriented.

TCP has something that is called the TCP handshake. This handshake comprises three messages:
• Synchronize (SYN)
• Synchronize/Acknowledge (SYN/ACK)
• Acknowledge (ACK)

During this handshake, the protocol establishes parameters that support the data transfer between two hosts.
For example:
• Host A sends a SYN packet to Host B.
• Host B replies with a SYN/ACK packet to acknowledge that it received Host A's SYN.
• Host A sends a final ACK packet to Host B to confirm that it received the SYN/ACK.

Another process gracefully closes the communication between the sender and receiver (similar to saying
goodbye to someone) with three messages:
• Finish (FIN)
• Finish/Acknowledge (FIN/ACK)
• Acknowledge (ACK)

There is also a reset (RST) flag, which is sent when a connection closes abruptly and causes an error.
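
The difference between the two transport protocols is visible in Python's standard socket module. In this
sketch the host names and addresses are placeholders; connect() on a TCP socket triggers the three-way
handshake, whereas a UDP socket simply sends a datagram:

    import socket

    # TCP: connect() performs the SYN, SYN/ACK, ACK handshake before any data is sent.
    tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp_sock.connect(("example.com", 80))        # handshake happens here
    tcp_sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
    print(tcp_sock.recv(1024))                   # delivery and ordering are guaranteed
    tcp_sock.close()                             # FIN/ACK exchange closes the connection

    # UDP: no handshake and no delivery guarantee; the datagram is simply sent.
    udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp_sock.sendto(b"hello", ("192.0.2.10", 5000))   # placeholder address and port
    udp_sock.close()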

Application protocols

HTTP
HTTP is the protocol that is used to reach webpages. A full HTTP address is expressed as a uniform resource
locator (URL).
Secure Hypertext Transfer Protocol (HTTPS) is a combination of HTTP with the SSL/TLS protocol.

SSL and TLS

SSL is a standard for securing and safeguarding communications between two systems by using encryption.

TLS is an updated version of SSL that is more secure. Many security and standards organizations—such as
Payment Card Industry Security Standards Council (PCI SSC)—require organizations to use TLS version 1.2 to
retain certification.

A TLS handshake is the process that initiates a communication session that uses TLS encryption. During a TLS
handshake, the two communicating sides exchange messages to acknowledge each other and verify each
other. They establish the encryption algorithms that they will use, and agree on session keys. TLS handshakes
are a foundational part of how HTTPS works.
SSL/TLS creates a secure channel between a user’s computer and other devices as they exchange information
over the internet. They use three main concepts (encryption, authentication, and integrity) to accomplish
this result. Encryption hides data that is being transferred from any third parties. Without SSL/TLS, data gets
sent as plain text, and malicious actors can eavesdrop or alter this data. SSL/TLS offers point-to-point
protection to ensure that the data is secure during transport.

To provision, manage, and deploy public and private SSL/TLS certificates for use with AWS services and internal
connected resources, you can use AWS Certificate Manager (ACM).
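
As an illustration, Python's standard ssl module can perform a TLS handshake on top of a TCP connection. This
is a minimal sketch; example.com is a placeholder host:

    import socket
    import ssl

    # The default context verifies the server's certificate and negotiates TLS.
    context = ssl.create_default_context()

    with socket.create_connection(("example.com", 443)) as raw_sock:
        # wrap_socket performs the TLS handshake: the peers acknowledge each other,
        # agree on encryption algorithms, and establish session keys.
        with context.wrap_socket(raw_sock, server_hostname="example.com") as tls_sock:
            print(tls_sock.version())            # for example, 'TLSv1.3'
            tls_sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
            print(tls_sock.recv(1024))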

Remote Desktop protocols (RDP and SSH)

RDP and SSH are both used to remotely access machines and other servers. They’re both essential for securely
accessing cloud-based servers, and they also aid remote employees in using infrastructure on premises.

RDP is a protocol that is used to access the desktop of a remote Microsoft Windows computer. RDP uses port 3389,
and clients are available on different operating systems.

SSH is a protocol that opens a secure command line interface (CLI) on a remote Linux or Unix computer. The
standard TCP port for SSH is 22.

Application Protocol Port numbers

Application protocol Transport protocol Port number


HTTP TCP 80
HTTPS TCP 443
FTP TCP 21
SSH TCP 22
DNS TCP 53

Management and support protocols

Management protocols are used to configure and maintain network equipment.


Support protocols enable and improve network communications.

Here are the examples of management and support protocols:


• DNS (Domain Name System)
• ICMP (Internet Control Message Protocol)
• DHCP (Dynamic Host Configuration Protocol)
• FTP (File Transfer Protocol)

DNS
DNS is a database for domain names. It is similar to the contacts list on a mobile phone. The contacts list
matches people’s (or organization’s) names with phone numbers. DNS functions like a contacts list for the
internet.
DNS translates human-readable domain names (for example, www.amazon.com) to machine-readable IP
addresses (for example, 192.0.2.44). DNS servers automatically map IP addresses to domain names.
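
A quick way to see DNS resolution in practice is the standard socket module in Python; the resolved address
will vary:

    import socket

    # Resolve a human-readable domain name to an IP address through DNS.
    print(socket.gethostbyname("www.amazon.com"))   # prints an IP address such as 192.0.2.44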

ICMP
Network devices use ICMP to diagnose network communication issues and generate responses to errors in IP
networks. A good example is the ping utility, which uses an ICMP request and ICMP reply message. When a
certain host or port is unreachable, ICMP might send an error message to the source.
DHCP
DHCP automatically assigns IP addresses, subnet masks, gateways, and other IP parameters to devices that are
connected to a network.
Some examples of DHCP options are router (default gateway), DNS servers, and DNS domain name.

FTP
FTP is a network protocol that is used to transfer files from one computer to another. FTP performs two
basic functions: PUT and GET. If you have downloaded something such as an image or a file, then you probably
used an FTP server.

Network Utilities

When you work with networks, it is important to check network performance, bandwidth usage, and network
configurations.
The following list contains a few common network utilities that you can use to quickly troubleshoot network
issues. These tools can help ensure uninterrupted service and prevent long delays.

Examples of common network utilities include the following:


• ping tests connectivity. This tool tests whether the remote device (server or desktop) is on the network.
• nslookup queries the DNS and its servers. It shows the IP addresses that are associated with a given domain
name.
• traceroute permits users to see the networking path that is used. It is helpful for troubleshooting
connectivity problems.
• telnet is used for service response. This tool tests whether the service that runs on the remote device is
responding to requests.
• hping3 is a command line-oriented TCP/IP packet assembler and analyser that measures end-to-end packet
loss and latency over a TCP connection.
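
Where telnet is not available, a short Python sketch can perform a similar service-response test; the host and
ports below are examples only:

    import socket

    def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
        """Return True if a TCP service answers on host:port (similar to a telnet test)."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    print(port_is_open("example.com", 443))   # True if the HTTPS service responds
    print(port_is_open("example.com", 23))    # likely False: the Telnet port is usually closed
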
ADDITIONAL NETWORKING TECHNOLOGIES

Wireless Technologies

Examples of wireless technologies

• Wired Equivalent Privacy (WEP): WEP offers wireless protection and added security to wireless networks by
encrypting data.
• Wi-Fi Protected Access (WPA): WPA was introduced as a replacement for WEP. Although WEP and WPA
share similarities, WPA offers improvements in handling security keys and user authorization, and it uses a
256-bit key.
• Bluetooth Low Energy (BLE): BLE optimizes energy consumption. You might also hear Bluetooth Low Energy
referred to as BLE, Bluetooth LE, or Bluetooth Smart. BLE technology is primarily used in mobile applications,
and it is suitable for IoT. It was initially developed for the periodic transfer of small chunks of data over short
ranges. BLE is used in solutions that span a range of domains, including healthcare, fitness, beacons, security,
and home entertainment.
• 5G cellular systems: The complete rollout of this technology will take a couple of years. It will eventually
provide download speeds up to 10 Gbps.

Internet of Things (IoT)

The Internet of Things refers to physical devices, or things, that are connected to the internet so that they can
share and collect data.
The primary goal of IoT is for devices to self-report in real time, which can improve efficiency. It can also
surface important information more quickly than a system that depends on human intervention.

What is IoT?

• Is about the ability to transfer data over a network without requiring human-to-human or human-to-
computer interaction
• Is about expanding product capabilities (usage)
• Is about creating value from generated data (analysis)

What does IoT do?

• Connects physical items that have built-in connectivity capabilities


• Generates enormous amounts of data

Examples:
• Smartphones
• Wearables (smart watches, smart glasses, and others)
• Connected cars
• Thermostats

How IoT devices communicate

A device can be thought of as a combination of sensors and actuators that gather vast amounts of information
from its environment. An example is a temperature sensor that captures the temperature-related details from
a living room (which is its environment).
These devices are purpose built and do not come with many compute abilities. For these devices to
communicate, they use lightweight protocols such as MQ Telemetry Transport (MQTT), which do not have a
big footprint on the device.
The data from the device is typically sent to a device gateway, which then pushes this data onto the cloud-
based platform by using the internet (HTTPS). As soon as the data gets to the cloud, software performs some
kind of processing on it.
This processing could be very simple, such as checking that the temperature reading is within an acceptable
range. It could also be complex, such as using computer vision on video to identify objects (such as intruders in
your house).
This data is processed in the platform, and actions are run in a rule-based fashion. Among many other
offerings, the platform mainly provides services for data management, analytics, support enhancements of the
IoT application, and a user interface.
The user interface, typically a mobile device or a computer, is your window to the IoT world.

AWS IoT Core

What is AWS IoT Core?

• You can use AWS IoT Core to connect billions of IoT devices and route trillions of messages to different AWS
services.
• With AWS IoT Core, users can choose their preferred communication protocol.
• Communication protocols include MQTT, Secure Hypertext Transfer Protocol (HTTPS), MQTT over
WebSocket Secure (WSS), and long-range wide area network (LoRaWAN).
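
The following is a hedged sketch of publishing a message through AWS IoT Core with the AWS SDK for Python
(boto3), assuming boto3 is installed and credentials are configured; the Region, topic name, and payload are
hypothetical:

    import json
    import boto3

    # The IoT data plane client publishes MQTT messages through AWS IoT Core.
    iot_data = boto3.client("iot-data", region_name="us-east-1")   # example Region

    # Hypothetical topic and payload for a temperature sensor reading.
    iot_data.publish(
        topic="home/livingroom/temperature",
        qos=1,
        payload=json.dumps({"deviceId": "sensor-01", "temperatureC": 21.5}),
    )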

Devices communicate with cloud services

Devices communicate with cloud services by using various technologies and protocols. Examples include:
• Wi-Fi and broadband internet
• Broadband cellular data
• Narrow-band cellular data
• Long-range wide area network (LoRaWAN)
• Proprietary radio frequency (RF) communications

Enterprise Mobility

Enterprise mobility is a growing trend for businesses. This approach supports remote-working options, which
use your personal laptops and mobile devices for business. Remote workers can connect and access data
through cloud technology.

Two specific solutions enable enterprise mobility:


• Bring your own device (BYOD) is the use of a personal device, such as a mobile phone or tablet, on a public
or private network. This solution includes the use of a personal device on a corporate network.
• Mobile device management (MDM) is a term that describes the management of mobile devices, but it also
applies to laptop and desktop computers. Organizations use MDM to provide devices with settings, software,
and access to data in a secure way that complies with their needs.
Amazon WorkSpaces

Customers can use Amazon WorkSpaces to provision virtual, cloud-based Microsoft Windows or
Amazon Linux desktops, known as WorkSpaces, for their users.

Amazon WorkSpaces eliminates the need to procure and deploy hardware or install complex
software. Customers can quickly add or remove users as their needs change. Users can access
their virtual desktops from multiple devices or web browsers.

Amazon WorkSpaces is:

• Simple to manage: Customers can deploy and manage applications for their WorkSpaces by using Amazon
WorkSpaces Application Manager (Amazon WAM). They can also use the same tools to manage WorkSpaces
that they use to manage on-premises desktops.

• Secure: Amazon WorkSpaces uses either AWS Directory Service or AWS Managed Microsoft AD to
authenticate users. Customers can add multi-factor authentication (MFA) for additional security. They can use
AWS Key Management Service (AWS KMS) to encrypt data at rest, disk I/O, and volume snapshots. Customers
can also control the IP addresses of users that are allowed to access their WorkSpaces.

• Scale consistently: Customers can increase the size of the root and user volumes for a WorkSpace, up to
1000 GB each. They can expand these volumes whether they are encrypted or unencrypted. Customers can
request a volume expansion one time in a 6-hour period.

To ensure that data is preserved, customers cannot decrease the size of the root or user volumes after they
launch a WorkSpace

Amazon WorkSpaces use cases

The use cases for Amazon WorkSpaces are nearly endless, but some of the most common use cases include:

• Employees who work from home or remote employees


• Contract workers
• Employees who work from multiple devices, including personal devices
• Developers after a merger and acquisition
CREATING NETWORKING RESOURCES IN AN AMAZON VIRTUAL PRIVATE CLOUD (VPC)

- Creating the VPC


- Creating Subnets
- Create Route Table
- Create Internet Gateway and attach Internet Gateway
- Add route to route table and associate subnet to route table
- Creating a Network ACL
- Creating a Security Group
- Launch EC2 instance and SSH into instance
- Use SSH to connect to an Amazon Linux EC2 instance
- Use ping to test internet connectivity
AWS re/start

Security
INTRODUCTION TO SECURITY

Security Basics
Security is the practice of protecting valuable assets.

• Assets can be physical or digital and include people, buildings, computers, software applications, and data.
• Cybersecurity is concerned with protecting networks, devices, systems, and digital information from the
following:
– Unauthorized access
– Malicious modification, theft, or destruction
– Disruption of intended use

• The primary goal of cybersecurity is to ensure the confidentiality, integrity, and availability of digital
information.

Confidentiality, integrity, and availability (CIA)

• Confidentiality protects the privacy of the information by preventing unauthorized access to it. A common
method to ensure confidentiality, for example, is to first ask users to identify themselves before they are
allowed to use a system. This process is known as authentication.
• Integrity ensures that the information is always accurate and correct where it is stored and whenever it is
moved. The data cannot be altered by unauthorized users as it moves inside and outside its containing system
or when it reaches its final storage location. Hashing is an example of a technique that can be used to ensure
that data has not been tampered with during transit.
• Availability ensures that the information is accessible to users when they need it. Businesses typically
address availability requirements by creating plans such as a business continuity plan (BCP) and a disaster
recovery plan (DRP). These plans define processes and procedures to maintain or quickly restore the
availability of the systems containing the information in the event of failure or disruption.

Basic Security Terms


Types of Threat

Threat type | Description | CIA attribute affected

Malware | Malicious software designed to disrupt the operation of a computer system, gain unauthorized access
to it, or collect sensitive information from it. Examples: virus, spyware, worm, remote access Trojan (RAT). |
Confidentiality (virus, spyware, or RAT); Integrity (virus or RAT); Availability (virus, worm, or RAT)

Ransomware | Malicious code that restricts access to a computer or its data until a ransom is paid. | Availability

Denial of service (DoS) | Attack that prevents authorized users from accessing a system. | Availability

Man-in-the-middle (MitM) | Attack in which the attacker intercepts the communication between two parties and
impersonates one of the parties or modifies the communication between them. | Confidentiality; Integrity

Phishing | Technique in which the attacker masquerades as a legitimate person or business and uses email or a
website to get personal information, such as passwords or credit card numbers. | Confidentiality

Social engineering | Technique in which the attacker uses human interaction to manipulate a person into
revealing sensitive information or breaking security procedures to gain access to systems or information. |
Confidentiality

Some examples of malware include the following:

• Virus: A program that can corrupt or delete data and propagate itself from one system to another.
• Spyware: Code that secretly gathers information on a system and reports it to the attacker.
• Worm: A program that spreads itself and consumes resources destructively on a computer.
• Remote access Trojan (RAT): A software tool used to gain unauthorized access to a computer in order to
control it.
Security Strategy

AWS Cloud security shared responsibility model

What are security controls?

Security controls are measures that protect against threats and eliminate vulnerabilities. There are three types
of security controls: preventive, detective, and corrective. For each type of control, you can implement
physical, technical, and administrative security measures to ensure information confidentiality, integrity, and
availability.

A preventative security control protects a system from security threats before they can happen. A detective
security control helps find a vulnerability early or quickly alert when a breach has happened. A corrective
security control remediates a security breach.

Each type of control provides protection in three different security areas: physical, administrative, and
technical. A physical control is a device or object, such as a security camera. An administrative control is
usually a policy or a procedure that must be followed. Finally, a technical control is usually some software that
provides security functions.
Security lifecycle

An effective security strategy addresses security in stages of a lifecycle. These stages consist of prevention,
detection, response, and analysis. Note that the first three stages correspond to the three types of security
controls.

In the prevention stage, you identify the assets to be protected, assess their vulnerabilities, and implement
measures to remove any discovered vulnerability.

In the detection stage, you implement monitoring solutions to quickly identify and generate alerts if a breach
is detected.

In the response (or corrective) stage, you perform the corrective tasks to eliminate the breach and restore
normal operations.

Finally, in the analysis stage, you review the steps used to resolve the issue and identify any lessons learned. If
necessary, you update your security policies and procedures to make adjustments based on the result of the
analysis.
SECURITY LIFECYCLE: PREVENTION

Prevention in the security lifecycle


Prevention stops threats before they happen.

• Prevention is the first phase of the security lifecycle.


• During this phase, you have the opportunity to take proactive action to defend against threats.
• Prevention tasks include the following:
– Identifying assets to be protected
– Assessing asset vulnerability
– Implementing countermeasures

Prevention strategy

Layered security model


An effective security prevention strategy uses a layered defence model.

• Each layer offers a different level of defence for the assets


• Perimeter security: Secures the perimeter networks by using controls such as firewalls or an intrusion
prevention system (IPS)
• Network security: Prevents unauthorized network access by using network access control lists (network
ACLs), for example
• Endpoint security: Uses software such as an antivirus program to protect a host
• Application security: Protects applications with specialized firewalls, and monitoring and scanning tools
• Data security: Protects access to data through identity and access management

Layered defence example: OSI model

• Physical layer: Network devices and equipment are protected from physical access to keep intruders out.
• Data link layer: Filters applied to network switches help prevent attacks based on media access control
(MAC) addresses.
• Network and transport layers: Implementing firewalls and access control lists (ACLs) helps to mitigate
unauthorized access to internal systems.
• Session and presentation layers: By using authentication and encryption methods, you can prevent
unauthorized data accesses.
• Application layer: Solutions, such as virus scanners and an IDS, help protect applications.
Types of prevention measures

Network hardening measures


Implement controls to stop threats at the network level.

• Network discovery hardening


– Block network exploration protocols.
– Close unused ports.
– Maintain an accurate and up-to-date asset inventory that identifies the list of the devices that are
allowed on your network.
• Network security architecture hardening
– Use firewalls.
– Use an intrusion prevention system (IPS).
– Segment your network.

Systems hardening measures


Implement controls to stop threats at the host level.

• Hosts include workstations, servers, or other devices that run services and applications or store data.
• Examples of systems hardening measures include the following:
– Apply operating system (OS) patches and security updates regularly.
– Remove unused applications and services.
– Monitor and control configuration changes.

Data security controls


Implement controls to protect the data.

• Encrypt data in transit and data at rest as needed.


• Use digital certificates to protect information.
• Use data integrity checking tools.
• Use role-based access control.

Identity management
Implement controls for user authentication and authorization.

• Use the principle of least privilege to control access to resources.


• Set up a policy that enforces password strength and password expiration.
• Use the principles of authentication, authorization, and accounting (AAA)
PREVENTION: NETWORK HARDENING

Network hardening is the activity in the layered security prevention strategy that focuses on protecting the
network.
• The goal is to stop the unauthorized access, misuse, modification, or destruction of a computer network and
its resources. Network hardening combines policies and procedures with hardware and software solutions to
achieve this goal.

Network security threats

• A network security threat is any attempt to expose, alter, disable, or gain unauthorized access to an
organization’s network. Its purpose is to steal data or perform a malicious activity.
• Network attacks start by discovering information about a network and then exploiting a vulnerability in the
network.
• Types of network security discovery threats include the following:
– Network mapping
– Port scanning
– Traffic sniffing

Network mapping
Network mapping exposes the topology of a network.

• Attackers can use it to find out which devices and hosts are present in the network.
• Examples of network mapping commands and tools include the following:
– ping: Determines whether a host is reachable and reveals its IP address
– traceroute: Identifies the network path and devices that a message traverses to reach a host
destination
– Nmap: Discovers which hosts are on a network

Port scanning
Port scanning exposes the available protocols and services in a network.

• Port scanning sends packets sequentially to ports on a host to determine which ports are open.
• Attackers can use it to find out which protocols and services are implemented on the network.
• Nmap is an example of a port scanning tool and does the following:
– Determines which protocols are supported by a host
– Determines which ports are open on a host
– Identifies which services are connected to open ports
Traffic sniffing
Traffic sniffing exposes the information that is traveling through a network.

• Traffic sniffing reads the data in all of the packets that pass through a network interface card (NIC) or
network device.
• Attackers can use it to read any unencrypted data that is passing through a network.
• Wireshark is an example of a traffic sniffing tool.
– It captures network traffic data for multiple protocols.
– You can use it to interactively browse the data.
– It saves the data in multiple formats.

Network discovery hardening

Preventing network discovery


The goal of network discovery hardening is to keep attackers off the network.

• To protect against network mapping and port scanning, restrict the use of, or disable, network discovery
protocols.
– Internet Control Message Protocol (ICMP)
– Simple Network Management Protocol (SNMP)
• To protect against traffic sniffing, consider the following measures:
– Disable promiscuous mode on NICs
– Use switches instead of hubs in a network
– Encrypt sensitive data in transit
• In the AWS Cloud, use the Amazon Inspector service to discover unintended network exposure vulnerabilities.

Other network discovery countermeasures

• Monitor the network for suspicious activities:


– Record the traffic that is entering the network.
– Watch for unknown IP addresses.
– Monitor for ports that are being scanned sequentially.

• Limit remote administration access:


– Limit protocols that are used for remote administration.
– Limit locations from where remote administration can be done.
– Implement an authentication, authorization, and accounting (AAA) policy to limit who can access
network devices.

Network architecture hardening


Network architecture hardening increases the security of a network through design improvements. Network
architecture hardening is achieved through the following:

• Adding security components to the network


– Network firewalls
– Intrusion prevention systems (IPSs)

• Segmenting a network
– Creating private subnets
– Using network access control lists (network ACLs)
Network firewall
A network firewall is a protection mechanism to filter incoming and outgoing traffic in a network.

• A network firewall allows some packets to pass and blocks others based on the following:
– Source and destination IP address
– Source and destination port number
– Access protocol type
• A network firewall allows only authorized access inside the network.
• A network firewall can be a hardware device or installed as software.

Network firewall best practices

• When configuring the firewall, consider the following:


– Start by explicitly denying all traffic and then permit only needed traffic.
– Block traffic that is directed to network control devices unless it originates from a trusted network.
– Log all exceptions.
• Place the firewall as close to the traffic source as possible.
– Internet boundary
– Internal network segment boundary
• Supplement network firewalls with application firewalls.

AWS security groups


In the AWS Cloud, a security group implements a firewall to protect EC2 instances.

• Security groups:
– Act like a built-in firewall for instances
– Are associated with network interfaces on an instance
– Define allow rules that determine the access to an instance
» Inbound traffic rules
» Outbound traffic rules
– Are stateful (if you allow a certain type of traffic into an instance, the same type of traffic is allowed
out of the instance).

• Security group rules are based on the following:
– Protocol
– Port number
– Source and destination IP address
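
A minimal boto3 sketch of adding an inbound rule to a security group is shown below; the security group ID and
source CIDR are placeholders:

    import boto3

    ec2 = boto3.client("ec2")

    # Allow inbound SSH (TCP port 22) from a single administrative IP address.
    ec2.authorize_security_group_ingress(
        GroupId="sg-0123456789abcdef0",
        IpPermissions=[
            {
                "IpProtocol": "tcp",
                "FromPort": 22,
                "ToPort": 22,
                "IpRanges": [{"CidrIp": "203.0.113.10/32", "Description": "Admin workstation"}],
            }
        ],
    )
    # Because security groups are stateful, the SSH replies are allowed out automatically.
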

Intrusion prevention system (IPS)


An IPS actively protects a network against threats.

• Monitors network traffic, detects threats, and automatically defends against them
• Uses different types of threat detection mechanisms including the following:
– Anomaly-based detection: The IPS compares the current traffic pattern against established
baselines for any deviation.
– Signature-based detection: The IPS monitors and analyzes the traffic for known patterns of attack.
• Can be a hardware or software solution
• Is usually placed behind a network firewall
Segmenting a network
You can use network segmentation to apply different security controls to different parts of a network.

• Segmenting creates multiple smaller logical networks from a large network. Each logical network is called a
subnet.
• Each subnet is assigned a contiguous subset of the IP addresses of the large network.
• Each subnet can be configured with its own security controls to meet the requirements of the different types
of resources in the network.
• Classless Inter-Domain Routing (CIDR) notation is used to specify subnet IP address ranges. This notation
provides a shorthand for describing the size of a network.
• Other benefits of segmentation include the following:
– Easier network management
– Improved network performance

Network access control list (network ACL)


A network ACL acts like a firewall. In the AWS Cloud, it is used to protect a subnet.

• A network ACL:
– Filters inbound and outbound network traffic.
– Is typically implemented in a switch or router.
– Is stateless (inbound and outbound rules are independent of each other).
• In the AWS Cloud, a network ACL:
– Is associated with a subnet in a VPC.
– Allows or denies traffic in and out of a subnet based on rules.
– Hardens security as a secondary level of defence at the subnet level.
PREVENTION: SYSTEMS HARDENING

Systems hardening is the action of securing computing systems to protect them from attacks.

• Reduce the number of running services on a system.


• Use tools to accomplish systems hardening.
• Secure computing systems with the goal of making them hack-proof.

Authentication, authorization, and accounting

Physical security
Physical security is essential to systems hardening.

• Restrict physical access to facilities.


• Design buildings against natural or manmade disasters.
• Make physical security the base of all other security principles

Security baselines
A baseline defines the expected conditions of a system.

• Provides a starting point for determining what and how to secure


• Is updated to reflect changes that are made on systems
• Includes enhancements
• Relies on updated documentation about your system

How to harden systems

• Turn off unnecessary services.


• Control computer operations through group policies.
• Regularly apply patches and updates.

Patching

A patch:
• Is applied on a system where a weakness was discovered
• Fixes a performance or feature issue
• Reduces the types of methods that can infiltrate the system
• Makes a system more reliable and secure
• Comes as an update for the software or as part of a collection of updates (service pack)
Common systems hardening recommendations

Client:
• Turn on antivirus and firewalls
• Run fewer applications
• Apply updates when they are released
• Limit removable media
• Control downloads
• Restrict terminal services
• Monitor the environment

Server:
• Restrict physical access
• Use dedicated roles
• Secure file systems
• Use encryption and PKI
• Use alerts
• Apply updates when they are released
• Limit administrative access

Software application hardening

Examples of application hardening:


• Patching standard and third-party applications automatically
• Using application firewalls
• Using antivirus, malware, and spyware protection applications
• Using software-based data encryption
• Using CPUs that support Intel Software Guard Extensions (SGX)
• Using a specific application that manages and encrypts passwords for improved password storage

Training and education


Training and education are the most effective defences to harden systems against social engineering and
phishing attacks.

• Train employees on policies, and then enforce the policies.


• Policies always come first.
• Get management on board.
• Enforce consequences for noncompliance.

Systems hardening tools

AWS Tools

Trusted Advisor provides recommendations that help you follow AWS best practices. Trusted Advisor
evaluates your account by using checks.
GuardDuty is a threat detection service. GuardDuty continuously monitors your AWS accounts and workloads
for malicious activity and delivers detailed security findings for visibility and remediation.
Shield is a managed DDoS protection service that safeguards applications that run on AWS.
CloudTrail offers auditing, security monitoring, and operational troubleshooting by tracking user activity and
API usage.
PREVENTION: DATA SECURITY

Data in motion compared to data at rest

• Data in motion travels from and to the internet; from and to devices such as smartphones, servers, and
personal computers; or directly between these devices.
• Data at rest stays inside devices, such as smartphones, servers, USB keys, and hard drives.

You should use cryptographic techniques, encryption, and controls to secure data based on whether it is in
motion or at rest.

Cryptography and encryption

• Cryptography is the discipline that embodies the principles and techniques for providing data security,
including confidentiality and data integrity.
• Encryption is the process of using a code, called a cipher, to turn readable data into unreadable data for
another party. The cipher contains both algorithms to encrypt and to decrypt the data.
The goal of encryption is to achieve data confidentiality.
• A key is a series of numbers and letters that the algorithm uses to encrypt and decrypt data. Only the owners
of the keys can encrypt and decrypt data.

Encryption

• Uses of encryption:
– Algorithms to encrypt and decrypt data
– A secret key to ensure that only the key owners can encrypt and decrypt the data
• Types of encryption: symmetric, asymmetric, and hybrid

Symmetric encryption uses the same key to encrypt and decrypt the data. The key is a shared secret between
the sender and the receiver. Symmetric encryption is fast and reliable and is used for bulk data.
Asymmetric encryption uses both a private key and a public key (a key pair) to encrypt and decrypt the data.
Every user in the conversation has a key pair. Asymmetric encryption is more complex and much slower than
symmetric encryption. However, it provides more capabilities in the way that keys are managed.
A hybrid encryption approach uses both symmetric encryption and asymmetric encryption to protect the data
further.

• Encryption: Symmetric encryption is fast and straightforward; asymmetric encryption is complex and
time-consuming.
• Process speed: Symmetric is fast (even for large amounts of data); asymmetric is slow.
• Keys: Symmetric uses one key (128 or 256 bits); asymmetric uses two keys (length can be 2,048 bits or higher).
• Level of security: Symmetric is extremely secure, with a risk of compromise if the shared key is lost;
asymmetric provides additional security services because the key is not shared.
• Manageability: Symmetric becomes complex with more keys; asymmetric includes an easy-to-manage key system.
• Security services provided: Symmetric provides only confidentiality; asymmetric provides non-repudiation,
authentication, and more.
• Use cases: Symmetric is used to securely transmit large amounts of data or encrypt databases; asymmetric is
used for authentication or digital signatures.
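
As a small illustration of symmetric encryption, the following sketch uses the third-party Python cryptography
package (an assumption that it is installed); the same key encrypts and decrypts:

    from cryptography.fernet import Fernet

    # Generate a shared secret key; both sender and receiver must hold this same key.
    key = Fernet.generate_key()
    cipher = Fernet(key)

    token = cipher.encrypt(b"Account number: 1234-5678")   # unreadable without the key
    print(cipher.decrypt(token))                            # b'Account number: 1234-5678'
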
AWS CloudHSM and AWS KMS

AWS CloudHSM is a cloud-based hardware security module (HSM). You can use CloudHSM to generate and use
your own encryption keys on the AWS Cloud.

With AWS Key Management Service (AWS KMS), you can create and manage cryptographic keys and control
their use across a wide range of AWS services and in your applications.
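
A minimal boto3 sketch of using AWS KMS to encrypt and decrypt a small piece of data (KMS Encrypt is intended
for small payloads such as data keys); the key alias is a placeholder:

    import boto3

    kms = boto3.client("kms")

    # 'alias/demo-key' is a placeholder for a customer managed KMS key in your account.
    encrypted = kms.encrypt(KeyId="alias/demo-key", Plaintext=b"database password")
    ciphertext = encrypted["CiphertextBlob"]

    # KMS records which key produced the ciphertext, so decrypt needs only the blob.
    decrypted = kms.decrypt(CiphertextBlob=ciphertext)
    print(decrypted["Plaintext"])          # b'database password'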

Hashing
Hashing is a one-way function that creates a signature (digest) of a file or message.

What is data integrity?

• Data integrity means ensuring that the data remains accurate and consistent when it is stored or travels over
a network.
• The data must not have been corrupted or tampered with.
• The data that you receive remains the same as the data that was sent.
• One way to determine data integrity is by using a hash mechanism.

Ensuring data integrity with hashing

• Hashing is used to ensure data integrity.


• A hash function generates a unique hash value or message digest from the content of a file or message.
• Recipients of the file or message can use the hash value to verify that the content has not changed during
transit.
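
A short Python sketch with the standard hashlib module illustrates the idea; any change to the content produces
a different digest:

    import hashlib

    message = b"Quarterly report v1"

    # The sender computes and shares the SHA-256 digest of the file or message.
    digest_sent = hashlib.sha256(message).hexdigest()

    # The recipient recomputes the digest; any change to the content changes the hash.
    digest_received = hashlib.sha256(b"Quarterly report v2").hexdigest()
    print(digest_sent == digest_received)   # False: the content was altered in transit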

Permissions
A permission grants a specific type of access to a resource (for example, write access to a file). Permissions are
classified into two types: discretionary (based on identity or other criteria) and role-based (based on an
assigned role).

A permission is assigned to a subject (a person, device, or system) to give the subject the resource access
ability defined by the permission.
PREVENTION: PUBLIC KEY INFRASTRUCTURE

Public key infrastructure (PKI) is a collection of technologies that are used to apply cryptography principles to
transfer information securely between two entities. It is based on a practical distribution and implementation
of keys, with a set of tools to achieve confidentiality, integrity, non-repudiation, and authenticity.

PKIs are used to implement public key encryption and also to manage the certificates that are associated with public keys.
Certificates are digital documents that are used in PKI to prove the ownership of a public key. Certificates
contain information about the entity that provided and verified the certificate, the entity to which the
certificate belongs, and the public key.

The entity that issues the certificate is called the issuer or the certificate authority. The entity that receives the
certificate is called the subject.

Enabling trust

Trust
• Prevents rogue systems from inserting themselves between two computers that would like to exchange information
• Is achieved through the exchange of public keys that validate and identify the parties

Public keys
• Ensure that trust exists throughout your entire hierarchy
• Are located in the following:
– System that is requesting a certificate
– System that is offering a service

PKI components

• The certificate authority is the entity that delivers the certificate.


• Certificates are documents that contain information about the certificate issuer, the entity that receives the
certificate (subject), and the public key.
• Revocation lists are lists that contain certificates that have been invalidated, which means that these
certificates cannot be trusted anymore.
• Registration authorities are entities (organizations or companies) that verify requests for certificates. If a
request is valid, they tell the certificate authority to provide a certificate.
• Entities are organizations or companies that are asking for certificates or verifying that the certificate is not
on a revocation list.
• Certificate templates are models that are used for the certificates.

Certificates
Digital certificates are electronic credentials that are used to represent online identities of individuals,
computers, and other entities on a network. Digital certificates are like personal identification cards.

Two types of certificates are available: certificates signed by a CA and self-signed certificates.

A certificate with a public key and the corresponding private key can be used for encryption and decryption.
When only the public key is used, the certificate establishes trust and performs encryption.
PREVENTION: IDENTITY MANAGEMENT

What is identity management?

• It is the active administration of subjects, objects, and their relationships regarding access permissions.
• It ensures that identities receive appropriate access to resources.
• It ensures that systems remain scalable in granting access to resources.

Identity management principles


Authentication, authorization, and accounting (AAA) are the primary principles of identity management.

• Authentication is concerned with proving and validating a user’s or application's identity.


• Authorization is the process of determining what permissions the user and applications have.
• Accounting establishes auditing measures by logging access, commands, and changes that users and
applications perform.

Authentication

Authentication factors can be categorized as the following:


– Something you know: For example, a password
– Something you have: For example, a smart card
– Something you are: For example, your fingerprint

Multi-factor authentication (MFA) is an authentication method that requires multiple methods or ways of
authentication.

Dictionary attacks
A dictionary attack attempts to systematically enter each word in the dictionary as a password until it finds a
match. Countermeasures for dictionary attacks include enforcing a strong password policy and locking out
access after a fixed number of unsuccessful attempts.

Password managers

• Operate over a centralized authentication system


• Improve security by requiring extra login steps
• Allow password resets
• Manage services that are used with specific credentials
• Store personal passwords on a local system

Single sign-on (SSO)


With single sign-on (SSO), users log in once and gain access to different applications without the need to re-
enter login credentials for each application.
AWS Single Sign-On

• A cloud-based service that you can use to centrally manage SSO access to all Amazon Web Services (AWS)
accounts, including user permissions and AWS Organizations

AWS SSO includes the following common features:


• One-click access to AWS accounts and cloud applications
• Ability to create and manage users and groups
• Compatibility with common cloud applications
• Compatibility with existing AWS Identity and Access Management (IAM) roles, users, and policies

Federated users
Federated users are a type of SSO implementation that is used between web identities. A token is used to verify
user identity between distinct systems.

With SSO, individuals can sign into different networks or services by using the same group or personal
credentials. For example, by using SSO, you can use your Google account credentials to sign into Facebook.

Amazon Cognito

• It is an Amazon service that provides user management, authentication, and authorization for your web and
mobile apps.
• You can use Amazon Cognito to let users sign in directly with a user name and password or through a third party, such as a social identity provider.

Amazon Cognito includes two main components:


• User pools provide sign-up and sign-in options for app users.
• Identity pools grant users access to AWS services.

AWS Identity and Access Management (IAM) is a service that helps you control access to AWS resources in a
secure way by using authentication and authorization.
PREVENTION: AWS IDENTITY AND ACCESS MANAGEMENT (IAM)

IAM is a service that helps securely control access to AWS resources. You can use it to manage access to AWS
services and resources securely. Using IAM, you can create and manage AWS users and groups (to support
authentication). You can also use IAM for permissions to allow or deny their access to AWS resources (to
support authorization).

IAM uses access control concepts that you already know—such as users, groups, and permissions—so that you
can specify which users can access specific services.

Authentication
Use IAM to configure authentication, which is the first step because it controls who can access AWS resources.
IAM is used for user authentication, and applications and other AWS services also use it for access.

Authorization
IAM is used to configure authorization based on the user. Authorization determines which resources users can
access and what they can do to or with those resources. Authorization is defined through the use of policies. A
policy is an object in AWS that, when associated with an identity or resource, defines their permissions.

IAM reduces the need to share passwords or access keys when granting access rights to other people or
systems. It also makes it easy to enable or disable a user’s access.

Use IAM to centrally manage access regarding who can launch, configure, manage, and delete resources. It
provides granular control over access permissions for users, systems, or other applications that might make
programmatic calls to other AWS resources.

IAM features

Identity federation is a system of trust between two parties. Its purpose is to authenticate users and convey
the information needed to authorize their access to resources. In this system, an identity provider (IdP) is
responsible for user authentication. A service provider (SP), such as a service or an application, controls access
to resources.
Security credentials

Types of credentials | Association

Email address and password | Associated with an AWS account (root)
IAM user name and password | Used for accessing the AWS Management Console
Access and secret access keys | Typically used with the AWS Command Line Interface (AWS CLI) and programmatic
requests, such as application programming interfaces (APIs) and software development kits (SDKs)
Multi-factor authentication (MFA) | Used to provide an extra layer of security; can be enabled for the AWS
account root user and IAM users
Key pairs | Used for only specific AWS services, such as Amazon EC2

IAM: Authorization

Authorization has the following characteristics:


• You allow users to access AWS services by granting authorization.
• You assign permissions by creating an IAM policy.
• Permissions determine which resources and operations are allowed to be used.

IAM is global. It is not on a per-Region basis. It applies across all AWS Regions.

IAM policies
An IAM policy is a formal statement of one or more permissions.

• Attach a policy to any IAM entity, such as a user, group, or role.


• Policies authorize the actions that an entity might or might not perform.
• A single policy can be attached to multiple entities.
• A single entity can have multiple policies attached to it.

Best practice
When you attach the same policy to multiple IAM users, put the users in a group and attach the policy to the
group instead.
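
The following is a hedged boto3 sketch of creating a policy and attaching it to a group, in line with the best
practice above; the policy name, bucket, and group name are hypothetical:

    import json
    import boto3

    iam = boto3.client("iam")

    # A hypothetical least-privilege policy: read-only access to one S3 bucket.
    policy_document = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": ["arn:aws:s3:::example-bucket", "arn:aws:s3:::example-bucket/*"],
            }
        ],
    }

    response = iam.create_policy(
        PolicyName="ExampleS3ReadOnly",
        PolicyDocument=json.dumps(policy_document),
    )

    # Attach the policy to a group rather than to individual users.
    iam.attach_group_policy(
        GroupName="Analysts",
        PolicyArn=response["Policy"]["Arn"],
    )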
DETECTION

Antivirus software

The threat of malware


Malware is an application that causes harm to a computer system. It interrupts one or many of the CIA triad
elements: confidentiality, integrity, or availability. Knowledge of malware, how to avoid infection, and how to
respond to corrupted systems are key elements of security management.

The following are types of malware:


• Viruses – Viruses are programs that can corrupt or delete data and propagate themselves from one system
to another.
• Worms – Worms are programs that spread themselves and consume resources destructively on a computer.
They have no executable file and rely on application weaknesses to deploy themselves. The author of a worm
can control the infected computer remotely. Worms can be difficult to isolate because they spread quickly.
Examples include MyDoom, Sobig, and Stuxnet.
• Bots – Bots are used to control computers or launch distributed denial of service (DDoS) attacks against
vulnerable systems. An example is Poison Ivy.
• Backdoors – A backdoor (also known as a Trojan horse) is often a secret server that steals information from
the victim’s system. It allows an intruder into a system. You can know about the backdoor if you scan the
system and the network to find patterns of traffic. Examples include Sub7, GirlFriend, and Zeus.
• Rootkits – A rootkit cloaks itself by replacing system files that can reveal its presence. It is used to retrieve
information. It is difficult to identify and remove because it can become part of the operating system. Removal
often requires a system reformat. An example is Hacker Defender.
• Spyware – Spyware jeopardizes privacy and typically comes embedded into applications that look free and
interesting to use. As people are doing more finance and other personal activities online, these activities can
be detected and revealed, and information can be stolen. An example is Real-time spy.
• Adware – Adware deploys advertising content and monitors user activity, such as visited websites. It is
similar to spyware, but it focuses on ads and what a user clicks. Adware is often embedded in shareware
applications. An example is Fireball.
• Ransomware – Ransomware locks systems or makes data unavailable until the user pays a ransom.

Malware infects a system through different methods, including the following:


• Untrusted websites – Untrusted websites are websites whose identity can’t be identified and might have
malicious intent.
• Removable devices – These devices can be used to infect a system. For example, a USB device is mailed to
you. You open it, and it contains a backdoor that gives remote access to your system to an unauthorized user.
• Emails – An email can have attachments with viruses or malware.

What is antivirus software?


Antivirus software is a specialized program that prevents, detects, and removes malware.

• Built in as part of the operating system (OS) or developed by third-party vendors


• Uses malware signature definitions
• Scans a computer’s memory and file system for matches against the malware definitions
• Removes identified malware
The following are some best practices when using an antivirus program:
• Regularly update your antivirus or anti-malware software.
• Frequently scan your system.
• Scan incoming communications (for example, emails and attachments).

Intrusion detection system (IDS)


An intrusion detection system (IDS) is a hardware or software solution that monitors a network or a computer
system to detect intrusions or malicious activity. When this kind of activity happens, the IDS generates alerts to
notify security personnel.

An IDS can detect an attack by using different mechanisms, including the following:
• Anomaly-detection based – The IDS compares the current traffic pattern or system activity against
established baselines for any deviation.
• Signature-based detection – The IDS monitors and analyses the traffic for known patterns of attack.

There are two main types of intrusion detection systems:

• Network-based intrusion detection system (NIDS):


– Monitors network traffic, detects threats, and raises alerts
– Is installed on the network

• Host-based intrusion detection system (HIDS):


– Monitors logs and critical files on the server, detects threats, and raises alerts
– Is installed on a server

Amazon GuardDuty
GuardDuty is a threat detection service that continuously monitors your AWS accounts and workloads for
malicious activity. It delivers detailed security findings for visibility and remediation.

When you activate GuardDuty and configure it to monitor your account, GuardDuty automatically detects
threats by using anomaly detection and machine learning techniques. You can view the security findings that
GuardDuty produces in the GuardDuty console or through Amazon CloudWatch Events.

GuardDuty detects unauthorized and unexpected activity in your AWS environment by analysing and
processing data from different AWS service logs. These logs include the following:
• AWS CloudTrail event logs
• Virtual private cloud (VPC) flow logs
• Domain Name System (DNS) logs

GuardDuty extracts various fields from these logs and uses them for profiling and anomaly detection.
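
A minimal boto3 sketch of retrieving GuardDuty findings for review is shown below, assuming GuardDuty is
already enabled in the account and Region:

    import boto3

    guardduty = boto3.client("guardduty")

    # Each account and Region has at most one GuardDuty detector once the service is enabled.
    detector_id = guardduty.list_detectors()["DetectorIds"][0]

    # List current finding IDs and fetch their details for review or remediation.
    finding_ids = guardduty.list_findings(DetectorId=detector_id)["FindingIds"]
    if finding_ids:
        findings = guardduty.get_findings(DetectorId=detector_id, FindingIds=finding_ids[:10])
        for finding in findings["Findings"]:
            print(finding["Severity"], finding["Type"], finding["Title"])
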
AWS CLOUDTRAIL

CloudTrail is an auditing, compliance monitoring, and governance tool from AWS. It is classified as a
Management and Governance tool in the AWS Management Console.

CloudTrail logs, continuously monitors, and retains account activity related to actions across your AWS
infrastructure, which gives you control over storage, analysis, and remediation actions.

CloudTrail benefits

• It increases your visibility into user and resource activity. With this visibility, you can identify who did what
and when in your AWS account.
• Compliance audits are simplified because activities are automatically recorded and stored in event logs.
Because CloudTrail logs activities, you can search through log data, identify actions that are noncompliant,
accelerate investigations into incidents, and then expedite a response.
• Because you are able to capture a comprehensive history of changes that are made in your account, you can
analyse and troubleshoot operational issues in your account.
• CloudTrail helps discover changes made to an AWS account that have the potential of putting the data or the
account at heightened security risk. At the same time, it expedites AWS audit request fulfilment. This action
helps to simplify auditing requirements, troubleshooting, and compliance.

CloudTrail best practices

• Turn on CloudTrail log file integrity validation.


• Aggregate log files to a single S3 bucket.
• Ensure that CloudTrail is enabled across AWS globally.
• Restrict access to CloudTrail S3 buckets.
• Integrate with Amazon CloudWatch.
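For example, CloudTrail log data can be searched programmatically. The following minimal sketch assumes the boto3 AWS SDK for Python and configured credentials; it looks up recent console sign-in events.

import boto3

cloudtrail = boto3.client("cloudtrail")

# Look up recent console sign-in events that CloudTrail recorded.
response = cloudtrail.lookup_events(
    LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": "ConsoleLogin"}],
    MaxResults=10,
)

for event in response["Events"]:
    # Each event shows who did what and when.
    print(event["EventTime"], event.get("Username", "unknown"), event["EventName"])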
AWS CONFIG

AWS Config is a service used for assessing, auditing, and evaluating the configuration of your AWS resources.

• Provides AWS resource inventory, configuration history, and configuration change notifications
• Provides details on all configuration changes
• Can be used with AWS CloudTrail to gain additional details on a configuration change
• Is useful for the following:
– Compliance auditing
– Security analysis
– Resource change tracking
– Troubleshooting

AWS Config configuration management capabilities

With AWS Config, you can perform the following configuration management tasks:
• Retrieve an inventory of AWS resources.
• Discover new and deleted resources.
• Record configuration changes continuously. You can determine overall compliance against the configurations
that your internal guidelines specify.
• Get notified when configurations change and analyse detailed resource configuration histories.

AWS Config security capabilities


AWS Config helps you meet your security and compliance objectives.

AWS Config:
• Monitors resource usage activity and configurations to detect vulnerabilities
• Continuously evaluates the configuration of resources against the AWS Config rules that you define:
– Security prevention rules
– Compliance rules
•Helps troubleshoot security configuration issues
AWS Config rules
An AWS Config rule represents a desired configuration for a resource and is evaluated against configuration
changes on the resource.

AWS Config managed rules


AWS Config provides predefined rules that are used to evaluate whether your AWS resources comply with
common best practices.

Example security-related managed rules:


- IAM user passwords meet specified requirements.
- S3 buckets are not publicly accessible.
- EC2 instances do not have a public IP address.
- EBS volumes are encrypted.
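Compliance results from AWS Config rules can also be read programmatically. A minimal sketch, assuming the boto3 AWS SDK for Python and configured credentials:

import boto3

config = boto3.client("config")

# Summarize whether resources comply with each AWS Config rule in the account.
for item in config.describe_compliance_by_config_rule()["ComplianceByConfigRules"]:
    rule_name = item["ConfigRuleName"]
    status = item["Compliance"]["ComplianceType"]   # for example, COMPLIANT or NON_COMPLIANT
    print(rule_name + ": " + status)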
RESPONSE

Event response process

Stages of a typical response to any malicious event

• When a security alert is activated, the alert must be verified because false positives can happen, especially
with a system such as an automated intrusion detection system (IDS).
• If the alert is verified, then the event must be investigated. What is the scope of the attack?
• The first step to respond to the attack is to contain infected elements if there are any, such as hosts infected
by a virus. Then, block access to network addresses.
• Notify the departments or the teams that will be affected that they might have limited access to the systems
that they use. Stakeholders might be customers that won’t be able to use a website.
• Recover to get back to business as soon as possible: add security rules, rebuild infected systems, recover
data, and take other appropriate steps.
• Finally, see whether there is a way to strengthen the system to avoid another attack or recover faster. You
can also implement new procedures for the team in case of an attack.

Understanding the business continuity plan (BCP) and disaster recovery plan (DRP)

Two strategies or processes are important:


• BCP: How to run the business in a reduced capacity (The BCP is a preventive and proactive management
tool).
- Lists different disaster scenarios
- Lists actions to keep the business running
- Is not activated during an outage

• DRP: How to recover from an outage or loss and return to a normal situation as quickly as possible.
- Primary goal: Restore business functionality quickly and with minimum impact.
- Security goal: Do not lower the level of controls or safeguards that are in place.
- Follow-on goal: Prevent this threat, exploit, or disaster from happening again.
Disaster recovery: Understanding recovery time objective (RTO) and recovery point objective (RPO)

• Recovery time objective (RTO): The maximum acceptable delay between the interruption of service and
restoration of service. The RTO determines an acceptable length of time for service downtime.
How quickly do you need to recover IT infrastructure to maintain business continuity?

• Recovery point objective (RPO): The maximum acceptable amount of time since the last data recovery point.
The RPO is directly linked to how much data will be lost and how much will be retrieved.
How much data can you lose before the business suffers?

Work recovery time (WRT) involves recovering or restoring data, testing processes, and then making the
system live for production. It corresponds to the time between systems and resource recovery, and the start of
normal processing.
The maximum tolerable downtime (MTD) is the sum of the RTO and the WRT. In other words, MTD = RTO +
WRT.
MTD is the total time that a business can be disrupted after a disaster without causing any unacceptable
consequences from a break in business continuity. Include the MTD value as part of the BCP and DRP.
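A short worked example with hypothetical values illustrates the relationship:

# Hypothetical values, in hours, chosen only to illustrate MTD = RTO + WRT.
rto_hours = 4    # time to recover the IT infrastructure
wrt_hours = 2    # time to restore data, test, and return the system to production

mtd_hours = rto_hours + wrt_hours
print("Maximum tolerable downtime:", mtd_hours, "hours")   # prints 6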

Disaster recovery options


Recovery from an outage typically relies on the availability of a backup or replication solution that you
previously implemented.

• Backup (can be traditional tape storage)


• Replication
– Snapshot-based (Writes only changed data since the last snapshot)
– Continuous (Focuses on resuming normal operations quickly)
• Pilot light
– Minimal version of an environment is always running in the cloud
ANALYSIS

Analysis is the final phase of the security lifecycle. In the analysis phase, you review the cause of security
incidents and analyse current security controls to determine weaknesses. The objective is to improve and
strengthen those controls to better protect your network, facilities, and organization.

General guidelines for analysis

Ensure that each threat yields a better security solution even if no breach occurred.
Have flexibility when considering options to add to the solution.
Maintain a testing environment to test solutions to potential threats.

Types of security tests


You can conduct security testing during the analysis phase. Doing security tests in the analysis phase is useful
in order to mimic what could happen if your system were under attack. Conducting security tests gives you an
opportunity to implement solutions to better prepare against these attacks.

The types of testing include the following:


• External vulnerability assessment – A third party evaluates system vulnerabilities with little knowledge of
the infrastructure and components.
• External penetration test – A third party with little knowledge of the system actively tries to break into the
system in a controlled manner.
• Internal review of applications and platforms – A tester with some or full knowledge of the system validates
the effectiveness of the following for known susceptibilities:
- Controls in place
- Applications and platforms

Root cause analysis (RCA)


RCA is used to identify the origin of security breaches.

Steps to conduct an RCA


1. Describe the issue that happened and what it led to. How did it happen? Where? What are the
consequences?
2. Go back to the baseline situation, and analyse each event leading up to the issue.
3. Analyse events to understand the links between them, and identify which event most likely caused the
issue. This mechanism is called event correlation.
4. Create a visual representation (for example, a diagram or graph) of the sequence of events from the origin
to the final problem.

Risk assessment
Risk is the likelihood of a threat occurring against a particular asset and the possible impact to that asset if the
threat occurs.
A risk assessment helps to identify and rank risk. The steps are as follows:
1. Identify threats
2. Identify vulnerabilities
3. Determine likelihood
4. Determine impact
5. Determine risk
Risk response strategies

Monitoring and logging

• Logs – Provide data that is used to examine IT systems and processes


– Can be both inputs and outputs of monitoring
• Monitor logs for – Changes
– Exceptions
– Other significant events
• Records produced from monitoring become logs for further analysis.

Environment monitoring

A company’s Acceptable Use Policy (AUP) defines how employees or users can be monitored on a company’s
network:
– At work
– Remotely
– On mobile devices

Types of monitoring

AWS monitoring services

AWS provides services for monitoring.


• Amazon CloudWatch monitors resources and applications in the AWS Cloud and on-premises - Monitoring as
a service (MaaS)
• AWS Config records and evaluates configurations of your AWS resources.
• Amazon Managed Service for Prometheus provides highly available, secure, and managed monitoring for
your containers.
• Amazon GuardDuty protects your AWS accounts with intelligent threat detection.
• Amazon Macie is a fully managed data security and data privacy service that uses machine learning and
pattern matching to discover and protect your sensitive data in AWS.
Logging

Logging Policy
Identify which resources and activities in your enterprise must be logged. Capture this information in a logging
policy. Also, define how logs are managed.
It is important to protect log information from unauthorized access and to back up logs regularly. To ensure
that analysis results are correct, keep the clocks on all log servers accurate and synchronized.

Protection of log information

• Keep logs on the original device, a log server, or both.


• Control physical and logical access to a log server.
• Log backup and recovery processes.
• Follow a retention policy.
• Check timestamps.

AWS logging services

• AWS CloudTrail tracks user activity and API usage.


• AWS Config records and evaluates configurations of your AWS resources.
• Amazon Virtual Private Cloud (Amazon VPC) Flow Logs capture information about IP traffic.
AWS TRUSTED ADVISOR

Trusted Advisor is an online resource to help you reduce cost, increase performance, and improve security by
optimizing your AWS environment.

It provides best practices (or checks) in five categories:


1. Cost Optimization – Save money on AWS by reducing unused and idle resources or making commitments to
reserved capacity.
2. Performance – Improve the performance of your service by checking your service limits, ensuring that you
take advantage of provisioned throughput, and monitoring for overutilized instances.
3. Security – Improve the security of your application by closing gaps, activating various AWS security features,
and examining your permissions.
4. Fault Tolerance – Increase the availability and redundancy of your AWS application by taking advantage of
automatic scaling, health checks, multiple Availability Zones, and backup capabilities.
5. Service Limits – Check for service usage that is more than 80 percent of the service limit.

The status of the check is shown by using colour coding on the dashboard page:

Trusted Advisor features


Trusted Advisor provides a suite of features so that you can customize recommendations and proactively
monitor your Amazon Web Services (AWS) resources:

• Trusted Advisor notifications – Stay up to date with your AWS resource deployment. You will receive a
weekly notification email message when you opt in for this service.
• Access management – Control access to specific checks or check categories.
• AWS Support application programming interface (API) – Retrieve and refresh Trusted Advisor results
programmatically.
• Action links – Access items in a Trusted Advisor report from hyperlinks that take you directly to the console.
From the console, you can implement the Trusted Advisor recommendations.
• Recent changes – Track recent changes of check status on the console dashboard. The most recent changes
appear at the top of the list to bring them to your attention.
• Exclude items – Customize the Trusted Advisor report. You can exclude items from the check result if they
are not relevant.
• Refresh all – Refresh individual checks or refresh all the checks at once by choosing Refresh All in the upper-
right corner of the summary dashboard. A check is eligible for refresh 5 minutes after it was last refreshed.
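For example, check results can be retrieved through the AWS Support API. The following minimal sketch assumes the boto3 AWS SDK for Python and a Business or Enterprise Support plan (the Support API is not included in the basic plan):

import boto3

# The AWS Support API is served from the us-east-1 Region.
support = boto3.client("support", region_name="us-east-1")

# List the available Trusted Advisor checks and print their categories and names.
for check in support.describe_trusted_advisor_checks(language="en")["checks"]:
    print(check["category"], "-", check["name"])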
Trusted Advisor security checks
Trusted Advisor provides popular performance and security recommendations to all AWS customers. The
following Trusted Advisor checks are available to all customers at no cost:

1. AWS Identity and Access Management (IAM) use: Checks for the existence of at least one IAM user to
discourage the use of root access
2. Multi-factor authentication (MFA) on root account: Checks the root account and warns you if MFA is not
activated
3. Security groups – Specific ports unrestricted: Checks security groups for rules that allow unrestricted access
(0.0.0.0/0) to specific ports
4. Amazon Simple Storage Service (Amazon S3) bucket permissions: Checks buckets in Amazon S3 that have
open access permissions or that allow access to any authenticated AWS user.
5. Amazon Elastic Block Store (Amazon EBS) public snapshots: Checks the permission settings for your
Amazon EBS volume snapshots and alerts you if any snapshots are marked as public
6. Amazon Relational Database Service (Amazon RDS) public snapshots: Checks the permission settings for
your Amazon RDS database (DB) snapshots and alerts you if any snapshots are marked as public.
SECURITY BEST PRACTICES FOR CREATING AN AWS ACCOUNT

Securing your AWS account


AWS is responsible for the security of the cloud. You are responsible for security in the cloud.

The AWS services that you use will determine your responsibility. In addition, you are responsible for other
factors, including your data's sensitivity, your company's requirements, and applicable laws and regulations.

Understanding when to use the root user


The root user for the AWS account has access permissions to everything (all AWS services and resources).
Therefore, you should protect that access and use it only when necessary. When you first set up your AWS
account, you authenticate your root account with the email address that you want associated with your AWS
account.

The following tasks require that you sign in to AWS with your root user credentials:
• Change your account settings
• Restore AWS Identity and Access Management (IAM) user permissions
• Change your AWS Support plan or cancel your AWS Support plan
• Activate IAM access
• View certain tax invoices
• Close your AWS account
• Register as a seller
• Configure an Amazon Simple Storage Service (Amazon S3) bucket
• Edit or delete an S3 bucket

Security best practices for your AWS account: Stop using root user
AWS recommends that if you have access keys for your account root user, you remove them as soon as
possible. Before you remove the access keys, confirm that they are not being used anywhere in your
applications.

Security best practices for your AWS account: Requiring MFA


Multi-factor authentication (MFA) is an authentication method that requires the user to provide two or more
verification factors to gain access to a resource. For example, this resource might be an application, an online
account, or a virtual private network (VPN). MFA is a core component of a strong IAM policy. Rather than
asking for only a user name and password, MFA requires one or more additional verification factors, which
decreases the likelihood of a successful cyber-attack.

Security best practices for your AWS account: AWS CloudTrail


Activate AWS CloudTrail as soon as possible after opening your AWS account. CloudTrail is a log-monitoring
tool. It tracks API calls through the AWS Management Console, the AWS Command Line Interface (AWS CLI),
your applications, and third-party software. It then publishes the log files to the S3 bucket of your choice.

Security best practices for your AWS account: Billing report

Activate billing reports, such as the AWS Cost and Usage Report:
To receive billing reports, you must have an S3 bucket in your AWS account to receive and store reports.
When you set up the report, you can select an existing S3 bucket or create a new one. Whichever you choose,
limit access to only those who need it.
AWS COMPLIANCE PROGRAM

Regulatory compliance and standards


Security compliance ensures that security controls meet regulatory and contractual requirements.

Regulatory requirements can be the following:


• Country-specific: For example, most governments have regulations that control the allowed storage location
of data related to national security.
• Industry-specific: For example, the financial industry has regulations that govern the processing of credit
card transactions. The health care industry has regulations to protect the privacy of a patient’s medical
records.

Contractual requirements can be in the form of the following:


• Service level agreement (SLA): A contract that defines the quality and performance expectations of the
services provided to a customer
• Project labour agreement (PLA): A contract that defines the source, terms, and conditions of personnel used
on a project

Regulations vary between different localities, jurisdictions, or cultures.

Compliance levels and noncompliance


Compliance levels vary by authority type. Noncompliance has consequences.

• External authority:
- Government or laws – Mandatory compliance
- Open standards – Compliance recommended to participate
- Best practices – Optional compliance

• Consequences of noncompliance:
- Government or laws – Civil, criminal, or financial penalties
- Open standards – Financial penalties or participation denied
- Best practices – Loss of customers, partners, or revenue

National and international cybersecurity standards

Some of the organizations focused on developing cybersecurity standards:


• National Institute of Standards and Technology (NIST)
• European Union Agency for Cybersecurity (ENISA)
• European Telecommunications Standards Institute (ETSI)
• International Organization for Standardization (ISO)
• Internet Engineering Task Force (IETF)
• Institute of Electrical and Electronics Engineers (IEEE)
• Committee of Sponsoring Organizations of the Treadway Commission (COSO)

PCI DSS

• Payment Card Industry (PCI) Data Security Standard (DSS) is an international set of security requirements
intended to maintain a secure environment for payment card transactions.
AWS compliance program
Recall that in the AWS Cloud, security is a shared responsibility between the customer and AWS. AWS is
responsible for the security OF the cloud, and the customer is responsible for the security IN the cloud.
Specifically, AWS handles the security of the physical infrastructure that hosts customer resources, and
customers are responsible for the security of everything that they put in the cloud.

Similarly, compliance is a shared responsibility between you (the customer) and AWS. To aid your compliance
efforts, AWS regularly achieves third-party validation for thousands of global compliance requirements. AWS
continually monitors these requirements to help you meet security and compliance standards for finance,
retail, healthcare, government, and beyond. You inherit the latest security controls operated by AWS,
strengthening your own compliance and certification programs. You also receive access to tools that you can
use to reduce the cost and time to run your own specific security assurance requirements. AWS supports
security standards and compliance certifications, including PCI DSS, HIPAA, and GDPR.

AWS risk and compliance program

The AWS risk and compliance program:


• Provides information about AWS controls.
• Assists customers in documenting their security compliance framework.

The AWS risk and compliance program consists of three components:


• AWS business risk management
• AWS control environment and automation
• AWS certifications and attestations

AWS supports many security standards and compliance certifications. These standards and certifications help
customers satisfy compliance requirements for most regulatory agencies around the world.
AWS SECURITY RESOURCES

Amazon Web Services (AWS) communicates its security and control environment, which is relevant to
customers, in the following ways:
• Doing industry certifications and independent third-party attestations
• Providing information about AWS security and control practices in whitepapers and web content
• Providing certificates, reports, and other documentation directly to AWS customers under a nondisclosure
agreement (NDA)

AWS account teams


AWS account teams serve as a first point of contact to help guide you through your deployment and
implementation. These teams point you toward the right resources to resolve security issues that you might
encounter.

AWS Support plans

• AWS provides basic support plans for all AWS customers. The basic support plan includes the following:
– Customer service and communities
– AWS Trusted Advisor
– AWS Personal Health Dashboard

• If additional support is needed, three tiers of support are available:


– AWS Developer Support plan (for testing within AWS)
– AWS Business Support plan (for production workloads)
– AWS Enterprise Support plan (for business-critical workloads)

AWS Professional Services and AWS Partner Network


The AWS Partner Network (APN) is a group of cloud software and service vendors. This network includes
hundreds of certified APN Partners worldwide who can assist customers with their security and compliance
needs.
AWS Professional Services and APN both help customers develop security policies and procedures based on
well-proven designs. They help to ensure that the customer’s security design meets internal and external
compliance requirements.

AWS advisories and bulletins


AWS provides advisories around current vulnerabilities and threats. Customers can work with AWS security
experts to address concerns such as reporting abuse, reporting vulnerabilities, and conducting penetration
tests.
AWS security benefits

• What makes AWS security different?


– AWS is the only commercial cloud service that has its services vetted and approved for top-secret
workloads.

• What are some benefits?


– With AWS security, you can securely scale your infrastructure.
– With AWS security, you can automate security tasks.
– AWS security includes integrated security services.
– A large infrastructure and environment is prebuilt for the customer.
– AWS security is strategic and focuses on preventing, detecting, responding to, and remediating security issues.
AWS re/start

Python
Programming
INTRODUCTION TO PROGRAMMING

Automation
Automation refers to any technology that removes human interaction from a system, equipment, or process.
Scripts are often written to automate labour-intensive tasks and streamline workflows.

How is software written?


Software is written into a text file by using a programming language, which is either interpreted or compiled
when it is run.

Some text editors include features that help programmers write code.
Examples: – Microsoft Visual Studio Code
– Sublime Text
– Vi or Vim
– nano
– GNU emacs
– Notepad++
– TextEdit

Software is written by using a computer language

Each language has its own: – Grammar and syntax


– Common uses
– Community

Examples: – Python
– JavaScript
– C#
– C/C++

Integrated development environment (IDE)


Compilers and interpreters

Compilers and interpreters take the high-level language that you are developing in, and turn it into low-level
machine code.
Compilers do this process all at one time after changes are made, but before it runs the code.
Examples: C/C++, Basic, GoLang.
Interpreters do this process one step at a time while the code is running.
Examples: Python, Ruby, JavaScript.

Categorize a value as a data type


A data type is the classification of a value that tells the computer how the programmer intends the data to be
interpreted.
Data is typed so that the interpreter or compiler knows whether it is a string, integer, Boolean, or other data
type.

Data Value – Data Type

45 – Integer
290578L – Long
1.02 – Float
True – Boolean
"My dog is on the bed" – String
"45" – String

Why must the type of data be tagged?

In memory, everything consists of 0s and 1s.


Data typing tells the computer how to: – Encode a data value into memory
– Decode a data value out of memory

What is a variable?
A variable is an identifier in your code that represents a value in memory.
The variable name helps humans to remember what the value means.
Note: Assignment operator is an equal sign (=).
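For example (the variable names are illustrative only):

# The = operator assigns the value on the right to the variable name on the left.
instance_count = 3          # an integer value
region_name = "eu-west-2"   # a string value
print(instance_count, region_name)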

Combine values into composite data types

Primitive data type: Data types that are built into a coding language with no modification.
Composite data type: Combines multiple data types into a single unit.

Functions
Functions are collections of instructions that can be called repeatedly in a program.
- Functions do something useful.
- Functions can return a value (to be stored in a variable).
- Functions can return a value based on input values.
- Functions can accept values as input.
- Functions can accept many values as input.

Or, a developer can create a composite data type.


- Composite data types can be returned.
- Composite data types can be in an array.
Follow the execution path of a program

What does execution path mean?

• The sequence of steps that the program performs when it runs


• The program might ... – Come to an either-or choice
– Come to multiple choices
– Perform work on each item in a loop

•The programmer must be able to predict what those steps will be...
– When they write the code initially
– When they debug problems that they encounter

Version control
A version control system is software that tracks versions of your code and documents as you update them.
Version control can be done locally on your computer or by using a website that is dedicated to saving these
versions.
Collaboration is doing version control, but in the cloud or on a dedicated website so that multiple people can
work on a project.

Advantages
- Ease of access to project changes
- Error tracking
- Security

Utilizing cloud infrastructure


The cloud, or a dedicated website, is useful for storing changes in code.
Version control that is only on a local computer can be easily lost, even if it is more secure than saving multiple
versions of a file.
Data that is stored in the cloud has a reduced risk of being lost.

Version control tools


Several version control tools are available: • Git and GitHub
• GNU arch
• Mercurial
INTRODUCTION TO PYTHON

Python is a free, easy-to-learn, general-purpose programming language. It has a simpler syntax compared to
other programming languages. Python is an interpreted language.

Why Python?
- The interpreter enables fast exploratory programming.
- Dynamic typing makes it easy to write quick scripts.
- Python syntax is simple when it is compared to other languages.
- Python can support object-oriented, structured, and functional programming styles.

Another reason to use Python is that it works across platforms. It works on macOS, Linux, Microsoft Windows,
and other platforms.

Where can you write Python?


Python can be written in any text editor if you have the interpreter also installed. Many developers use special
programs that are called integrated development environments (IDEs). IDEs help with finding syntax and
exception errors.
Example programs: Python(x,y), AWS Cloud9, Microsoft Visual Studio Code, PyCharm, Vim.

Integrated development environment

The following core capabilities are associated with IDEs:


• Syntax highlighting –Highlights keywords within the programming language
• Code completion –Similar to automatic completion for natural languages on modern cellular phones
• Debugging –Enables line-by-line inspection of the code while it is running with breakpoint capabilities
• Version control –Integrates popular version control systems, such as git and subversion

AWS Cloud9: Cloud-based IDE


AWS Cloud9 is a cloud-based IDE that lets you write, run, and debug your code with a browser. It
combines the rich code editing features of an IDE with access to a full Linux server for running and
storing code. Some of the editing features include code completion, hinting, and step-through
debugging.

AWS Cloud9 provides a few key benefits: • Start projects quickly and code with only a web browser.
• Code together in real time.
• Build serverless applications with ease.

AWS Lambda
• Upload your code to AWS Lambda.
• Set up your code to trigger from an event, such as a user who is visiting your webpage.
• Lambda runs your code only when it is triggered, and it uses only the compute resources that are
needed.
• You pay only for the compute time that you use.
• Multiple languages are supported.
• AWS Cloud9 is included in the Lambda interface, so you can share code with developers.
Other tools: Shell scripting
Shell scripting commands are run directly from the command line of an operating system. They are available
on any machine and on any operating system without the need to install new software. Different
environments require different syntax or types of shell scripting, such as Bash and Zshell.

Shell scripting versus Python

• Shell scripting can be a powerful tool for system administration and command-line work, but it can be
challenging when you want to use more complicated data structures.
• For example: Python can perform some actions—such as creating an HTTP server—in a single line. However,
it could require many lines of code to do the same action in Bash.
• Python has many external libraries and resources. It is a complete programming language.
PYTHON BASICS

• Python can be installed on Microsoft Windows, macOS, and Linux computers.


• Python files have the .py extension.

Python Syntax Basics


Python uses indentation and spacing to group blocks of code together.
• If you get runtime errors, check your spacing first.
• Next, check your indentation for any missing punctuation, such as colons.
Python is also case sensitive. Capitalization matters.

Identifiers
In Python, an identifier is the name for an entity such as a class, function, or variable. It helps differentiate one entity
from another entity.
When you name objects in Python, you must observe some rules:
- An identifier cannot start with a digit. 1variable is invalid, but variable1 is valid.
- Keywords cannot be used as identifiers.
- You cannot use special symbols like !, @, #, $, and % in your identifiers.
- Identifiers can be any length.

Functions

• A function tells the computer to do a specific task.


• Functions have a name and are invoked by adding parentheses after the name.
• Many functions take arguments, which are information that the function uses to do its specific task.
• One useful function is the print function.

Comments

• Comments are notes to you and to other developers.


• Comments describe the contents of a program and how it works so that a person who reviews the source
code can understand it.
• Use the pound (#) symbol to start writing a comment.
• By starting a line of text with #, you tell the interpreter not to run this line as code.

Data types
Data types determine what kind of data is stored in an object.
Python stores the type of an object with the object. When the operation is performed, it checks whether that
operation makes sense for that object. (This technique is called dynamic typing.)
Basic Data types

Data types: Mutable versus immutable


Mutable means changeable. In Python, some data types are mutable, and others are immutable.

Converting Data types


Python has specific commands that you can use to change an object from one data type to another.
• float(): In the example, 4 is assigned to the variable x. By its nature, 4 is an integer. However, x must be
identified as a float, so the float() command was used. The result is 4.0.
• int(): By passing x back into the int(x) command, it receives 4 back. This command truncates the decimals.

Note: Using int()is not the same as rounding. It only removes any decimal digits that the number has. It does
not round the decimal numbers up or down, so be careful.
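For example:

x = 4            # an integer
x = float(x)     # x is now 4.0
print(x)

y = int(3.9)     # int() truncates the decimals; the result is 3, not 4
print(y)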
Strings

• Strings contain text, which can be a single character or paragraphs of text.


• Strings are characters that are enclosed between quotation marks.
• Strings can contain numbers.
• Three ways of notating a string are: – Single quotation marks (' ')
– Double quotation marks (" ")
– Triple quotation marks (''' ''')

The triple quote notation allows a string to span multiple lines and include non-printable formatting characters
such as Newline and Tab.

String concatenation

• Strings are immutable. When you manipulate them, you create a new string.
• You can concatenate or add strings together, which creates a new string.
• It is possible to create an empty string. – Example: x = ""
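For example:

first = "Cloud"
second = "Computing"
combined = first + " " + second   # concatenation creates a new string
print(combined)                   # prints: Cloud Computing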

Variables

• When you do something with variables, Python looks up the values to understand what to do.
• You can have as many variables as you want.
• You can name your variables whatever you want, within certain restrictions.
• Restrictions are: – A variable name can only contain letters, numbers, and underscores ( _ ).
– A variable name cannot begin with a number.
– A variable name cannot be a keyword in Python.

Operators
Operators in Python are used for math, equivalence tests, and string operations.
Operator precedence

Operator – Description

(expressions...), [expressions...], {key: value...}, {expressions...} – Parenthesized expression, list, dictionary, set
** – Exponentiation
+, -, ~ – Positive, negative, bitwise NOT (unary operators)
*, /, %, // – Multiplication, division, remainder, floor division
+, - – Addition, subtraction
<<, >> – Left and right bitwise shift
in, not in, is, is not, <, <=, >, >=, !=, == – Membership, identity, and comparison operators
and, or, not – Logical (Boolean) operators
=, +=, -=, *=, /=, %=, //=, **=, &=, |=, ^=, >>=, <<= – Assignment operators

Statements

• A statement is usually a line of code, and each line of code that you have seen so far is a single statement.
• A statement is an individual instruction to Python.

Exceptions

• An exception raises an error.


• Examples include: – Referencing a variable by the incorrect name
– Dividing a number by zero
– Trying to read a file that does not exist on your computer
• Exceptions result in a stack trace, which is a listing of the various ways that something went wrong.

Tuple
Tuples are used to store multiple items in a single variable.
A tuple is one of four built-in data types in Python that are used to store collections of data. The other three are
list, set, and dictionary, all with different qualities and usage.
A tuple is a collection which is ordered and unchangeable.
Tuples are written with round brackets.

thistuple = ("apple", "banana", "cherry")


print(thistuple)
FLOW CONTROL

An if conditional statement evaluates a condition and, if true, it runs a block of code.


– elif and else statements follow if statements.

Conditionals in code

Some things to note: • The colons after the conditionals are important.
• You can also use the print function.
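A short example of the syntax (the values are illustrative only):

temperature = 35

if temperature > 30:
    print("It is hot")        # runs only if the first condition is true
elif temperature > 15:
    print("It is warm")       # runs only if the first condition is false and this one is true
else:
    print("It is cold")       # runs if no earlier condition is true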

Loops

• Loops are a technique that tells Python to run a block of code repeatedly.
• Python runs loops for a certain number of times, or while a certain condition is true.
• The two types of Python loops are called for and while.

While loops

• While loops can run indefinitely, so you must include the condition for the loop to stop. Otherwise, the code
creates an infinite loop.
• It is common to use an iterative counter with loops. As the loop completes, the counter increases (or
decreases). When the counter reaches a specific number, the loop stops.
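For example, a while loop with an iterative counter:

counter = 0

# The loop stops when the condition becomes false.
while counter < 5:
    print("Counter is", counter)
    counter += 1   # without this line, the loop would run forever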

Lists

• Lists are a mutable data type. Lists can contain multiple data types (strings, ints, floats, and even other lists).
• Lists are denoted with brackets ([ ]) on each end.
• Values are enclosed in brackets, and they are separated with commas.
• Any number of items can be in a list—even zero (no) items.
For loops

• A for loop reads: for each element in <thing>, do a certain task.

Loops and Lists

• For loops and lists work well together.


• For every item in this list, Python prints that item.
– Every time it calls num, it assigns a value from the list (1, then 2, . . . ) to num.
– The loop then prints the value.
– After it goes through the entire list of values, the loop stops.
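For example:

numbers = [1, 2, 3, 4]

# For each element in the list, assign it to num and print it.
for num in numbers:
    print(num)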

Dictionaries

• Dictionaries contain immutable keys, which are associated to their values. Keys must be immutable data
types.
• Dictionaries can be nested inside each other.
• To create an empty dictionary, use a pair of braces with nothing inside: {}
• Keys are separated from their values with a colon: {"Key":"Value"}
• Retrieve a value in the dictionary by its key: myDict.get("key") or myDict["key"]
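For example (the keys and values are illustrative only):

server = {"name": "web01", "cpu_cores": 4, "running": True}

print(server["name"])            # access a value by its key
print(server.get("cpu_cores"))   # access a value with get()

server["running"] = False        # values can be changed; the keys themselves must stay immutable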

Input

• The input() function asks the user to enter text and saves the result to a variable.
• One optional argument is a prompt for the user, as a string.
FUNCTIONS

In Python, a function is a named sequence of statements that belong together. Their primary purpose is to
help organize programs into chunks that match how you think about the solution to the problem.

First, define the function. Name it and put placeholders for arguments. Indent lines of code inside the function,
like code in loops.

Example:

def <function name>(argument):
    <things to do>

• Functions are used when you must perform the same task multiple times in a program.
• Functions are called by name and the function call often includes arguments that the function code needs for
processing.
• Python includes many built-in functions, such as print and help.

Types of functions

Function arguments enable developers to pass values to a function. For example, a function that is called
setColor could include a string for the color that is being passed. Such a call would appear as setColor(“red”).

Functions enable developers to use the same code many times without retyping the statements.
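A minimal sketch of defining and calling functions (the names are illustrative only):

def set_color(color):
    # Do a specific task with the argument that was passed in.
    print("The color is now " + color)

set_color("red")      # call the function by name and pass an argument


def add(a, b):
    return a + b      # functions can return a value

total = add(2, 3)     # total now holds 5
print(total)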
MODULES AND LIBRARIES

What are modules and libraries?


Functions are blocks of code that do specific tasks.
Functions can be put into modules. Modules are separate Python files that can be imported into other Python
applications. Importing modules makes it possible to reuse code.
Libraries are collections of modules. By putting modules into libraries, it is possible to import a large amount of
programming capability quickly.

Standard library
The Python standard library is a collection of script modules that a Python program can access. It simplifies the
programming process and reduces the need to rewrite commonly used commands.
Python libraries can also consist of modules that are written in C.

Navigating the Python standard library


The Python standard library is a collection of modules that make programming easier by removing the need to
rewrite commonly used commands
The library contains built-in modules to access such functionality as time (time), getting system information
(sys), querying the operating system (os), and many more.
Using the standard library where possible makes code easier to maintain and port to other platforms.

Why import a module?


Module types

Modules can be:

1. Created by you. You might need Python to complete a specific set of tasks or functions that are grouped
together. If so, it can be easier to bind them together into a module.
2. External sources (created by others).
3. Pre-packaged with Python and part of the standard library (examples: math, time, and random).

Importing modules
Standard library modules are imported by using the import command. You can also import specific functions
or constants from a module by using the from command.

Examples include:
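import math                   # import a whole standard library module
from random import randint    # import a single function from a module

print(math.sqrt(16))          # 4.0
print(randint(1, 6))          # a random integer between 1 and 6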

You must import the module or the function before you can use it, even if the module is part of the standard
library.

Creating modules

To make your own module:


• Create a file name with a .py extension (for instance, mymodule.py).
• Add your code to define some functions.
• You can now import mymodule in other Python files and use the code defined in the module.

File handlers
File handlers enable Python to read and write to files.
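For example:

# Write two lines to a file, then read them back.
with open("notes.txt", "w") as f:
    f.write("first line\n")
    f.write("second line\n")

with open("notes.txt", "r") as f:
    for line in f:
        print(line.strip())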
Exception handling

• Exception handling is another form of flow control.


• When an error occurs, instead of failing and quitting the program, you can use a try/except block.
• You must specify the exception (or exceptions) that you expect might occur.

Examples: Exception handling
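For example:

try:
    result = 10 / 0
except ZeroDivisionError:
    # The program keeps running instead of stopping with a stack trace.
    print("You cannot divide by zero.")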

OS: Operating system module

• OS is part of the Python standard library.


• The OS module provides operating system functionality. Its functions accept generally the same inputs, but
their output depends on the underlying operating system.
• Common capabilities in the OS module are environment variable information, file manipulation, directory
traversal, and process management.
• Programs that import and use the OS module are generally more portable between different platforms.

OS capabilities

• Host operating system:


getlogin – Returns the name of the logged in user
getgrouplist – Returns a list of group IDs that a user belongs to
getenv – Returns the value of the environment variable that is passed to it
uname – Returns information to identify the current OS
system – Is used to run commands in a subshell of the system

• Common functions for files:


chown – Changes the ownership of a file
chmod – Changes the access permissions of a file
remove – Removes the file at the given path

• Common functions in the os module for directories:


getcwd – Gets the current working directory
listdir – Lists the contents of the current directory
mkdir – Creates a new directory
JSON

• JSON stands for JavaScript Object Notation.


• JSON is a standard file format that transmits data objects. It is language independent. It was originally
derived from JavaScript, but now most modern languages include the ability to generate and parse JSON
(including Python).
• JSON is used to serialize objects, which means that it turns code into a string that can be transmitted over a
channel.

• Four useful functions in the JSON module are: dump, dumps, load, loads
dump and dumps: Turn various kinds of structured data into a string, which can be written to a file.
load and loads: Turn a string back into structured data.
dump and load work directly with files.
dumps and loads work with strings.
–The s at the end of the name is for string.
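For example (illustrative data only):

import json

instance = {"name": "web01", "cpu_cores": 4, "running": True}

text = json.dumps(instance)    # serialize the dictionary to a JSON string
print(text)

data = json.loads(text)        # parse the string back into structured data
print(data["name"])

# dump() and load() do the same job but read from and write to file objects.
with open("instance.json", "w") as f:
    json.dump(instance, f)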

Uses for JSON

What is pip?
pip is the package manager for Python, and it is similar to apt in Linux.

• It is used to install third-party packages. A package holds one or more Python modules that you can use in
your code.
• It is installed along with Python.
It is not called from within Python—pip is called from the command line, like Python itself is.
PYTHON FOR SYSTEM ADMINISTRATION

• System administration is the management of software and hardware systems. It is also known as SysAdmin.
• System administration helps to ensure increased efficiency, quick identification and resolution of problems,
and system stability.
• It includes these common tasks:
– Installation of new hardware or software
– Creating and managing user accounts
– Maintaining computer systems, such as servers and databases
– Performing backups, archiving log files, and automating repetitive tasks
– Planning and responding to system outages and various other problems

A better os.system(): subprocess.run()

In Python 3, the subprocess module is the recommended replacement for calls to os.system(). (The os module
itself is not deprecated; only using os.system() to run commands is discouraged.)
Python can improve system administration by running code that makes complex decisions, and then calling
subprocess.run() to manage the system.

The equivalent function to os.system() is subprocess.run().

os.system() versus subprocess.run()

os.system():
• It runs in a subshell, which is usually Bash on Linux.
• The shell takes the given string and interprets the escape characters.
– Example: os.system("python --version")

subprocess.run():
• By default, it does not use a shell. Instead, it tries to run a program with the given string as a name.
• You must pass in a list to run a command with arguments.
– Example: subprocess.run(["python", "--version"])

Why is subprocess.run() better than os.system()?

subprocess.run() is better than os.system() for the following reasons:

• Safety – Developers often pass an input string to os.system() without checking the actual commands. This
practice can be dangerous. For example, a malicious user could pass in a string to delete your files.
• Separate process – subprocess.run() is implemented by a class that is called Popen, which is run as a separate
process.
• Additional functionality – Because subprocess.run() is really the Popen class, it has useful methods such as
poll(), wait(), and terminate().
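A minimal sketch (Python 3.7 or later for the capture_output argument):

import subprocess

# Run a command with its arguments passed as a list; no shell is involved.
result = subprocess.run(["python", "--version"], capture_output=True, text=True)

print(result.returncode)                 # 0 means the command succeeded
print(result.stdout or result.stderr)    # the version string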
DEBUGGING AND TESTING

Debugging

PDB: The Python debugger

The Python debugger is activated by entering: python -m pdb <filename>


The first line of the code runs, and then a prompt (pdb) appears.
You can now use several commands.
All of these tools are used to find errors and help developers identify the reason behind them.

Types of debugging (Static & Dynamic Analysis)

Static: • It can be done continuously in the development process


• The Python interpreter includes a level of static analysis because it reports syntax and
semantic errors. Integrated development environments (IDEs) can help identify issues
while you write the code.
• It might consider factors such as proper nesting, function calls, and code complexity.

Advantages: • You can identify the exact location of code issues.


• It has a faster turnaround time for fixes.
• Later tests have fewer issues.
• Earlier detection of bugs reduces costs.
• Better for smaller programs (logic errors can be quickly identified in small code bases).

Disadvantages: • Manual analysis is time consuming.


• Automation tools can produce false positives and false negatives.
• Automation might result in taking security for granted.
• Automation is only as good as the parameters that are used to set up the tool.

Dynamic: • It is done by analysing running applications.


• Most IDEs for Python include a "debugging" mode. In this mode, the application runs and a
developer can execute code line by line, until a particular line of code is called, or until a
variable has a value (equal, less than, or greater than).
• An invaluable technique during dynamic analysis is to write out values and conditions
happening in a running application to a log file.
Advantages: • It identifies issues in a runtime environment (it could be a test server or a live version of
the software).
• You can analyse applications even when you cannot observe the actual code.
• It can prove or confirm the false negatives that you identified from static analysis.
• It can be used with every application.

Disadvantages: • Automation might result in taking security for granted.


• You cannot guarantee full coverage.
• It can be more difficult to isolate code issues.

Assertions

• Assertions are conditions, such as if statements, that check the values in the application.
• Dynamic analysis uses assertion statements during runtime to raise errors when certain conditions occur.
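For example (an illustrative function, not part of the course material):

def apply_discount(price, discount):
    # Stop immediately if the value makes no sense for this application.
    assert 0 <= discount <= 1, "discount must be between 0 and 1"
    return price * (1 - discount)

print(apply_discount(100, 0.2))    # 80.0
# apply_discount(100, 5) would raise an AssertionError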

Log monitoring

Developers write logging code that outputs to a text file, often referred to as a log file. As certain conditions happen in the
running application, the logging code writes information to the log file.

Log monitoring: • Keeps track of errors in a running program


• Keeps records of the last time the program was run (to see what went wrong)

With log monitoring, you get a full view of the application as it is running. The application can be exercised by a
user, and the developer can inspect the log file for a "real world" use of the application.

What to log?

Consider every significant event in your application: •Where did it occur?


•What time did it happen?
•What were the arguments?
•What is the state of important resources?

Capture all information when an error occurs: •All arguments


•Exception, plus inner exceptions
•Traceback object: Stack traces

Log monitoring tools

In Python, the default, native log monitoring tool is the logging library.
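A minimal sketch of the logging library:

import logging

# Write log records to a file, including a timestamp and a severity level.
logging.basicConfig(filename="app.log", level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")

logging.info("Application started")
try:
    result = 10 / 0
except ZeroDivisionError:
    logging.exception("Division failed")   # records the error together with the traceback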
Software Testing

Unit tests

A unit is the smallest testable part of any software. It is the most basic level of testing. A unit usually has only
one or a few inputs with a single output.
An example of a unit test is verifying each individual function of a program. Developers are responsible for their
own unit testing.
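For example, a unit test written with the standard library's unittest module (an illustrative sketch; pytest is another common choice):

import unittest

def add(a, b):
    return a + b

class TestAdd(unittest.TestCase):
    def test_add_two_numbers(self):
        # The smallest testable unit: one function, one input, one expected output.
        self.assertEqual(add(2, 3), 5)

if __name__ == "__main__":
    unittest.main()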

Integration tests

Individual units are combined and tested as a group. You test the interaction between the different parts of
the software so that you can identify issues.
Analogy: When a pen is manufactured, all the pieces of the pen (units) are produced separately—the cap, the
body, the ink cartridge, and so on. All components are tested individually (unit testing). When more than one
unit is ready, you can test them to see how they interact with each other.
A developer can perform integration tests, but dedicated testers are frequently used to do these tests.

System tests

A complete and integrated application is tested. This level determines whether the software meets specific
requirements.
Analogy, continued: When the pen is assembled, testing is performed to see if the pen works. Does it write in
the correct colour? Is it the specified size?

Acceptance testing

Acceptance testing is formalized testing that considers user needs, business needs, and whether the software
is acceptable for delivery to the final user.
DEVOPS AND CONTINUOUS INTEGRATION

DevOps
DevOps is a software engineering culture and practice that aims to unify
software development (Dev) and software operation (Ops).

The main characteristic of the DevOps movement is to advocate for automation and monitoring at all steps of
software construction. These steps range from integration, testing, and releasing to deployment and
infrastructure management.

Goals of DevOps

• DevOps is meant to bridge the gaps between traditional IT, software development, and quality assurance
(QA).
– The most difficult part for beginners is the QA part. The appearance of your code is important.

• DevOps is meant to be faster and more flexible.


– The challenge is better integration of QA and security in these quicker, shorter cycles

• DevOps is meant to bridge or reduce the need for specialized individual work.
– When you begin to develop, you might notice that it is easy to become immersed in your own work.
DevOps is meant to make it easier to do development work in a team.

Stakeholders: Organizational culture types


Continuous integration and continuous delivery (CI/CD)

Automation
The goal of automation is creative efficiency. However, automation has several risks that can undermine this
goal:

Automation: Risks

Over-automation: happens when you automate steps in the development process so that it reduces creativity.
If you must think about and consider specific steps in a different way each time that you do them, you
probably should not automate them—for example, analysing, planning, and designing.

Under-automation: occurs when you avoid automation to make sure that things are handled correctly, or
because it is helpful to find exactly where code stops working. Processes that are good to automate include
building, testing, and deploying.

Bad automation: happens when you automate a process that does not work well. Bad automation can be
fixed by revisiting the planning stage of development.

Tools for DevOps: Automation

Continuous integration (CI)


CI is the automation of making your code available to your teammates. It generally includes build automation
and quality assurance automation.

CI has two main purposes:


Continuous delivery (CD)
CD is the extension of CI. CD includes a test automation for all code that is submitted.
Its purpose is to ensure that the code works –
– In the way that it was intended
– In a way that makes sense.

CD also ensures that at any point in the development process, a working version of the code can be produced
immediately. – This part is the deployment automation.

With CI/CD, development teams can make code changes to the main branch, while ensuring that they do not
affect any changes that other developers make.
CONFIGURATION MANAGEMENT

Project infrastructure

Project infrastructure is the way that a project is organized. An architect organizes the infrastructure of a
bridge. Software developers organize the infrastructure of code.
Project infrastructure is a critical discipline for helping to ensure that projects reach their goals. It also includes
ensuring that Python code is styled properly and that it functions as expected.

Code organization

Tools and templates

For style: Utilities like pylint can be run to ensure that code blocks are indented correctly, and to fix the code
blocks that are not formatted well.

For logic: Utilities like pytest can be used to run tests to make sure that code changes still meet the
requirements.

Software configuration management

• Tracks versions of the code as it is developed


• Enables developers to work independently on different parts of a project, and then merge changes back into
the project
• Version control software (such as Git) tracks what code changed and who made the changes
• When errors occur, configuration management enables fast rollbacks to previous, functioning versions
How does configuration management work?

• Developers check out code from a repository like AWS CodeCommit or GitHub.
• When they are finished with the code, developers upload their changes to the repository.
• When the new code passes all tests, it can be merged back into the main project.
•The process of checking code in and out can be done by:
– Running Git from the command line
– Using tools that are built into integrated development environments (IDEs), such as PyCharm
•Running tools — such as pylint and pytest — can also be a part of the check-in process.

Configuration management versioning

• As developers update the code, the release managers who are responsible for distributing new versions of
software can monitor the changes.
• After all tests and functionality are verified, the release manager creates a new distribution of the software
based on the contents of the repository.
• Release managers can now version the software, which helps manage customer issues when problems occur.
(This point is where rolling back to a previous version becomes important.)
•In most cases, versioning takes the form of a numeric value (for example: Version 3.2.1).
AWS re/start

Databases
INTRODUCTION TO DATABASES

Data and databases

What is data?
• Data is raw bits and pieces of information.
– Images, words, and phone numbers are examples of data.

What is a database?
• A database is a collection of data that is organized into files that are called tables.
– Tables are a logical way of accessing, managing, and updating data.

Data models
• A data model represents the logical structure of the data that is stored in a database.
• Data models are used to determine how data can be stored and organized.
• The following items are examples of data models: – Relational
– Semi-structured
– Entity relationship
– Object-based

Schema
• A database schema defines the organization of a database.
• It is based on the data model.
• It describes the elements of a database’s design: – Tables
– Columns
– Relationships
– Constraints

Relational databases
• A relational database is a collection of data items that have predefined relationships between them.
• A relational database is often referred to as a structured query language (SQL) database.
• The database requires a fixed definition of the structure of the data.
• The data is stored in tables with rows and columns.

Main reasons to use a relational database:


• Natively supports SQL
•Provides data integrity
• Supports transactions

Use cases:
• Ecommerce
• Customer relationship management (CRM): Managing interactions with customers
• Business intelligence (BI) tools: Finance reporting and data analysis

Example relational databases: • MySQL


• Amazon Aurora
• PostgreSQL
• Microsoft SQL Server
• Oracle
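As an illustration of how relational data is stored in tables with rows and columns (not part of the course material), the following minimal sketch uses Python's built-in sqlite3 module:

import sqlite3

# Create an in-memory relational database with one table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ana')")
conn.execute("INSERT INTO customers (id, name) VALUES (2, 'Joe')")

# SQL queries retrieve rows from the table.
for row in conn.execute("SELECT id, name FROM customers"):
    print(row)

conn.close()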
Nonrelational databases
• A nonrelational database is a database that does not follow the relational model.
• A nonrelational database does not require a fixed definition of the structure of the data.
• A nonrelational database, which is often referred to as a NoSQL database, does not use a table structure to
store data; instead, it stores data in formats such as JSON or XML.
• These types of databases are optimized specifically for applications that require large data volumes, low
latency, and flexible data models.

Use cases:
• Fraud detection
• Internet of Things (IoT)
• Social networks

Example NoSQL databases: • Amazon DynamoDB


• MongoDB
• Apache HBase

Pros and cons of relational and nonrelational databases

Relational databases (SQL)

Pros:
• Known and reliable technology
• Complex queries are simple to write
• Well-known SQL language
• Well-supported transactions

Cons:
• Use vertical scaling
• Include a fixed schema

Nonrelational databases (NoSQL)

Pros:
• Flexible schema
• Good fit for storing and fast retrieval of massive amounts of data of different types
• Horizontal scaling
• Good fit for hierarchical data

Cons:
• Are a relatively new technology
• Do not guarantee data integrity
• Are not a good fit for complex queries or transactional applications

DBMS
• A DBMS is software or database as a service (DBaaS) that provides database functionality.

• It is used mainly for the following:
– Creating databases
– Inserting data into a database
– Storing, retrieving, updating, or deleting data in a database

• The primary benefit of a DBaaS is to avoid the cost of installing and maintaining servers.
The following are two variations of DBMSs:

• Single-user DBMS applications, such as Microsoft Access
• Multiple-user DBMS applications, such as Oracle Database, Microsoft SQL Server, MySQL, and IBM Db2

Locations

DBaaS

A few key points about cloud-based databases


• Hosted by third-party providers:
– These database servers are hosted in third-party data centres and accessed over the internet (the cloud) instead of being hosted on local networks.

• Reduced cost:
– These databases reduce the cost of installing and maintaining servers.

• Fully managed:
– For example, with managed AWS databases, you don’t need to manage database management tasks, such as server provisioning, patching, setup, configuration, backups, or recovery.

• Faster:
– With these databases, you can use companies, such as AWS, that offer large amounts of storage and processing power in their data centres.

DBaaS examples

Amazon Relational Database Service (Amazon RDS)


Amazon RDS manages common relational database administrative tasks.

Amazon Aurora
A part of Amazon RDS, Aurora is a fully managed relational database engine.

Amazon DynamoDB
DynamoDB is a fully managed NoSQL database service.
DATA INTERACTION AND DATABASE TRANSACTION

Roles interacting with relational databases

• Application developer
– Creates applications that populate and manipulate the data within a database according to the application’s
functional requirements
• End user
– Uses reports that are created from the information within the database
– Typically accesses a database through a client-server application or a three-tier web application
• Data analyst
– Collects, cleans, and interprets data within a database system
• Database administrator
– Designs, implements, administers, and monitors data in database systems
– Ensures consistency, quality, and security of the database

Data interaction models

Types of database interaction:
• Client-server
• Three-tier web application

Client-server interaction model:

1. Users use computers and devices that run client applications, which use SQL to request data.
2. The applications use SQL that is sent to the server over a network to communicate with the
database.
3. The server runs a database management system, which receives the requests, processes the
SQL, and returns the response.

Three-tier web application interaction:

1. The user uses a client computer or device that runs a web browser. A webpage
that is running in the web browser captures the user’s input and sends a request to
the web server.
2. The web server gathers the information on the webpage and forwards the
request to the application server for processing.
3. A web application component that is running on the application server receives
the request. It contains the SQL commands to access the database to satisfy the
request. The component sends the commands to the database server.
4. The DBMS that runs on the database server receives and processes the SQL
commands. The DBMS returns the results to the application server.
5. The web application component on the application server processes the results
and returns them to the web server.
6. The web server formats the results into a webpage.
7. The web browser on the client device displays the webpage that contains the SQL
results to the user.
Embedded SQL in application code

• In both interaction models, an application contains the SQL commands that the user requires.
• An application developer embeds SQL statements in the application code so that the application can perform
database tasks.
• The application is installed on a user computer or an application server.

Transactions in databases
A transaction is a collection of changes made to a database that must be performed as a unit.
A transaction is also called a logical unit of work. In other words, either all of its operations succeed, and the
transaction succeeds, or if one or more operations fail, then the entire transaction fails.
At the database level, either all the database changes related to the transaction are performed, or no change is
made to the database at all. It’s an all or nothing modification.
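As an illustration, the following minimal sketch shows a transaction in SQL (the accounts table, its columns, and the values are assumed for illustration). Both UPDATE statements take effect together when COMMIT runs; issuing ROLLBACK instead would undo both.

START TRANSACTION;
-- illustrative table and column names
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
COMMIT;   -- or ROLLBACK; to undo every change made in the transaction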

Status of a transaction

• Active state: In the initial state of every transaction and when the transaction is being run, the status is
active.
• Partially committed: A transaction is in a partially committed state when it is completing its final operation.
• Failed state: A transaction is in a failed state when any checks made by the database recovery system fail.
• Aborted state: An aborted transaction occurs if the transaction is in a failed state, and the database rolls
back to its original state before running the transaction.
• Committed state: When all of the operations within a transaction have been successfully performed, the
transaction is considered committed.

Transactions use cases

You can use transactions to do the following:

• Run a set of operations so that the database never contains the result of partial operations.
– If one operation fails, the database is restored to its original state.
– If no errors occur, the full set of statements changes the database.

• Provide isolation between programs that access a database simultaneously.


– If this isolation does not happen, the outcomes could prove to be incorrect.
Without transactions, if two or more requests attempt to change the same data in a database simultaneously,
the order and effect of the changes are unpredictable. As a result, the database might end up in a corrupted
state. Transactions provide a mechanism, called isolation, which ensures that the simultaneous change
requests are processed one at a time and do not interfere with each other.

Properties of transactions
Transactions follow four standard properties — atomicity, consistency, isolation, and durability — which are
known as ACID.

Atomicity ensures that changes are successfully completed all at once or not at all.
Consistency ensures that any changes will not violate the integrity of the database, including any constraints.
Isolation keeps all transactions in isolation. Transactions are isolated so that they do not interfere with the
other transactions.
Durability ensures that as soon as a transaction is committed, the change is permanent.
CREATING TABLES AND LEARNING DIFFERENT DATA TYPES

SQL
SQL (pronounced SEE-kwell) is the language that is used for querying and manipulating data. SQL is also used
for defining structures in databases. SQL is a standard programming language for relational databases.

What can SQL do?

• SQL can perform many of the necessary actions on a database.

What are SQL sublanguage groups?

• Data manipulation language (DML)


– You can use DML to view, add, change, or delete data in a table.
• Data definition language (DDL)
– You can use DDL to define and maintain the objects in your database (its schema). The schema
includes the tables, columns, and data types of your database.
• Data control language (DCL)
– DCL statements control access to the data in a database.

DML

Description
• Views, changes, and manipulates data in a table
• Includes commands to select, update, and insert data into a table, and to delete data from a table
• Is typically used by data analysts, report authors, and programmers who write client applications

Statements
SELECT: Retrieves data from a table
INSERT: Adds rows to a table
UPDATE: Modifies rows in a table
DELETE: Deletes rows from a table
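For illustration, the following sketch shows one example of each DML statement against a hypothetical city table (the table and column names are assumed):

-- illustrative table and column names
SELECT name, population FROM city;                              -- retrieve rows
INSERT INTO city (name, population) VALUES ('Leeds', 789194);   -- add a row
UPDATE city SET population = 800000 WHERE name = 'Leeds';       -- modify rows
DELETE FROM city WHERE name = 'Leeds';                          -- delete rows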

DDL

Description
• Creates and defines the database and the objects in it
• Includes commands to create and delete tables
• Is typically used by database administrators and programmers

Statements
CREATE: Creates a database or a table
ALTER TABLE: Adds, deletes, or modifies columns in a table; also adds or deletes constraints
DROP: Deletes a database object, such as a table or a constraint
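For illustration, a sketch of each DDL statement (the table and column definitions are assumed):

-- illustrative table and column names
CREATE TABLE city (id INT, name VARCHAR(35));   -- create a table
ALTER TABLE city ADD COLUMN population INT;     -- add a column
DROP TABLE city;                                -- delete the table
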
DCL

Description
• Controls access to the data in a database
• Includes commands to grant or revoke database permissions
• Is typically used by database administrators and programmers

Statements
REVOKE: Revokes permissions from a database user
GRANT: Grants permissions to a database user
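For illustration, a sketch of the DCL statements using MySQL-style syntax (the database, table, and user names are assumed):

-- illustrative database, table, and user names
GRANT SELECT, INSERT ON world.city TO 'analyst'@'%';   -- give a user permissions
REVOKE INSERT ON world.city FROM 'analyst'@'%';        -- take a permission away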

Basic SQL elements

Predefined data types


Predefined data types are also known as built-in data types.

The following table contains examples of commonly used SQL built-in data types:

SQL Data Type Description Example


INT Represents an integer 120000
CHAR Represents a fixed-length character string 'United States of America'
FLOAT Represents a floating point number 3.1415
DATETIME Represents a date and time combination '2022-07-18 16:48:12'

Identifiers

• Identifiers represent the names of the objects that the user creates, in contrast to language keywords or
statements.
• As a recommended practice, capitalize language keywords and commands, and define identifiers in
lowercase.
• It is important to remember that different database management systems handle capitalization conventions
differently.

Different database management systems handle capitalization conventions differently. The following are some
examples:

• IBM and Oracle: When you are processing code that you write, IBM and Oracle database management
systems automatically convert identifiers to uppercase. (That is, they will ignore the case that you used.) To
retain the case that you used for your identifiers, you must enclose them in double quotation marks (" ").

• Microsoft SQL Server: Microsoft SQL Server can be configured to be case sensitive or not case sensitive; it is not case sensitive by default. Case sensitivity is associated with the collation properties of SQL Server, which determine the sorting rules, case, and accent sensitivity properties for your data.

• MySQL Server: MySQL Server is case sensitive by default except in Microsoft Windows.
Constraints on data
Constraints enforce limits on the type of data that can go into a table.

• NOT NULL: ensures that a column does not hold a NULL value.
• UNIQUE: requires a column or set of columns to have values that are unique to those columns.
• DEFAULT: If no value was provided for the column, DEFAULT provides a value when the DBMS inserts a row
into a table.
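For illustration, the following sketch applies all three constraints in a table definition (the table and column names are assumed):

-- illustrative table and column names
CREATE TABLE product (
  product_code CHAR(8) NOT NULL,       -- a value must always be supplied
  product_name VARCHAR(100) UNIQUE,    -- no two rows can share the same name
  in_stock INT DEFAULT 0               -- used when no value is provided
);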

Reserved terms and key words

• Reserved terms are SQL keywords or symbols that have specific meanings when being processed.
• For clarity and to avoid errors, do not use reserved terms in the names of databases, tables, columns, or
other database objects.

Symbols: #   ;   :   @
Key words: ADD, CLOSE, DATABASE, EXISTING

Tables

Naming tables

Purposeful naming conventions

• Carefully select purposeful names (identifiers).

• Certain factors should drive your naming conventions. For example, for the database, tables, and columns,
consider the following factors:
– Rules and limitations that the DBMS imposes
– Naming conventions that the organization adopts
– Clarity

Some additional recommended practices when naming tables and table elements include the following:
• Use descriptive names. The name should be meaningful and should describe the entity that is being
modelled.
• Be consistent when choosing to use singular or plural table names.
• Use a consistent format in table names. For example, if you create a table to store order details, you can use
camel case (orderDetails) or underscore (order_details) for the compound table name. Whichever convention
you select, use it consistently.

Primary keys (PKs) and foreign keys (FKs)

Primary key
A primary key is a special column in a table that has a unique value for each row and uniquely identifies the
row.

Foreign key
A foreign key is a special column in a table that holds the primary key value from another table. A foreign key
creates a relationship between the two tables.

A table can have zero or one primary key (PK). The PK can consist of one column or multiple columns
(compound PK). The PK of one table can be defined as a foreign key (FK) of another table to establish a
relationship between the two tables.

Referential integrity
A database quality where every non-NULL foreign key value matches an existing primary key value.
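For illustration, the following sketch defines a primary key and a foreign key that relates two tables (the table and column names are assumed):

-- illustrative table and column names
CREATE TABLE country (
  code CHAR(3) PRIMARY KEY,
  name VARCHAR(52)
);

CREATE TABLE city (
  id INT PRIMARY KEY,
  name VARCHAR(35),
  countrycode CHAR(3),
  FOREIGN KEY (countrycode) REFERENCES country(code)   -- creates the relationship
);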

Character string data types


This data type is described by a character string data type descriptor.

Data Type Description

CHAR (length): A character string with a fixed length. Values for this type must be enclosed in single quotation marks (' ').
VARCHAR (length): A variable-length character string; its maximum length is fixed.
CLOB (length): A Character Large Object (CLOB) is a large character string or text data that can have a length in the order of gigabytes. A CLOB is usually stored in a separate location that is referenced in the table itself.

The following are some general guidelines:


• Use fixed-length character data types for consistent-length data, such as a postal code, product code, or
telephone number.
• Use variable-length data types when the length of data is widely variable. Make sure that variable-length
columns are not wider than they need to be.
• Make sure that you understand how CLOB storage is allocated in the system that you use.
Numeric and date data types

Numeric types
Numeric data types represent numerical values.

Data Type Description

INTEGER: Represents an integer. The minimum and maximum values depend on the DBMS. Example: 102030
SMALLINT: The same as the INTEGER type except that it might hold a smaller range of values. The range depends on the DBMS. Example: 10
BIGINT: The same as the INTEGER type except that it might hold a larger range of values. The range depends on the DBMS. Example: 98765432101
DECIMAL (p, s): Represents an exact number with a precision p and a scale of s. It is a decimal number, which is a number that can have a decimal point in it. Example: DECIMAL(10,3) could have 1234567 or 1234567.123 as valid entries.
FLOAT (p): A floating point number with a precision of p. Precision is greater than or equal to one, and the maximum precision depends on the DBMS.
REAL: The same as the FLOAT type, except that the DBMS defines the precision.

The INTEGER, SMALLINT, and BIGINT data types represent whole numbers that can be positive, negative, or
zero. They are exact numeric data types.

When choosing which integer data type to use for a numeric column, select the type with the smallest range
that is enough to accommodate the values that will be stored. In other words, do not over-allocate and waste
storage.

Note that the INTEGER data type can also be abbreviated as INT.

A DECIMAL data type represents an exact fixed-point number. It has two arguments: precision and scale.
Precision defines the total number of digits in the number. Scale defines the number of digits after the decimal
point. The scale cannot exceed the precision. An example use case for a DECIMAL is to store monetary values.

The FLOAT and REAL data types represent approximate numbers. They are stored more efficiently and can
generally be processed faster than DECIMAL values. They work well for scientific calculations that must be fast
but not necessarily exact to a digit.

Date and time data types

Data Type Description

DATE: Represents a date. Example: yyyy-mm-dd
TIME: Represents a time of day without the time zone. Example: hh:mm:ss
TIMESTAMP: Represents a moment in time indicated by a date and a time. Example: yyyy-mm-dd hh:mm:ss
INSERTING DATA INTO A DATABASE

What is a .csv file?


A .csv file is a simple text file in which information is separated by commas.

• .csv files can be opened in any program that works with plain text.
• .csv files have the following format:
– Each line contains the same sequence of data.
– Each data point is separated by a comma.
• These files are most commonly used to import or export data in databases and spreadsheets.

Importing data and exporting data

Importing a CSV

• Verify that the .csv file has data that matches the number of columns of the table and the type of data in
each column.
• Create a table in MySQL with a table name that corresponds to the .csv file that you want to import.
• Import by using the LOAD DATA statement.
– If the first row of the file contains column headers, use the IGNORE 1 ROWS clause to ignore the first row.
– If the rows in the file are terminated by a newline character, use the TERMINATED BY '\n' clause to indicate so.

This statement imports data from the temporary file into the city table:
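The original statement is not reproduced here; a MySQL LOAD DATA statement of this kind typically looks like the following sketch (the file path is assumed):

LOAD DATA INFILE '/tmp/city.csv'    -- illustrative file path
INTO TABLE city
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;                      -- skip the header row if the file has one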

Exporting a CSV

This statement exports data from the city table and places it into the temporary city.csv file:
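The original statement is not reproduced here; a MySQL export of this kind typically looks like the following sketch (the output path is assumed):

SELECT * FROM city
INTO OUTFILE '/tmp/city.csv'        -- illustrative output path
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';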

Cleaning data
As changes are made to databases over time, issues can arise due to disorganization or errors in the data. Data
should be cleaned for a number of reasons, but the following list contains the main reasons:

• Increased productivity
• Improved data quality
• Fewer errors
To combat these issues, data can be cleaned by using the following SQL string functions:
• LEFT, RIGHT, and TRIM: Use these functions to select only certain elements of strings and remove certain
characters.
• CONCAT: Combine strings from several columns and put them together.
• LOWER: Force every character in a string to be lowercase.
• UPPER: Force every character in a string to be uppercase.
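For illustration, the following sketch combines several of these functions to tidy name data (the table and column names are assumed):

-- illustrative table and column names
SELECT CONCAT(TRIM(first_name), ' ', UPPER(TRIM(last_name))) AS full_name
FROM customer;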

DESCRIBE statement
The DESCRIBE statement provides a description of the specified table or view. Usually, tables have more than
one column.
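For example, assuming a city table exists, the following statement lists each of its columns together with the column's data type, whether it allows NULL values, and any key information:

DESCRIBE city;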

INSERT statement

INSERT INTO statement


• This statement is fundamental to populating a database table with data.
• The SQL INSERT INTO statement is used to insert a single record or multiple records into a table.
• The SQL INSERT INTO statement is referred to as a data manipulation language (DML) command.
• The order of the columns is important.

• When you insert a row, you must specify a column where the data will go.
• When you insert values, you must enclose the values with single quotation marks (' ') for character or date
values.

Syntax for the INSERT statement

Name Description
tableName The table where data will be inserted
col_1, col_2, col_3, ... Each column of the table where the data is going
val_1, val_2, val_3, ... Values against each column

Syntax for inserting data into all columns of the table
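The original syntax example is not reproduced here; using the placeholder names from the table above, it generally takes this form:

INSERT INTO tableName
VALUES (val_1, val_2, val_3, ...);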

Syntax for inserting data into specific columns of the table
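Again using the placeholder names from the table above, the column-list form generally looks like this:

INSERT INTO tableName (col_1, col_2, col_3, ...)
VALUES (val_1, val_2, val_3, ...);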


Remember the following in the syntax for the INSERT INTO statement:
• Column represents the field titles.
• Values represent the data that is being inserted into the fields.

First, specify the table name and a list of comma-separated columns inside parentheses after the INSERT INTO
clause. Then, put a comma-separated list of values of the corresponding columns inside the parentheses that
follow the VALUES keyword.

You can use the INSERT INTO statement in two ways. In the first example, when you add values for all the
columns of the table, you do not need to specify the column names. In the second example, both the column
name and the values are written. The number of columns and values must be the same. In addition, the
positions of columns must correspond to the positions of their values.

NULL statement
NULL statements are used as placeholders or to represent a missing value to improve readability. They also
clarify the meaning and actions of conditional statements.
The INSERT statement can insert a NULL value into a column, provided that the column does not have a NOT NULL constraint.
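For illustration, the following sketch inserts a row whose population value is NULL, assuming that the population column allows NULL values (the table and column names are assumed):

-- illustrative table and column names
INSERT INTO city (name, population) VALUES ('Springfield', NULL);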

Key points about the NULL statement


SELECTING DATA FROM A DATABASE

SELECT statement

You use the SELECT statement to select one or more columns from a table. You can also use the SELECT
statement when you want to access a subset of rows, columns, or both. When you query tables, you must
include the FROM clause in your syntax. The result of the SELECT statement is called a result set. It lists rows
that contain the same number of columns.

How it works

SQL SELECT statement syntax structure

The syntax for selecting data follows a precise order. The required clauses must precede the optional clauses.
The first clause contains SELECT and the column names, and the FROM clause with the table name
immediately follows it.
All optional clauses will follow these first two required clauses.
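The original syntax diagram is not reproduced here; the overall shape of the statement, with the optional clauses shown in brackets, is roughly:

SELECT column_1, column_2, ...
FROM table_name
[WHERE condition]
[GROUP BY column]
[HAVING condition]
[ORDER BY column [ASC | DESC]];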

SELECT statement considerations

Considerations
• Enclose literal strings, text, and literal dates with single quotation marks (' ').
• As a best practice to improve readability, capitalize SQL keywords (for example, SELECT, FROM, and WHERE).
• Depending on the database engine or configuration, data values that you provide in conditions might be case
sensitive.
Different ways to SELECT columns

Basics
• The clause is followed by the item or items being acted on. In this example, SELECT is followed by the column
names.
• Brackets ([ ]) enclose optional parameters.
• With the SELECT clause, you must specify one or more columns or use an asterisk (*) to request all columns.
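For illustration (the table and column names are assumed), the first statement names specific columns and the second requests all columns with the asterisk:

-- illustrative table and column names
SELECT name, population FROM city;
SELECT * FROM city;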

Optional clauses

Optional clauses of the SELECT statement:
- WHERE
- GROUP BY
- HAVING
- ORDER BY

WHERE
Request only certain rows from a table.

In SQL, you can use the WHERE clause to apply a filter that selects only certain rows from a table. In a SELECT
statement, the WHERE clause is optional. The SELECT-FROM-WHERE block can be useful for locating certain
information in rows. You could use this construct if you needed a list of all the cities that are located within a
country.
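A sketch of the cities-within-a-country query described above (the table name, column names, and country code are assumed):

-- illustrative table, column names, and country code
SELECT name
FROM city
WHERE countrycode = 'GBR';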

GROUP BY
Use a column identifier to organize the data in the result set into groups.

Here, the SELECT statement selects the rows from the country table, groups the rows by continent, and counts
the number of rows in each group. The result is a listing of the number of countries in each continent.
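A sketch of the query described above (the table and column names are assumed):

-- illustrative table and column names
SELECT continent, COUNT(*) AS number_of_countries
FROM country
GROUP BY continent;
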
HAVING
Use with GROUP BY to specify which groups to include in results.

The HAVING clause filters the results of a GROUP BY clause in a SELECT statement. In this example, the query
selects only the continents that have more than one country after the rows in the table are grouped by
continent.
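A sketch of the query described above (the table and column names are assumed):

-- illustrative table and column names
SELECT continent, COUNT(*) AS number_of_countries
FROM country
GROUP BY continent
HAVING COUNT(*) > 1;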

ORDER BY
Sort query results by one or more columns and in ascending or descending order.

Use the ORDER BY clause to sort query results by one or more columns and in ascending or descending order.
If the items in the table are needed in a specific order of importance, you might need to order the results in
ascending or descending order.
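For illustration (the table and column names are assumed), the following sketch returns cities from the largest population to the smallest:

-- illustrative table and column names
SELECT name, population
FROM city
ORDER BY population DESC;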

Comment Syntax
Comments begin with specific characters to denote that they are to be ignored and not run.

Single-line comment
• This type of comment begins with a double dash (--).
• Any text between the double dash and the end of the line will be
ignored and not performed.

Inline comment
• This type of comment begins with a double dash (--).
• This comment is similar to the single-line comment in that any text
between the double dash and the end of the line will be ignored
and not performed. This comment differs in that it is preceded by
syntax within the same line, which is not ignored.

Multiple-line comment
• This type of comment begins with /* and ends with */.
• Any text between the /* and */ will be ignored.
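For illustration, each comment style appears in the short sketch below (the query itself is assumed):

-- This is a single-line comment
SELECT name FROM city;   -- this is an inline comment
/* This is a
   multiple-line comment */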
PERFORMING A CONDITIONAL SEARCH

Overview of search conditions


A search condition is a logical test that can be applied to a row. You can use conditions in a query to filter your
search results.

• Use SELECT statements to retrieve data from database tables.


• Adding a WHERE clause to a SELECT statement limits the data that is retrieved from a table.
• The WHERE clause applies a search condition to each row of the SELECT statement.
• The search condition uses operators to specify the data that the query includes.

Types of operators

Three types of operators are used in search conditions:


• Arithmetic operators perform mathematical operations.
• Comparison operators compare values between data items.
• Logical operators are used to build compound conditions.

These operators can be used in SELECT, INSERT, UPDATE, and DELETE statements.

Arithmetic operators

SQL operation SQL operator


Addition +
Subtraction -
Multiplication *
Division /
Modulus (remainder) %

Comparison operators

SQL operator Operation Description

=: Equals. Compares two data items to see whether they are equal.
!=, <>: Not equal. Compares two data items to see whether they are not equal.
<: Less than. Compares two data items to see whether the value of the data item on the left is less than the value on the right.
<=: Less than or equal. Compares two data items to see whether the value of the data item on the left is less than or equal to the value on the right.
>: Greater than. Compares two data items to see whether the value of the data item on the left is greater than the value on the right.
>=: Greater than or equal. Compares two data items to see whether the value of the data item on the left is greater than or equal to the value on the right.
Logical operators

SQL operator Description

AND: Joins two or more conditions in a WHERE clause. All conditions must be true for data items to be affected by the SQL statement.
OR: Joins two or more conditions in a WHERE clause. At least one of the conditions must be true for data items to be affected by the SQL statement.
IN: Used for matching on multiple data items in a single WHERE clause by using a list of conditional values.
LIKE: Used for matching on multiple data items in a single WHERE clause by using partially matching conditional values, referred to as wildcards (denoted by _ or %).
BETWEEN: Used for matching on multiple data items in a single WHERE clause by specifying a range of matching conditional values.
NOT: Used to reverse the effect of the IN, LIKE, and BETWEEN operators.

Operator precedence
SQL operators are evaluated in a defined order in a SQL statement.

Aliases

An alias is used to assign a temporary name to a table or column within a SQL query.
• The alias exists only while the SQL statement is running.
• Aliases are useful for assigning names to obscurely named tables and columns.
• Aliases are also used to assign meaningful names to query output that uses arithmetic SQL operators.
• Aliases can be specified in a SQL statement by using the optional AS keyword.
• If spaces are desired in an alias name, the alias should be defined in quotation marks.

You can use aliases to include a column header of your choosing in a query result set.
In some situations, aliases can make your SQL statements simpler to write and easier to read.
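For illustration, the following sketch uses a table alias and quoted column aliases (the table and column names, and the unit conversion, are assumed):

-- illustrative table and column names
SELECT c.name AS "City Name",
       c.population / 1000 AS "Population (thousands)"
FROM city AS c;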

NULL values

Databases use NULL to represent the absence of value for a data item.
• Because they have no value, NULL values cannot be compared to each other by using typical comparison
operators.
• Because they have no value, NULL values are not equal to one another.
• Use IS NULL and IS NOT NULL when working with NULL values in a WHERE clause.
• Tables can be designed so that NULL values are not allowed.
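For illustration (the table and column names are assumed):

-- illustrative table and column names
SELECT name FROM city WHERE population IS NULL;
SELECT name FROM city WHERE population IS NOT NULL;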
WORKING WITH FUNCTIONS

Built-in functions

Some common functions include aggregate functions, conversion functions, date functions, string functions,
mathematical functions, and control flow and window functions.

Aggregate functions

Aggregate Function Use Case and Example

AVG: Returns the average of a set. Can be used to find the average population for cities within a specified country.
COUNT: Returns the number of items in a set. Can be used to find the total number of cities listed within a specified country.
MAX: Returns the maximum value in a set. Can be used to find the city with the greatest number or the highest population.
MIN: Returns the minimum value in a set. Can be used to find the city with the smallest number or the lowest population.
SUM: Returns the total of all values in a set. Can be used to find the total population for all of the cities that are listed for a specified country.
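For illustration, a sketch that applies two of these functions to one country's cities (the table name, column names, and country code are assumed):

-- illustrative table, column names, and country code
SELECT AVG(population) AS average_population,
       SUM(population) AS total_population
FROM city
WHERE countrycode = 'GBR';
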
ORGANIZING DATA

Organizing data by using SQL

• Sorting is the practice of organizing the sequence of the data returned by a query so that the data can be
analysed effectively.
• Structured query language (SQL) statements use the ORDER BY clause to sort query output in a specified
order.
• Query output can be sorted in either ascending or descending order.
• SQL statements use the GROUP BY clause to combine query output into groups.
• SQL statements use the HAVING clause to apply filter conditions to aggregated group data.

Sorting and ORDER BY keyword


Use the ORDER BY clause to sort data in a specific column in ascending or descending order by using the
keyword ASC or DESC.

Grouping and filtering data


You can use the GROUP BY clause in a SQL statement to group data items of the same value together.
The GROUP BY clause is typically used in conjunction with SELECT statements that include SQL aggregation
functions such as COUNT, MAX, MIN, SUM, and AVG.
The GROUP BY clause groups the query results together by using the specified aggregation function.

Using GROUP BY items with filter conditions


SQL statement WHERE clauses are evaluated before the GROUP BY clause.
The HAVING clause is used to filter query results after applying the GROUP BY clause.
The HAVING clause will include the same column used in the aggregation function of the SELECT clause.

Adding the HAVING clause as filter condition


The HAVING clause in a SQL statement is used with the GROUP BY clause to add a filter condition based on the
aggregated value.
RETRIEVING DATA FROM MULTIPLE TABLES

Set operators
Set operators are used to combine the results of multiple queries into a single result set. You can use different
tables to compare or unite the results into one result set. Queries that contain set operations are referred to
as compound queries.

Set operator Use

UNION: Used to combine two or more result sets into a single set (without duplicates).
UNION ALL: Used to combine two or more result sets into a single set (including duplicates).
INTERSECT: Used to combine two result sets and return the data that is common in both of the result sets.
MINUS: Used to combine two result sets and return the data from the first result set that is not present in the second result set.

UNION operator example

You can use the UNION operator to combine the results of two or more SELECT statements into a single result
set. Using UNION without the ALL operator will remove duplicate rows from the resulting set. The keyword
ALL lists duplicate rows and displays them in the result set.
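A sketch of the pattern described above (the table name, column names, and country codes are assumed):

-- illustrative table, column names, and country codes
SELECT name FROM city WHERE countrycode = 'GBR'
UNION
SELECT name FROM city WHERE countrycode = 'IRL';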

JOINs
JOIN clauses (inner, left, right, and full) are used to combine rows from two or more tables.

JOIN clauses Use


INNER JOIN Return the rows that match in both tables.
LEFT JOIN Return all rows from the left table.
RIGHT JOIN Return all rows from the right table.
FULL JOIN Return all the rows from both tables.
How JOIN clauses work

• INNER JOIN: This JOIN returns only the overlapping data between the two tables.
• LEFT JOIN: This JOIN returns the overlapping data between the two tables and the non-matching data from
the left table.
• RIGHT JOIN: This JOIN is the opposite of LEFT JOIN. It returns the overlapping data between the two tables
and the non-matching data from the right table.
• FULL JOIN: This JOIN returns the overlapping data between the two tables and the non-matching data from
both the left and right tables.

The critical thing to remember is that JOINs are clauses in SQL that link two tables together. A JOIN is usually
based on the key or common value that defines the relationship between those two tables.

You can use a SELF JOIN to join a table to itself by using either a LEFT JOIN or an INNER JOIN.
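For illustration, a sketch of an INNER JOIN that links two tables on their common key (the table and column names are assumed):

-- illustrative table and column names
SELECT city.name, country.name AS country_name
FROM city
INNER JOIN country ON city.countrycode = country.code;
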
AMAZON RELATIONAL DATABASE SERVICE (AMAZON RDS)

Amazon RDS is a managed database service that sets up and operates a relational database in the cloud.
Running an unmanaged, standalone relational database can be time-consuming and have limited scope. To
address these challenges, AWS provides a service that sets up, operates, and scales the relational database
without any on-going administration.
Amazon RDS provides cost-efficient and resizable capacity while automating time-consuming administrative
tasks.
Amazon RDS frees you to focus on your applications so that you can give them the performance, high
availability, security, and compatibility that they need. With Amazon RDS, your primary focus is your data and
optimizing your application.

Amazon RDS use cases

Web and mobile applications:
• High throughput
• Massive storage scalability
• High availability

Ecommerce applications:
• Low-cost database
• Data security
• Fully managed solution

Mobile and online games:
• Rapid growth capacity
• Automatic scaling
• Database monitoring

DB instance
A DB instance is an isolated database environment that runs in the cloud. It is the basic building block of
Amazon RDS.

Mechanics of Amazon RDS DB instances

Amazon RDS backup

Automatic: Creates automated backups (data and transaction logs) of DB instances during the backup window.
Manual: Creates storage volume snapshots of your DB instances.
Aurora
Aurora is a relational database engine.

Benefits of Aurora

• Aurora uses the same code, tools, and applications as existing MySQL and PostgreSQL databases.
• Aurora includes a high-performance storage subsystem. Its database engine is customized to take advantage
of that fast distributed storage.
• An Aurora DB has clusters that consist of one or more DB instances and a cluster volume that manages the
data for those DB instances.

Aurora DB cluster
An Aurora DB cluster consists of one or more DB instances and a cluster volume that manages the data for
those DB instances.

Aurora cluster volume


An Aurora cluster volume is a virtual database storage volume that spans multiple Availability Zones. Each
Availability Zone has a copy of the DB cluster data.

Contents of an Aurora DB cluster


Two types of DB instances make up an Aurora DB cluster.

Primary DB instance
The primary DB is the main instance. The instance allows read and write operations and allows for data
modification.

Aurora Replica
The Aurora Replica connects to the same storage volume as the primary DB instance and supports only read
operations.

Aurora use cases

Enterprise applications
Compared to commercial databases, Aurora can help cut down your database costs by 90 percent or more
while improving the database’s reliability and availability.

Software as a service (SaaS) applications


The Aurora managed database offering provides benefits to SaaS applications. SaaS companies can focus on
building high-quality applications without worrying about the underlying database that powers the application.

Online and mobile gaming


Because web and mobile games are built to operate at very large scale, they require a database with high
throughput and massive storage scalability.
AMAZON DYNAMODB

Relational and nonrelational NoSQL databases: Comparison

Relational databases:
• Data is stored in tables by using predefined columns of specific data types.
• Relationships can be defined between tables by using table foreign keys.
• Better performance is achieved by adding compute or memory capacity to a single database server.

Nonrelational NoSQL databases:
• Data is stored in tables with a flexible column structure.
• Each item stored in a table can have a different number and type of data elements.
• Better performance is achieved by adding a new server to an existing pool of database servers.

DynamoDB is a NoSQL database


A fully managed, serverless, key-value NoSQL database that does the following:

• Improves performance by keeping data in memory


• Keeps data secure by encrypting data at rest
• Protects data with backups and automated copying of data between AWS Regions

DynamoDB offers the following advantages:


• Fully managed service: AWS manages all of the underlying compute and storage needed to store your data.
• Scalability: DynamoDB automatically adds more compute and storage capacity as your data grows.
• Redundancy: DynamoDB copies your data across multiple AWS Regions to avoid data loss.
• Recoverability: DynamoDB can restore your data from automatic backup operations.
• Low latency: Data can be read from a DynamoDB table in a few milliseconds.
• Security: You can use DynamoDB with AWS Identity and Access Management (IAM) to control access to
DynamoDB tables.
• Flexibility: DynamoDB can store several types of data, and each record can have varying numbers and types
of data. JSON is one popular way to store data in DynamoDB.

Key concepts: Tables

• Tables in DynamoDB: Similar to relational database systems, DynamoDB stores data in tables.
– The table name and primary key must be specified at table creation.
– Each DynamoDB table has at least one column that acts as the primary key.
• The primary key is the only data that is required when storing a row in a DynamoDB table. Any other data is
optional.

Key concepts: Attributes

• A column in a DynamoDB table is called an attribute.


– An attribute represents a fundamental data element, something that does not need to be broken
down any further.
– An attribute is similar in concept to table columns or fields in a relational database.
• A primary key can consist of one or two attributes.
Key concepts: Items

• An item is a group of attributes that is uniquely identifiable among all of the other items.
– Each item consists of one or more attributes.
– Each item is uniquely identified by its primary key attribute.
– This concept is similar to a table row in a relational database.

Difference from a relational database


• The number and type of non-primary key attributes can be different for each item in a table.

Key concepts: Primary keys

Simple primary key


• The primary key consists of one attribute.
• The attribute is called the partition key or hash key.

Composite primary key


• The primary key consists of two attributes.
• The first attribute is the partition key or hash key.
• The second attribute is the sort key or range attribute.

REMEMBER: Primary keys uniquely identify each item in the table, so no two items can have the same primary
key.

Key concepts: Partitions

• DynamoDB tables store item data in partitions.


– Table data is partitioned and indexed by the primary key.
– Each table has one or more partitions.
– New partitions are added as data is added to the table.
– Tables whose data is distributed evenly across multiple partitions generally deliver the best
performance.
• The partition in which an item’s data is stored is determined by its primary key attribute.

DynamoDB global tables


A global table is a collection of one or more DynamoDB tables, which must all be owned by a single AWS
account.

• Using the global table option creates a DynamoDB table that is automatically replicated across your choice of
AWS Regions worldwide.
– Deliver fast, local read and write performance for global applications.
– Your applications can stay highly available in the unlikely event of isolation or degradation of an
entire AWS Region.
• Global tables eliminate the difficult work of replicating data between Regions and resolving update conflicts
between copies.

AWS re/start

AWS Architecture
AWS CLOUD ADOPTION FRAMEWORK (AWS CAF)

The AWS CAF provides guidance and best practices to help organizations build a comprehensive approach to
cloud computing across the organization.
• The AWS CAF guidance also helps organizations throughout the IT lifecycle to accelerate successful cloud
adoption.
• The AWS CAF is organized into perspectives (sets of business or technology capabilities that are the
responsibility of key stakeholders).
• Perspectives consist of sets of capabilities.

For any organization to successfully migrate its IT portfolio to the cloud, three elements—people, process, and
technology—must be aligned. The AWS CAF provides guidance to support a successful migration to the cloud.

Core perspectives

Business perspective

• IT finance
• IT strategy
• Benefits realization
• Business risk management

Stakeholders from the Business perspective include business managers, finance managers, budget owners, and
strategy stakeholders. They can use the AWS CAF to create a strong business case for cloud adoption and
prioritize cloud adoption initiatives. Stakeholders should ensure that an organization’s business strategies and
goals align with its IT strategies and goals.

People perspective

• Resource management
• Incentive management
• Career management
• Training management
• Organizational change management
Stakeholders from the People perspective include human resources, staffing, and people managers. They can
use the AWS CAF to evaluate organizational structures and roles, assess new skill and process requirements,
and identify gaps. Performing an analysis of needs and gaps can help prioritize training, staffing, and
organizational changes to build an agile organization.

Governance perspective

• Portfolio management
• Program and project management
• Business performance measurement
• License management

Stakeholders from the Governance perspective include the chief information officer (CIO), program managers,
enterprise architects, business analysts, and portfolio managers. They can use the AWS CAF to focus on the
skills and processes needed to align IT strategy and goals with business strategy and goals. This focus helps the
organization maximize the business value of its IT investment and minimize the business risks.

Platform perspective

• Compute provisioning
• Network provisioning
• Storage provisioning
• Database provisioning
• Systems and solution architecture
• Application development

Stakeholders from the Platform perspective include the chief technology officer (CTO), IT managers, and
solutions architects. They use a variety of architectural dimensions and models to understand and
communicate the nature of IT systems and their relationships. They must be able to describe the architecture
of the target state environment in detail. The AWS CAF includes principles and patterns for implementing new
solutions on the cloud and for migrating on-premises workloads to the cloud.

Security perspective

• Identity and access management


• Detective control
• Infrastructure security
• Data protection
• Incident response

Stakeholders from the Security perspective include the chief information security officer (CISO), IT security
managers, and IT security analysts. They must ensure that the organization meets security objectives for
visibility, auditability, control, and agility. Security perspective stakeholders can use the AWS CAF to structure
the selection and implementation of security controls that meet the organization’s needs.
Operations perspective

• Service monitoring
• Application performance monitoring
• Resource inventory management
• Release management or change management
• Reporting and analytics
• Business continuity or disaster recovery (DR)
• IT service catalogue

Stakeholders from the Operations perspective (for example, IT operations managers and IT support managers)
define how day-to-day, quarter-to-quarter, and year-to-year business is conducted. Stakeholders from the
Operations perspective align with and support the operations of the business. The AWS CAF helps these
stakeholders define current operating procedures. It also helps them identify the process changes and training
that are needed to implement successful cloud adoption.
AWS WELL-ARCHITECTED FRAMEWORK

The Well-Architected Framework describes key concepts, design principles, and architectural best practices for
designing and running workloads in the AWS Cloud.

Cloud architects should use the Well-Architected Framework to do the following:


• Increase awareness of architectural best practices.
• Address foundational areas that are often neglected.
• Evaluate architectures by using a consistent set of principles.

Features
The Well-Architected Framework provides a set of foundational questions that help you to understand
whether a specific architecture aligns well with cloud best practices. It also includes information about services
and solutions that are relevant to each question and references to relevant resources.

Provides
• Questions that are centred on critically understanding architectural decisions
• Domain-specific lenses
• Hands-on labs
• AWS Well-Architected Tool
• AWS Well-Architected Partner Program

Does not provide


• Implementation details
• Architectural patterns

Pillars of the Well-Architected Framework

The Well-Architected Framework helps you design your architecture from different perspectives, or pillars. The
pillars are operational excellence, security, reliability, performance efficiency, cost optimization, and
sustainability. Each pillar contains a set of design principles and best practices.
Operational excellence

The ability to monitor systems to do the following:


• Deliver business value.
• Continually improve supporting processes and procedures.

Key topics:
• Manage and automate changes.
• Respond to events.
• Define standards to manage daily operations.

This pillar includes how your organization supports your business objectives and your ability to run workloads
effectively. It also includes how your organization supports your ability to gain insight into their operations and
to continuously improve supporting processes and procedures to deliver business value.
An example of an operational excellence best practice is to continuously monitor the health and performance
of your workloads using a service such as Amazon CloudWatch. You can use this service to initiate automated
responses to adjust the resources that your workloads use and to prevent performance issues or failures.

Operational excellence design principles:


Perform operations as code: In the cloud, you can apply the same engineering discipline that you use for
application code to your entire environment. You can define your entire workload (applications, infrastructure,
etc.) as code and update it with code. You can script your operations procedures and automate their process
by launching them in response to events. By performing operations as code, you limit human error and create
consistent responses to events.

Make frequent, small, reversible changes: Design workloads that are scalable and loosely coupled to permit
components to be updated regularly. Automated deployment techniques together with smaller, incremental
changes reduce the blast radius and allow for faster reversal when failures occur. This increases confidence
to deliver beneficial changes to your workload while maintaining quality and adapting quickly to changes in
market conditions.

Refine operations procedures frequently: As you evolve your workloads, evolve your operations
appropriately. As you use operations procedures, look for opportunities to improve them. Hold regular reviews
and validate that all procedures are effective and that teams are familiar with them. Where gaps are
identified, update procedures accordingly. Communicate procedural updates to all stakeholders and teams.
Gamify your operations to share best practices and educate teams.

Anticipate failure: Perform “pre-mortem” exercises to identify potential sources of failure so that they can be
removed or mitigated. Test your failure scenarios and validate your understanding of their impact. Test your
response procedures to ensure they are effective and that teams are familiar with their process. Set up regular
game days to test workload and team responses to simulated events.

Learn from all operational failures: Drive improvement through lessons learned from all operational events
and failures. Share what is learned across teams and through the entire organization.
Security

The ability to do the following:


• Monitor and protect information, systems, and assets.
• Deliver business value through risk assessments and mitigation strategies.

Key topics:
• Identify and manage who can do what.
• Establish controls to detect security events.
• Protect systems and services.
• Protect the confidentiality and integrity of data.

The security pillar involves the ability to monitor and protect systems while delivering business value through
risk assessments and mitigation strategies. An example of security in the cloud would be staying up to date
with AWS and industry recommendations and threat intelligence. Automation can be used for security
processes, testing, and validation to scale security operations.

Security design principles:


Implement a strong identity foundation: Implement the principle of least privilege and enforce separation of
duties with appropriate authorization for each interaction with your AWS resources. Centralize identity
management, and aim to eliminate reliance on long-term static credentials.

Maintain traceability: Monitor, alert, and audit actions and changes to your environment in real time.
Integrate log and metric collection with systems to automatically investigate and take action.

Apply security at all layers: Apply a defence in depth approach with multiple security controls. Apply to all
layers (for example, edge of network, VPC, load balancing, every instance and compute service, operating
system, application, and code).

Automate security best practices: Automated software-based security mechanisms improve your ability to
securely scale more rapidly and cost-effectively. Create secure architectures, including the implementation of
controls that are defined and managed as code in version-controlled templates.

Protect data in transit and at rest: Classify your data into sensitivity levels and use mechanisms, such as
encryption, tokenization, and access control where appropriate.

Keep people away from data: Use mechanisms and tools to reduce or eliminate the need for direct access or
manual processing of data. This reduces the risk of mishandling or modification and human error when
handling sensitive data.

Prepare for security events: Prepare for an incident by having incident management and investigation policy
and processes that align to your organizational requirements. Run incident response simulations and use tools
with automation to increase your speed for detection, investigation, and recovery.
Reliability

The ability of a system to do the following:


• Recover from infrastructure or service failures.
• Dynamically acquire computing resources to meet demand.
• Mitigate disruptions, such as the following:
– Misconfigurations
– Transient network issues

The reliability pillar encompasses the ability of a workload to perform its intended function correctly and
consistently when it is expected to. This ability includes operating and testing the workload through its total
lifecycle.

Reliability in the cloud comprises four areas:


• Foundations: To achieve reliability, your architecture and system must have a well-planned foundation that
can handle changes in demand or with requirements. It also must be able to detect failure and automatically
heal itself.
• Architecture: Before architecting any sort of structure, it is critical to look at the foundation. Before
architecting any system, foundational requirements that influence reliability should be in place. The workload
architecture of the distributed system must be designed to prevent and mitigate failures.
• Change management: With change management, it is important to fully understand how change can affect
your system. If you plan proactively and monitor your systems, you can accommodate change and adjust to it
quickly and reliably.
• Failure management: To make sure that your architecture is reliable, it is key to anticipate, become aware
of, respond to, and prevent failures from happening. In a cloud environment, you can take advantage of
automation with monitoring, replacing systems in your environment, and later troubleshooting failed systems.
This automation is all at a lower cost and is still reliable.

Reliability design principles:


Automatically recover from failure: By monitoring a workload for key performance indicators (KPIs), you can
run automation when a threshold is breached. These KPIs should be a measure of business value, not of the
technical aspects of the operation of the service. This allows for automatic notification and tracking of failures,
and for automated recovery processes that work around or repair the failure. With more sophisticated
automation, it’s possible to anticipate and remediate failures before they occur.
Test recovery procedures: In an on-premises environment, testing is often conducted to prove that the
workload works in a particular scenario. Testing is not typically used to validate recovery strategies. In the
cloud, you can test how your workload fails, and you can validate your recovery procedures. You can use
automation to simulate different failures or to recreate scenarios that led to failures before. This approach
exposes failure pathways that you can test and fix before a real failure scenario occurs, thus reducing risk.
Scale horizontally to increase aggregate workload availability: Replace one large resource with multiple small
resources to reduce the impact of a single failure on the overall workload. Distribute requests across multiple,
smaller resources to ensure that they don’t share a common point of failure.
Stop guessing capacity: A common cause of failure in on-premises workloads is resource saturation, when the
demands placed on a workload exceed the capacity of that workload (this is often the objective of denial of
service attacks). In the cloud, you can monitor demand and workload utilization, and automate the addition or
removal of resources to maintain the optimal level to satisfy demand without over- or under-provisioning.
There are still limits, but some quotas can be controlled and others can be managed (see Manage Service
Quotas and Constraints).
Manage change through automation: Changes to your infrastructure should be made using automation. The
changes that need to be managed include changes to the automation itself, which can then be tracked and
reviewed.
Performance efficiency

The ability to do the following:


• Use computing resources efficiently to meet system requirements.
• Maintain that efficiency as demand changes and technologies evolve.

The performance efficiency pillar refers to using computing resources efficiently while meeting system
requirements. At the same time, it is important to maintain that efficiency as demand fluctuates and
technologies evolve. To implement performance efficiency, take a data-driven approach to building a high-
performance architecture. Gather data on all aspects of the architecture from the high-level design to the
selection and configuration of resource types.

Reviewing your choices on a regular basis helps ensure that you are taking advantage of the continually
evolving AWS Cloud. Monitoring helps ensure that you are aware of any deviance from expected performance.
Make trade-offs in your architecture to improve performance, such as using compression or caching, or
relaxing consistency requirements.

Factors that influence performance efficiency in the cloud include the following:
• Selection: It is important to choose the best solution that will optimize your architecture. Solutions vary
based on the kind of workload that you have, and you can use AWS to customize your solutions in many
different ways and configurations.
• Review: You can continually innovate your solutions and take advantage of the newer technologies and
approaches that become available. Any of these newer releases could improve the performance efficiency of
your architecture.
• Monitoring: After you implement your architecture, you must monitor performance to help ensure that you
can remediate any issues before customers are affected and aware of them. With AWS, you can use
automation and monitor your architecture with tools such as Amazon CloudWatch, Amazon Kinesis, Amazon
Simple Queue Service (Amazon SQS), and AWS Lambda.
• Trade-offs: An example of a trade-off that helps ensure an optimal approach is trading consistency,
durability, and space against time or latency to deliver higher performance.

Performance efficiency design principles:


Democratize advanced technologies: Make advanced technology implementation easier for your team by
delegating complex tasks to your cloud vendor. Rather than asking your IT team to learn about hosting and
running a new technology, consider consuming the technology as a service. For example, NoSQL databases,
media transcoding, and machine learning are all technologies that require specialized expertise. In the cloud,
these technologies become services that your team can consume, allowing your team to focus on product
development rather than resource provisioning and management.

Go global in minutes: Deploying your workload in multiple AWS Regions around the world allows you to
provide lower latency and a better experience for your customers at minimal cost.

Use serverless architectures: Serverless architectures remove the need for you to run and maintain physical
servers for traditional compute activities. For example, serverless storage services can act as static websites
(removing the need for web servers) and event services can host code. This removes the operational burden of
managing physical servers, and can lower transactional costs because managed services operate at cloud scale.

Experiment more often: With virtual and automatable resources, you can quickly carry out comparative
testing using different types of instances, storage, or configurations.

Consider mechanical sympathy: Use the technology approach that aligns best with your goals. For example,
consider data access patterns when you select database or storage for your workload.
Cost optimization

The ability to avoid or eliminate the following:


• Unneeded costs
• Suboptimal resources

Cost optimization refers to the ability to avoid or eliminate unneeded expenses and resources. It is a continual
process of refinement and improvement over the span of a workload’s lifecycle.

Cost optimization in the cloud has five focus areas:


• Practice cloud financial management.
• Be aware of expenditure and usage.
• Maintain cost-effective resources.
• Manage demand and supply resources.
• Optimize over time.

Similar to the other pillars within the Well-Architected Framework, cost optimization has trade-offs to
consider. For example, you want to consider whether to optimize for speed-to-market or for cost. In some
cases, it’s best to optimize for speed—going to market quickly, shipping new features, or meeting a deadline—
rather than investing in upfront cost optimization.

Cost optimization design principles:


Implement cloud financial management: To achieve financial success and accelerate business value
realization in the cloud, you must invest in Cloud Financial Management. Your organization must dedicate the
necessary time and resources for building capability in this new domain of technology and usage management.
Similar to your Security or Operations capability, you need to build capability through knowledge building,
programs, resources, and processes to help you become a cost efficient organization.

Adopt a consumption model: Pay only for the computing resources you consume, and increase or decrease
usage depending on business requirements. For example, development and test environments are typically
only used for eight hours a day during the work week. You can stop these resources when they’re not in use
for a potential cost savings of 75% (40 hours versus 168 hours).
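
As a minimal sketch of this idea (the tag key, tag value, and instance IDs are assumptions for illustration, not part of the course material), you could stop all running development instances at the end of the working day with a command pipeline similar to the following:

# Hypothetical example: stop every running instance tagged Environment=dev
aws ec2 describe-instances \
  --filters "Name=tag:Environment,Values=dev" "Name=instance-state-name,Values=running" \
  --query 'Reservations[].Instances[].InstanceId' --output text \
| xargs -r aws ec2 stop-instances --instance-ids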

Measure overall efficiency: Measure the business output of the workload and the costs associated with
delivery. Use this data to understand the gains you make from increasing output, increasing functionality, and
reducing cost.

Stop spending money on undifferentiated heavy lifting: AWS does the heavy lifting of data centre operations
like racking, stacking, and powering servers. It also removes the operational burden of managing operating
systems and applications with managed services. This allows you to focus on your customers and business
projects rather than on IT infrastructure.

Analyse and attribute expenditure: The cloud makes it easier to accurately identify the cost and usage of
workloads, which then allows transparent attribution of IT costs to revenue streams and individual workload
owners. This helps measure return on investment (ROI) and gives workload owners an opportunity to optimize
their resources and reduce costs.
Sustainability

The ability to minimize the following:


• Impact of workloads on the environment
• Carbon emissions
• Energy consumption
• Waste

The discipline of sustainability addresses the long-term environmental, economic, and societal impact of your
business activities. When building cloud workloads, the practice of sustainability includes the following:
• Understanding the impacts of the services used
• Quantifying impacts through the entire workload lifecycle
• Applying design principles and best practices to reduce these impacts

This pillar focuses on environmental impacts, especially energy consumption and efficiency, which are
important levers that architects can use to inform direct action to reduce resource usage.

You can use the AWS Cloud to run workloads designed to support your wider sustainability challenges.
Examples of these challenges include reducing carbon emissions, lowering energy consumption, recycling
water, or reducing waste in other areas of your business or organization.

Sustainability design principles:


Understand your impact: Measure the impact of your cloud workload and model the future impact of your
workload. Include all sources of impact, including impacts resulting from customer use of your products, and
impacts resulting from their eventual decommissioning and retirement. Compare the productive output with
the total impact of your cloud workloads by reviewing the resources and emissions required per unit of work.
Use this data to establish key performance indicators (KPIs), evaluate ways to improve productivity while
reducing impact, and estimate the impact of proposed changes over time.

Establish sustainability goals: For each cloud workload, establish long-term sustainability goals such as
reducing the compute and storage resources required per transaction. Model the return on investment of
sustainability improvements for existing workloads, and give owners the resources they need to invest in
sustainability goals. Plan for growth, and architect your workloads so that growth results in reduced impact
intensity measured against an appropriate unit, such as per user or per transaction. Goals help you support the
wider sustainability goals of your business or organization, identify regressions, and prioritize areas of
potential improvement.

Maximize utilization: Right-size workloads and implement efficient design to ensure high utilization and
maximize the energy efficiency of the underlying hardware. Two hosts running at 30% utilization are less
efficient than one host running at 60% due to baseline power consumption per host. At the same time,
eliminate or minimize idle resources, processing, and storage to reduce the total energy required to power
your workload.

Anticipate and adopt new, more efficient hardware and software offerings: Support the upstream
improvements your partners and suppliers make to help you reduce the impact of your cloud workloads.
Continually monitor and evaluate new, more efficient hardware and software offerings. Design for flexibility to
allow for the rapid adoption of new efficient technologies.

Use managed services: Sharing services across a broad customer base helps maximize resource utilization,
which reduces the amount of infrastructure needed to support cloud workloads. For example, customers can
share the impact of common data centre components like power and networking by migrating workloads to
the AWS Cloud and adopting managed services, such as AWS Fargate for serverless containers, where AWS
operates at scale and is responsible for their efficient operation. Use managed services that can help minimize
your impact, such as automatically moving infrequently accessed data to cold storage with Amazon S3
Lifecycle configurations or Amazon EC2 Auto Scaling to adjust capacity to meet demand.

Reduce the downstream impact of your cloud workloads: Reduce the amount of energy or resources required
to use your services. Reduce or eliminate the need for customers to upgrade their devices to use your services.
Test using device farms to understand expected impact and test with customers to understand the actual
impact from using your services.
WELL-ARCHITECTED PRINCIPLES

The AWS Well-Architected Framework identifies a set of general design principles to facilitate good design in
the cloud:

Stop guessing your capacity needs

Test systems at production scale

Automate to make architectural experimentation easier


Provide for evolutionary architectures

Drive architectures by using data

Improve through game days


RELIABILITY AND HIGH AVAILABILITY

Reliability

• Is the probability that an entire system functions for a specified period of time
• Includes hardware, firmware, and software
• Measures how long the item performs its intended function

Two common measures of reliability

• Mean time between failure (MTBF): Total time in service divided by the number of failures
• Failure rate: Number of failures divided by the total time in service

Availability
Availability is a measure of the percentage of time that a resource is operating normally.

• Availability is a percentage of uptime (such as 99.9 percent) over a period of time (commonly a year).
• Availability is equal to the normal operation time divided by the total time.
• A common shorthand refers to only the number of 9s.
• For example, five 9s is 99.999 percent availability, which allows roughly 5 minutes of downtime per year.

High availability (HA)


High availability is about ensuring that your application's downtime is minimized as much as possible without
the need for human intervention.

HA is meant to help ensure the following:

• Systems are generally functioning and accessible.


• Downtime is minimized.
• Minimal human intervention is required.
HA: Prime factors

The following factors contribute to HA:


• Fault tolerance: The built-in redundancy of an application's components and its ability to remain operational
• Scalability: The ability of an application to accommodate growth without changing design
• Recoverability: The process, policies, and procedures related to restoring service after a catastrophic event

On-premises HA compared to HA on AWS

Traditional, or on-premises IT
In traditional, on-premises IT, HA is the following:

• Expensive
• Suitable for only mission-critical applications

AWS Cloud
AWS expands availability and recoverability options by providing the ability to use the following:

• Multiple servers
• Isolated redundant data centres within each Availability Zone
• Multiple Availability Zones within each AWS Region
• Multiple Regions around the world
• Fault-tolerant services
TRANSITIONING A DATA CENTER TO THE CLOUD

A traditional on-premises infrastructure (or corporate data center) might include a setup that is similar to this
example. This diagram represents a three-tier, client-server architecture in a corporate data center. The box
labelled Corporate Data Center indicates what is contained in the data center.

The bottom of this diagram includes the database servers with attached tape backup devices. This tier is
responsible for the database logic.

The middle of the diagram contains the application servers. An application server is a component-based
product that resides in the middle tier of a server-centric architecture. It provides middleware services for
security and state maintenance and also provides data access and persistence. The application servers also
contain the business logic. The middle section also contains network-attached storage (NAS). NAS devices are
file servers that provide a centralized location for users on a network to store, access, edit, and share files.

The web servers are located at the top of the diagram. The web servers are responsible for the presentation
logic. They are accompanied by load balancers. Load balancers are responsible for efficiently distributing
incoming network traffic across a group of backend servers.

The Microsoft Active Directory or Lightweight Directory Access Protocol (LDAP) server is like a phone book that
anyone can use to locate organizations, individuals, and other resources (such as files and devices in a
network) on the public internet or on a corporate intranet.

The box labelled Storage Area Network (SAN) with the attached external disks refers to storage that is outside
the corporate data center. A SAN is a specialized, high-speed network that provides block-level network access
to storage. SANs are often used to improve application availability (for example, multiple data paths). SANs are
also used to enhance application performance (for example, off-load storage functions, separate networks,
and so on).
You could replace a traditional on-premises or corporate data center with the following in the AWS Cloud:

• You can replace servers, such as the on-premises web servers and app servers, with Amazon Elastic Compute
Cloud (Amazon EC2) instances that run all the same software. Because EC2 instances can run a variety of
Microsoft Windows Server, Red Hat, SuSE, Ubuntu, or Amazon Linux operating systems, you can run many
server applications on EC2 instances.

• You can replace the LDAP server with AWS Directory Service, which supports LDAP authentication. With
Directory Service, you can set up and run Microsoft Active Directory in the cloud or connect your AWS
resources with existing on-premises Microsoft Active Directory.

• You can replace software-based load balancers with Elastic Load Balancing (ELB) load balancers. ELB is a fully
managed load balancing solution that scales automatically as needed. It can perform health checks on
attached resources and redistribute a load away from unhealthy resources as necessary.

• Amazon Elastic Block Store (Amazon EBS) is a storage service to use with Amazon EC2. You can replace SAN
solutions with EBS volumes. You can attach these volumes to application servers to store data long-term and
share the data between instances.

• You can use Amazon Elastic File System (Amazon EFS) to replace your NAS file server. Amazon EFS is a file
storage service for EC2 instances. It offers a user-friendly interface that you can use to create and configure file
systems. It also grows and shrinks your storage automatically as you add and remove files so that you always
use exactly the amount of storage that you need. Another option is to run a NAS solution on an EC2 instance.

• You can replace databases with Amazon Relational Database Service (Amazon RDS). With this service, you
can run Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server on a platform that is
managed by AWS.

• Finally, you can automatically back up RDS instances to Amazon Simple Storage Service (Amazon S3). Using
Amazon S3 replaces the need for on-premises, database backup hardware. Amazon S3 provides object storage
through a web service interface. Objects can be as large as 5 TB (a single PUT upload is limited to 5 GB), and you can turn on versioning for your objects.
After transitioning to the AWS Cloud, the example data center might look like this diagram.

The ELB load balancer distributes traffic to the web servers that are now located on EC2 instances. The LDAP
server is now Directory Service. ELB has replaced software-based load balancers and distributes traffic to the
servers, which are now EC2 instances. Amazon EBS has replaced SAN solutions. Amazon EFS has replaced the
NAS file server, and Amazon RDS has replaced the databases.
AWS re/start

Systems
Operations
SYSTEMS OPERATIONS ON AWS

Systems operations (SysOps) is concerned with the deployment, administration, and monitoring of systems
and network resources in an automatable and reusable manner.

SysOps contains critical tasks that keep many companies running today. SysOps supports technical systems by
monitoring them and helping ensure that their performance meets expectations and is trouble-free. SysOps
typically requires understanding a system's entire environment.

Benefits of SysOps include the following:


• Ability to configure and manage thousands of servers and devices in a repeatable way
• Reduction in errors by replacing manual processes with automated ones
• Real-time visibility into the state of the infrastructure through monitoring

Systems operations: Responsibilities

SysOps professionals are involved in many—and often all—facets of delivering IT solutions.

SysOps involves the responsibilities and tasks required to build (create), test, deploy, monitor, maintain, and
safeguard complex computing systems.

Examples of these tasks include the following:


• Build: Create separate environments for development, test, and production.
• Test: Test backup and disaster recovery procedures.
• Deploy: Deploy applications and workloads into their runtime environment.
• Monitor: Monitor the health and performance of infrastructure resources.
• Maintain: Apply patches and upgrades in a consistent and regular manner.
• Safeguard: Apply and enforce security measures in all the infrastructure layers.

SysOps professionals typically use automation because of the large size of the infrastructure.
Systems operations in the cloud

• Cloud computing provides organizations the ability to automate the development, testing, and deployment
of complex IT operations.

• Automation of SysOps provides the following:


• Repeatable deployment of infrastructure and applications on demand
• Creation of self-describing systems
• Ability to build well-tested, secure systems
AWS IDENTITY AND ACCESS MANAGEMENT (IAM) REVIEW

IAM

With IAM, you can do the following:

• Centrally manage authentication and access to Amazon Web Services (AWS) resources.
• Create users, groups, and roles.
• Apply policies to them to control their access to AWS resources.

Use IAM to configure authentication, which is the first step, because it controls who can access AWS
resources. IAM can also be used to authenticate resources. For example, applications and other AWS services
use it for access.

IAM is used to configure authorization, which is based on knowing who the user is. Authorization controls
which resources users can access and what they can do to or with those resources. IAM reduces the need to
share passwords or access keys when you grant access rights. It also makes it easy to turn on or turn off a
user’s access over time and as appropriate.

Access to AWS services

Programmatic access:
• Authenticates by using an access key ID and a secret access key
• Provides access to APIs, the AWS Command Line Interface (AWS CLI), SDKs, and other development tools

Console access:
• Uses an account ID or alias, an IAM user name, and a password
• Prompts the user for an authentication code if multi-factor authentication (MFA) is turned on

Security credentials, IAM users, and IAM roles

Types of security credentials

• Email address and password: Associated with an AWS account (root user)
• IAM user name and password: Used for accessing the AWS Management Console
• Access keys: Typically used with the AWS CLI and programmatic requests, such as through APIs and SDKs
• MFA: Serves as an extra layer of security; can be turned on for the account root user and IAM users
• Key pairs: Used for only specific AWS services, such as Amazon EC2
Policies and permissions

1. Identity-based policies allow a user to attach managed and inline policies to IAM identities, such as users or
the groups that users belong to. A user can also attach identity-based policies to roles. Identity-based policies
are defined and stored as JSON documents.

2. Resource-based policies allow a user to attach inline policies to resources. The most common examples of
resource-based policies are Amazon Simple Storage Service (Amazon S3) bucket policies and IAM role trust
policies. Resource-based policies are JSON policy documents.

3. AWS Organizations service control policies (SCPs) apply permissions boundaries to AWS Organizations,
organizational units (OUs), or accounts. SCPs use the JSON format.

4. Access control lists (ACLs) can also be used to control which principals (that is, users or resources) can
access a resource. ACLs are similar to resource-based policies although they are the only policy type that does
not use the JSON policy document structure.
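
As an illustration of the first type (identity-based policies), the following sketch attaches an inline policy to an IAM user with the AWS CLI. The user name, policy name, and bucket name are hypothetical placeholders:

# Hypothetical inline identity-based policy that allows read-only access to one S3 bucket
aws iam put-user-policy --user-name example-admin --policy-name S3ReadOnlyExample --policy-document '{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:ListBucket"],
    "Resource": ["arn:aws:s3:::example-bucket", "arn:aws:s3:::example-bucket/*"]
  }]
}'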

Best practices

1. Avoid using the account root user credentials for daily administration. Instead, when you set up a new AWS
account, define at least one new IAM user. Then, grant the user or users access so that they can do most daily
tasks by using these IAM user credentials.

2. Delegate administrative functions by following the principle of least privilege. Grant access to only services
that are needed, and limit permissions within those services to only the parts that are needed. You can always
grant additional rights over time if the need arises.

3. Use IAM roles to provide cross-account access. Other best practices for IAM mentioned earlier in this lesson
include configuring strong password policies, turning on MFA for any privileged users, and rotating credentials
regularly.

4. MFA is a good practice to implement for security purposes.


AWS COMMAND LINE INTERFACE (AWS CLI)

Three ways to use AWS

The AWS Management Console provides a rich graphical interface for a majority
of the products and services that AWS offers. Occasionally, new features might
not have all of their capabilities available through the console when the feature
initially launches.

The AWS CLI provides a suite of utilities that can be run from a command
program in Linux, macOS, or Microsoft Windows.

The software development kits (SDKs) are packages that AWS provides. The
SDKs provide access to AWS services by using popular programming
languages, such as Python, Ruby, .NET, or Java. The SDKs make it
straightforward to use AWS in existing applications. You can also use them
to create applications to deploy and monitor complex systems entirely
through code.

AWS CLI

• The AWS CLI is available for Linux, Microsoft Windows, and macOS.
• After installing the AWS CLI, you can use the aws configure command to specify default settings.

AWS CLI output


Install the AWS CLI

Download and extract the AWS CLI package

1. Use the curl command. The -o option specifies the file name that the downloaded package is written to.

The option in the example command will write the downloaded file to the current directory with the local
name awscliv2.zip.

2. Use the unzip command to extract the installer package. If unzip is not available, use an equivalent program.

The command extracts the package and creates a directory named aws under the current directory.

Run the install program and verify the installation

3. Next, run the install program. By default, the files are all installed to /usr/local/aws-cli, and a symbolic link is
created in /usr/local/bin.

The command includes sudo to give you write permissions to those directories.

4. Use the following command to confirm the installation.

This command displays the version of the AWS CLI and its software dependencies that you just installed.
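
Assuming a 64-bit Linux host, the full sequence of commands looks like the following (the download URL is the one that AWS publishes for Linux x86_64; check the documentation for your platform):

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"   # step 1: download
unzip awscliv2.zip                                                                  # step 2: extract to ./aws
sudo ./aws/install                                                                  # step 3: install
aws --version                                                                       # step 4: verify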

The expected result is a version string that shows the AWS CLI version together with its Python and operating
system dependencies (the exact version numbers depend on the release that you installed).

Using the AWS CLI

Command-line format

The command line format can be broken down into several parts:

1. Use the aws command to invoke the AWS CLI program.


2. Specify the top-level service to be called. For example, to invoke Amazon EC2, you enter the ec2 command
as shown in the example.
3. Specify the operation to be called. This operation will be performed on the service specified in step 2, such
as stop-instances or run-instances.
4. Specify the parameters. One or more parameters can be specified, depending on the operation that is
called. Parameters are arguments or details that are used to perform the operation. Some operations have
required parameters while other operations do not require parameters. Most operations also offer a few or
many optional parameters. In the first example, the stop-instances operation requires an instance-ids
parameter. (Otherwise, it would be unclear which instance or instances should be stopped.) Parameter names
are preceded by two dashes (--). You can also put all of the parameters for an operation in a JSON file and
specify the file name as a single parameter.
5. Finally, specify the options. For example, after calling aws ec2 stop-instances, use the --output option to
specify the format of the response.
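
Putting these parts together, a complete command might look like the following (the instance ID is a hypothetical placeholder):

aws ec2 stop-instances --instance-ids i-0123456789abcdef0 --output json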

AWS CLI help


You can use the help command to access the list of available AWS CLI commands and see examples of their
syntax.

Query option
Use the --query option to limit fields displayed in the result set.

Show only the first Amazon EC2 instance in the list:

Show the name of the state of the first instance:
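
For example, assuming the default JSON structure returned by describe-instances, the two queries could be written as follows:

# First EC2 instance in the list
aws ec2 describe-instances --query 'Reservations[0].Instances[0]'

# Name of the state of the first instance
aws ec2 describe-instances --query 'Reservations[0].Instances[0].State.Name'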


Filter option
The --filters option is used to restrict the result set; the filtering is performed on the server side.

Show only Microsoft Windows instances:
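
For example:

aws ec2 describe-instances --filters "Name=platform,Values=windows"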

Dry-run option
The --dry-run option:
• This option checks for required permissions without making a request.
• It also provides an error response if unauthorized.
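
For example (the instance ID is a hypothetical placeholder):

# Returns DryRunOperation if you have permission, or UnauthorizedOperation if you do not
aws ec2 stop-instances --instance-ids i-0123456789abcdef0 --dry-run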

Common AWS CLI commands

• Commands based on Amazon EC2:


• aws ec2 run-instances: This command launches the specified number of instances from an AMI.
• aws ec2 describe-instances: This command describes any instances that exist in the account.
• aws ec2 create-volume: This command creates an EBS volume that can be attached to an instance
in the same Availability Zone.
• aws ec2 create-vpc: This command creates a VPC with the Classless Inter-Domain Routing (CIDR)
block specified.
• Commands based on Amazon S3:
• aws s3 ls: This command lists Amazon S3 objects and common prefixes under a prefix, or all S3
buckets. Optionally, if you specify a specific bucket in the account, the ls command lists the
contents of that bucket.
• aws s3 cp: This command copies a file to, from, or between Amazon S3 locations. Use it to copy
local files to Amazon S3, files from Amazon S3 to a laptop, or Amazon S3 files to other Amazon S3
locations.
• aws s3 mv: This command moves a local file or an Amazon S3 object to another location locally or in
Amazon S3.
• aws s3 rm: This command deletes an Amazon S3 object.
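
For example (the bucket and file names are hypothetical placeholders):

aws s3 ls s3://example-bucket
aws s3 cp report.csv s3://example-bucket/reports/report.csv
aws s3 mv s3://example-bucket/reports/report.csv s3://example-bucket/archive/report.csv
aws s3 rm s3://example-bucket/archive/report.csv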
AWS re/start

Tooling &
Automation
AWS SYSTEMS MANAGER

Systems Manager is a collection of capabilities that help you manage your applications and infrastructure
running in the AWS Cloud.

Capabilities overview
Documents
A Systems Manager document defines the actions that Systems Manager performs on your managed
instances.

Automation
Safely automate common and repetitive IT operations and management tasks across AWS resources.

Run Command
The Systems Manager Run Command provides an automated way to run predefined commands against EC2
instances.

• Use predefined commands.


• Create your own.
• Choose instances or tags.
• Choose controls or schedules.
• Run a command immediately or on a specific
schedule.
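
A minimal sketch of running a predefined command against tagged instances (the tag key and value are assumptions for illustration):

# Hypothetical example: run a shell command on all instances tagged Environment=Dev
aws ssm send-command \
  --document-name "AWS-RunShellScript" \
  --targets "Key=tag:Environment,Values=Dev" \
  --parameters 'commands=["sudo yum update -y"]'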
Session Manager
Securely connect to instances without opening inbound ports, using bastion hosts, or maintaining SSH keys.

Patch Manager
Deploy operating system and software patches automatically across large groups of EC2 instances or on-
premises machines.

Maintenance Windows
Schedule windows of time to run administrative and maintenance tasks across your instances.
State Manager
Maintain consistent configuration of Amazon EC2 or on-premises instances.

Parameter Store
Parameter Store provides a centralized store to manage configuration data or secrets.
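
For example, you could store a secret as an encrypted parameter and read it back (the parameter name and value are hypothetical):

aws ssm put-parameter --name /example-app/db-password --type SecureString --value "example-value"
aws ssm get-parameter --name /example-app/db-password --with-decryption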

Inventory
The Inventory capability collects information about instances and the software that is installed on them.
ADMINISTRATION TOOLS

Software development kits (SDKs)


You can use SDKs to access AWS services programmatically and write administrative scripts in different
programming languages.

•.NET •C++ •Go •Java •JavaScript •Node.js •PHP •Python •Ruby

AWS CloudFormation
You can use CloudFormation to create, update, and delete a set of AWS resources as a single unit.

• You define the resources in a template, which can be written in JSON or YAML.
• CloudFormation provisions the resources defined in a template as a single unit called a stack.
• Key features of CloudFormation include the ability to do the following:
• Preview how proposed changes to a stack will impact the existing environment.
• Detect drift.
• Invoke an AWS Lambda function.

How CloudFormation works
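
A minimal sketch of the workflow, assuming a template that defines a single S3 bucket (the stack name and file name are hypothetical):

cat > example-stack.yaml <<'EOF'
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  ExampleBucket:
    Type: AWS::S3::Bucket
EOF

# Provision the resources defined in the template as a single stack
aws cloudformation create-stack --stack-name example-stack --template-body file://example-stack.yaml

# Delete every resource in the stack with a single action
aws cloudformation delete-stack --stack-name example-stack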

Benefits of CloudFormation
CloudFormation provides the following benefits: reusability, repeatability, and maintainability.

With CloudFormation, you can do the following:


• Deploy complex environments rapidly.
• Duplicate the same environment.
• Ensure configuration consistency.
• Delete resources in a single action (delete the stack).
• Propagate the same change to all stacks (update the stacks).
AWS OpsWorks
You can use OpsWorks to automate how servers are configured, deployed, and managed.

The following are features of OpsWorks:


•It automates configuration management.
•It is based on Chef and Puppet, which are popular open-source automation platforms.

It is available in three versions:
•AWS OpsWorks for Chef Automate
•AWS OpsWorks for Puppet Enterprise
•AWS OpsWorks Stacks
AWS re/start

Servers
HOSTING A STATIC WEBSITE ON AMAZON S3

Amazon S3 static website hosting feature

You can use Amazon S3 to host a static website.

• Amazon S3 stores the HTML, CSS, and JavaScript pages of the static website.
• Amazon S3 automatically assigns an endpoint URL that you can use to access the website.

The benefits of using the Amazon S3 website hosting feature include the following:

• You do not need to manage any infrastructure.


• The feature automatically scales to handle increasing traffic.
• The feature provides a low-cost option for hosting a static website.

Use cases

Amazon S3 static website hosting is best used for the following:

• Websites that do not contain server-side scripting


• Websites that change infrequently
• Websites that need to scale for occasional increases in traffic
• Customers who do not want to manage infrastructure

One limitation of Amazon S3 is that it can serve only HTTP requests to a website. If you need to support HTTPS,
you can use Amazon CloudFront to serve the static website hosted on Amazon S3.

How to host a static website on Amazon S3


Hosting a static website on Amazon S3 involves the following steps:
1. First, to store the content of the website, create a bucket in Amazon S3. In the diagram, the bucket name is
mybucket.
2. Then, to enable website hosting and grant public read permissions to the content of the bucket, configure
the S3 bucket.
3. Next, to upload the website content to the bucket, use the AWS Management Console or the AWS
Command Line Interface (AWS CLI).
4. You can now access the website at the endpoint URL that Amazon S3 assigns to it. The endpoint URL
includes the bucket name and the name of the Region that contains the bucket. In the diagram, the URL is
http://mybucket.s3-website-us-west-2.amazonaws.com.
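
A hedged sketch of these steps with the AWS CLI (the bucket name, Region, and local folder are placeholders; you must also grant public read access to the content, for example with a bucket policy):

aws s3 mb s3://mybucket --region us-west-2
aws s3 website s3://mybucket --index-document index.html --error-document error.html
aws s3 cp ./site/ s3://mybucket/ --recursive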

Amazon S3 static website hosting characteristics


In the endpoint URL, the separator character before <Region> is either a period (.) or a dash (-):

The type of separator depends on the Region that contains the bucket. For example, if the bucket is created in
the US West (Oregon) Region, the separator character is a dash. However, if the bucket is created in the
Europe (Frankfurt) Region, the separator character is a period.

Additional static website hosting characteristics

• The S3 bucket should store the website in a folder hierarchy that reflects the content structure of the
website.
• The S3 bucket must include an index document that you define during bucket configuration. The default
name is index.html.

Using an Amazon Route 53 custom domain name


You can use a custom domain name that you registered with Amazon Route 53 to access a static website
hosted on Amazon S3.
COMPUTING ON AWS

Amazon EC2 virtualization

EC2 instances run as virtual machines on host computers that are located in AWS Availability Zones. Each
virtual machine runs an operating system (OS), such as Amazon Linux or Microsoft Windows. You can install
and run applications on the OS in each virtual machine or even run enterprise applications that span multiple
virtual machines.

The virtual machines run on top of a hypervisor layer that AWS maintains. The hypervisor is the operating
platform layer that provides the EC2 instances with access to the actual hardware that the instances need to
run. This hardware includes processors, memory, and storage. Each EC2 instance receives a particular number
of virtual CPUs for processing and an amount of memory, or RAM.

Some EC2 instances use an instance store. The instance store is also known as ephemeral storage. It is storage
that is physically attached to the host computer and provides temporary block-level storage for use with an
instance. The data in an instance store persists only during the lifetime of the instance that uses it. If an
instance reboots, data in the instance store persists. If the instance stops or terminates, data in the instance
store is lost and cannot be recovered.

EC2 instance network connectivity


EC2 instances can connect to other resources over a network. For example, many EC2 instances use Amazon
Elastic Block Store (Amazon EBS) for the boot disk and other storage needs instead of using an instance store.
You attach an EBS volume to an instance through a network connection. Amazon EBS provides persistent block
storage volumes, which means that the data will be persisted. For example, the data still persists on the
instance even when the instance is in a stopped state.

Amazon EBS optimized instances minimize input/output (I/O) contention between Amazon EBS and other
traffic from your instance, which provides better performance. I/O contention occurs when virtual machines
compete for I/O resources because there is limited network bandwidth.

EC2 instances can also connect to the internet at large, other EC2 instances, and Amazon Simple Storage
Service (Amazon S3) object storage. You can configure the degree of network access to suit your needs and to
balance accessibility needs with security requirements. Different instance types provide different levels of
network performance.

Launch an EC2 instance

The steps are as follows:


1. You start with an Amazon Machine Image (AMI), which is the template that Amazon EC2 uses to launch an
instance. AWS provides some AMIs. Other AMIs come from third-party organizations and are available in the
AWS Marketplace. You can also create your own AMI from an existing EC2 instance.
2. After you choose the AMI, you select an instance type. Amazon EC2 provides a selection of instance types
that are optimized to fit different use cases. Instance types comprise varying combinations of CPU, memory,
storage, and networking capacity.
3. If you plan to connect to the instance using Secure Shell (SSH) or Remote Desktop Protocol (RDP), you must
specify a key pair. A key pair is a set of security credentials that you use to prove your identity when
connecting to an EC2 instance. A key pair consists of a public key and a private key.
4. When you launch an instance, you can specify network placement and addressing as appropriate to secure
and provide access to the instance. All instances are deployed within a network either in EC2-Classic or in a
virtual private cloud (VPC). You can also decide whether to assign a public IP address or a Domain Name
System (DNS) address to the instance.
5. You must also assign a new or existing security group to the instance. A security group is a set of firewall
rules that controls the traffic to and from your instance. The security group defines which ports network traffic
can use.
6. Next, you specify the storage options for the instance. The storage type that the instance’s OS will boot from
can be either ephemeral storage or an EBS volume. You can also attach additional block storage volumes to
the instance.
7. If you intend to run an application on the instance that makes API calls to AWS services, you must attach an
AWS Identity and Access Management (IAM) role to the instance. You use an instance profile to pass an IAM
role to an EC2 instance.
8. Finally, you can optionally specify user data when you launch an instance. This data provides a powerful way
to automate installations and configurations on the instance when it launches.
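
A hedged sketch that combines these choices into a single AWS CLI call (all of the IDs, names, and the user data file are hypothetical placeholders):

aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type t2.micro \
  --key-name example-key-pair \
  --subnet-id subnet-0123456789abcdef0 \
  --security-group-ids sg-0123456789abcdef0 \
  --iam-instance-profile Name=example-instance-profile \
  --user-data file://setup.sh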

AMI
Template that contains information used to create an EC2 instance

Components:
• Template for root volume: includes an OS, and perhaps an application server and other applications.
• Block device mapping: specifies the default EBS volumes and instance store volumes to attach to the
instance when it is launched.
• Launch permissions: control which AWS accounts can use the AMI.

Benefits:
• Repeatable
• Reusable
• Available from multiple sources

Instance types
Defines a combination of CPU, memory, storage, and networking capacity. Many instance types exist and give
you the flexibility to choose the appropriate mix of resources for your applications. Some are general purpose,
and others are designed to provide extra CPU (processing power), extra RAM (memory), or extra I/O network
performance. Instance types are grouped by categories and families. You should choose the most cost-
effective instance type that supports your workload’s requirements.
Key pairs
Amazon EC2 uses public key cryptography to encrypt and decrypt login information. Public key cryptography
uses a public key to encrypt a piece of data, such as a password. The recipient then uses a private key to
decrypt the data. The public and private keys are known as a key pair.

A key pair is necessary to log in to your instance. You need a key pair that is known and registered in the SSH
settings of the OS that you are connecting to. Typically, you specify the name of the key pair when you launch
the instance. You can create a new key pair and download it as part of the instance launch process.
Alternatively, when you launch the instance, you can specify a key pair that you already have access to. When
the instance is launched, AWS handles the process of configuring the instance to accept the key pair that you
specify. After the instance has booted and you want to connect to it, you can use the private key to connect to
the instance.

• You use a key pair to remotely connect to an instance in a secure manner.


• A key pair consists of the following:
• A public key that AWS stores
• A private key file that you store
• For a Windows AMI, use the private key to obtain the administrator password that you need to log in to your
instance.
• For a Linux AMI, use the private key to securely connect to your instance using Secure Shell (SSH).
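
For example, a minimal sketch of creating a key pair and using it to connect to a Linux instance (the key name and public IP address are hypothetical):

aws ec2 create-key-pair --key-name example-key --query 'KeyMaterial' --output text > example-key.pem
chmod 400 example-key.pem
ssh -i example-key.pem ec2-user@203.0.113.25   # hypothetical public IP of a Linux instance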

VPC
A VPC provides the networking environment for an EC2 instance.

When you launch an EC2 instance, you launch it into a network environment. Typically, you launch it into a
VPC that is created with Amazon Virtual Private Cloud (Amazon VPC). The VPC defines a virtual network in your
own logically isolated area within the AWS Cloud. You can then launch AWS resources, such as instances, into
the VPC. Your VPC closely resembles a traditional network that you might operate in your own data centre.

In the VPC, you define one or more subnets. Subnets are logical network segments within the VPC, and each
subnet exists within a single Availability Zone. Another part of the network configuration is an internet
gateway. An internet gateway is a horizontally scaled, redundant, and highly available VPC component that
handles the communication between the instances in your VPC and the internet.

A virtual private gateway is an optional component that supports virtual private network (VPN) connections.
The virtual private gateway sits on the Amazon side of the VPN connection. You create a virtual private
gateway and attach it to the VPC that you want to create the VPN connection from. The customer side of the
VPN connection has a customer gateway, which is a physical device or software application. Notice that the
diagram shows only one possible VPN solution.

Security groups are also in this network diagram. Each security group defines a set of firewall rules that allow
or block inbound and outbound traffic to or from an instance. Security groups act at the instance level, not the
subnet level. Therefore, each instance in a subnet in your VPC can be assigned to a different set of security
groups. If you do not specify a security group at launch time, the instance will be automatically assigned to the
default security group for the VPC.

Types of IP addresses

A private IP address is always assigned to each instance when it is launched. It is allocated to the instance from
the pool of private IP addresses that are available in the subnet. EC2 instances in the VPC can use private IP
addresses to communicate with each other.

A public IP address can be optionally assigned to an EC2 instance. It is generated dynamically from a pool of
available AWS public IP addresses. Clients can use the public IP address to connect to the instance from the
internet. If you stop an instance and then start it again, it receives a new public IP address. However, if you
reboot an instance, it retains the same public IP address.

An Elastic IP address is a publicly accessible IP address that is allocated from an AWS pool of public IP
addresses. An Elastic IP address can optionally be provisioned and then assigned to an EC2 instance. Elastic IP
addresses are similar to public IP addresses, except that an Elastic IP address is static. You can reassign an
Elastic IP address to another instance at any time.

The diagram shows an example of two EC2 instances in a


public subnet of a VPC. An internet client accesses
instance 1 using the instance’s public IP address of
54.183.34.127. The internet client accesses instance 2
using the instance’s Elastic IP address of 54.77.95.100.
Instance 2 accesses instance 1 using instance 1’s private IP
address of 172.31.22.16.

Security groups

Restrict access to an instance based on the following:
• Port range
• IP address range
• Resource ID (such as another security group)

Security groups also have the following characteristics:
• Can be associated with multiple instances
• Have both inbound and outbound rules
• Can be added or modified after you launch the instance

Each instance must have at least one security group that is associated with it. Security groups are essentially
stateful firewalls that surround one or more EC2 instances to give you control over network traffic. A stateful
firewall is a firewall that monitors the full state of active network connections. You can control Internet Control
Message Protocol (ICMP), Transmission Control Protocol (TCP), and User Datagram Protocol (UDP) network
traffic that can pass to the instance. It’s important to understand that security groups are applied to specific
instances rather than at the entry point to your network.

In addition to restricting which ports traffic can flow through, you can also restrict which IP addresses that
traffic can originate from. If you set the source IP address range as 0.0.0.0/0, traffic on that port will be
allowed from any source. However, you can also specify a specific IP address or a Classless Inter-Domain
Routing (CIDR) range. Alternatively, you can allow access only from sources within the AWS Cloud that have a
specific security group assigned to them. By default, when you create a new security group in a VPC, all
outbound traffic is open.

You can assign multiple security groups to a single instance. For example, you can create an administrative
security group, which would allow traffic on TCP port 22. You can also create a database server security group,
which would allow traffic on TCP port 3306. Then, you can assign both of those security groups to one
instance. You can apply a single security group to multiple instances.

Security group rule examples
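
As a hedged sketch of typical rules (the security group IDs and CIDR range are hypothetical placeholders):

# Allow SSH (TCP port 22) only from a specific CIDR range
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 22 --cidr 203.0.113.0/24

# Allow MySQL (TCP port 3306) only from instances in another security group
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 3306 --source-group sg-0fedcba9876543210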

Instance profile

• You use an instance profile to attach an AWS Identity and Access Management (IAM) role to an EC2 instance.
• The role supplies temporary permissions that applications running on the instance use to authenticate when
they make calls to AWS resources.

The following are benefits of an instance profile:


• You don’t have to store credentials (access key and secret key) locally on the instance, which is a security
risk.
• Credentials are temporary and rotated automatically.
• You can use a role for multiple instances (for example, instances in an Auto Scaling group).
Instance profile example
You can use an instance profile to grant an application access to an S3 bucket.
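
A minimal sketch of attaching an existing IAM role to an instance through an instance profile (the profile, role, and instance names are hypothetical):

aws iam create-instance-profile --instance-profile-name example-app-profile
aws iam add-role-to-instance-profile --instance-profile-name example-app-profile --role-name example-s3-read-role
aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 \
  --iam-instance-profile Name=example-app-profile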

User data

• You can pass user data to an instance to perform customization and configuration tasks when the instance
starts.
• The format of user data varies depending on the OS:
• A shell script or cloud-init directives on a Linux instance
• A batch script or a PowerShell script on a Windows instance
• By default, a user data script runs only the first time you launch an instance.
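
For example, a hypothetical user data script for an Amazon Linux instance might install a web server and publish a simple page at launch:

#!/bin/bash
# Hypothetical user data script: install a web server and publish a simple page
yum update -y
yum install -y httpd
systemctl enable --now httpd
echo "Hello from $(hostname -f)" > /var/www/html/index.html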

EC2 instance metadata

• Instance metadata is data about a running EC2 instance.


• Instance metadata is divided into categories, including the following:
• instance-id
• instance-type
• ami-id
• public-hostname
• To retrieve instance metadata from within the instance, use the following URL:
http://169.254.169.254/latest/meta-data/
• You can query instance metadata from a user data script to retrieve properties that you can use in the script.
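
For example, from within the instance you could retrieve the instance ID as follows (if IMDSv2 is enforced, a session token is required first):

# IMDSv1 style
curl http://169.254.169.254/latest/meta-data/instance-id

# IMDSv2 style: request a session token, then use it in the metadata request
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id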

Best practices for EC2 instances

Instance security

• Protect the default user account (ec2-user on Linux and Administrator on Windows) because it has
administrative permissions.
• Create additional accounts for new users to access the instance.
• Create a key pair or use an existing key pair for the new user.
• For a Linux instance, add new user accounts with SSH access to the instance, and do not use
password logins.
• For a Windows instance, use Active Directory or AWS Directory Service to tightly and centrally
control user and group access.
• Apply security patches regularly.
Remote connection to an instance
Use EC2 Instance Connect or Session Manager, a capability of AWS Systems Manager, to connect to your
instances without the need to manage SSH keys.
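
For example (the instance ID is a hypothetical placeholder; the Session Manager plugin for the AWS CLI must be installed):

# Start an interactive shell session on an instance without opening port 22
aws ssm start-session --target i-0123456789abcdef0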

Additional best practices

• Use the instance console screenshot capability to troubleshoot launch or remote connection problems.
• Turn on termination protection to protect an instance from accidental termination.
• Turn off source and destination check on a network address translation (NAT) instance.
MANAGING AWS EC2 INSTANCES

Lifecycle states of an EC2 instance

This diagram shows the lifecycle of an instance. The arrows show actions that you can take, and the boxes
show the state that the instance will enter after that action. An instance can be in one of the following states:

• Pending: When an instance is first launched from an Amazon Machine Image (AMI) or when you start a
stopped instance, it first enters the pending state. At this point, the instance is booted and deployed to a host
computer. The instance type that you specified at launch determines the hardware of the host computer for
your instance.
• Running: When the instance is fully booted and ready, it exits the pending state and enters the running
state. At this time, you can connect over the internet to your running instance.
• Rebooting: An instance temporarily goes into a rebooting state as a result of a reboot action. AWS
recommends that you reboot an instance by using the Amazon EC2 console, AWS Command Line Interface
(AWS CLI), or AWS SDKs instead of invoking a reboot from within the guest operating system (OS). A rebooted
instance stays on the same physical host and maintains the same public Domain Name System (DNS) name and
public IP address. If the instance has instance store volumes, they retain their data.
• Shutting-down: This state is an intermediary state between running and terminated. A terminate action on
the instance initiates this state.
• Terminated: An instance reaches the terminated state as the result of a terminate action. A terminated
instance remains visible in the Amazon EC2 console until the virtual machine is deleted. However, you can’t
connect to or recover a terminated instance.
• Stopping: Instances that are backed by Amazon Elastic Block Store (Amazon EBS) can be stopped or
hibernated. They enter the stopping state before they attain the fully stopped state.
• Stopped: A stopped instance is not billed for usage. While the instance is in the stopped state, you can
modify certain attributes of the instance (for example, the instance type). Starting a stopped instance puts it
back into the pending state, which typically moves the instance to a new host machine. You can also terminate
a stopped instance.
Instance hibernation

Hibernation stops an instance so that its memory and processes can be restored when you start it again:
• Saves the contents of the instance memory (RAM) to the EBS root volume
• Reloads the RAM contents and resumes previously running processes when the instance is started

When you restart the instance, the following actions occur:


• The EBS root volume is restored to its previous state.
• The RAM contents are reloaded.
• The processes that were previously running on the instance are resumed.
• Previously attached data volumes are reattached, and the instance retains its instance ID.

Hibernation is useful when you have an instance that you must quickly restart but that takes a long time to
warm up if you stop and start it.
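
A hedged sketch of using hibernation with the AWS CLI (the AMI, key, and instance IDs are placeholders; additional prerequisites apply, such as an encrypted EBS root volume that is large enough to hold the instance memory):

# Hibernation must be enabled when the instance is launched
aws ec2 run-instances --image-id ami-0123456789abcdef0 --instance-type m5.large \
  --hibernation-options Configured=true --key-name example-key

# Later, hibernate instead of performing a normal stop
aws ec2 stop-instances --instance-ids i-0123456789abcdef0 --hibernate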

Instance state characteristics


Instance design best practice
Design your instances for quick build-up and teardown because they may need to be relaunched often.

Many situations in the cloud require building a new server, including the following:
• Automatic scaling: You might have solutions that must be able to deploy new instances without human
intervention.
• Cost savings: You might decide that you do not need to keep an instance right now. However, perhaps you
do need the ability to recreate it on a short notice. Batch processing use cases typically are in this category.
• Downgrading: You might want to downgrade an instance to save on costs. For example, you can downgrade
an instance that runs on hardware that is dedicated to you, a single customer, to hardware that has shared
tenancy. Alternatively, you might want to downgrade the size of an instance (for example, from t2.xlarge to
t2.large).
• Repairing impaired instances: The underlying hardware supporting an EC2 instance can fail. Booting an
instance will place the new EC2 instance on healthy infrastructure.
• Upgrading: Upgrading the OS architecture or image type might require you to launch a new instance.

Modifying an EC2 instance

Resizing an instance
To change the size of an instance, do the following:
1. Stop the instance.
2. Modify the instance’s instance type.
3. Restart the instance.
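
For example, a hedged sketch of resizing an EBS-backed instance with the AWS CLI (the instance ID and target instance type are hypothetical):

aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --instance-type Value=t2.large
aws ec2 start-instances --instance-ids i-0123456789abcdef0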

Updating an instance
You are responsible for periodically updating the OS and security of your instances.

Tools that facilitate update tasks include the following:


• Linux tools: Yellowdog Updater, Modified (YUM)
• Windows tools: Windows Update
• AWS services: • AWS Systems Manager
• AWS OpsWorks

AMI deprecation

• Consider the deprecation date of an AMI to keep your instances up to date.


• The deprecation date of public AMIs is 2 years from the AMI creation date.

Deprecating an AMI results in the following:


• Instances that were launched using the AMI before the deprecation date are not affected.
• No new instances can be launched using the AMI in the Amazon EC2 console.
• You can continue to launch instances using the AMI by using the AWS CLI, Amazon EC2 API, or the AWS SDKs.
AWS ELASTIC BEANSTALK

Elastic Beanstalk is a platform as a service (PaaS) that facilitates the quick deployment, scaling, and
management of your applications.

As a managed service, it automatically handles the following:


• Infrastructure provisioning and configuration
• Application deployment
• Load balancing
• Automatic scaling
• Health monitoring

How Elastic Beanstalk works

To use Elastic Beanstalk, you upload your code and provide information about your application. Elastic
Beanstalk automatically launches an environment and creates and configures the AWS resources to run your
application. These resources include EC2 instances, HTTP servers, and application servers. Elastic Beanstalk
runs on the Amazon Linux AMI and the Windows Server AMI.
You can deploy your code through the AWS Management Console, the AWS Command Line Interface (AWS
CLI), or an integrated development environment (IDE) such as Visual Studio or Eclipse.
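
As a hedged sketch using the Elastic Beanstalk CLI (EB CLI), which is a separate tool from the AWS CLI; the application name, environment name, Region, and platform string are assumptions and vary by release:

eb init example-app --platform python-3.9 --region us-west-2   # create the application and choose a platform
eb create example-env                                          # launch the environment and provision resources
eb deploy                                                      # deploy the current application version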

Elastic Beanstalk features

• Elastic Beanstalk supports web applications written for common platforms, including Java, .NET, PHP,
Node.js, Python, Ruby, Go, and Docker.
• It gives you control over key runtime configuration options and resources, such as the following:
• EC2 instance type • Database • Amazon EC2 Auto Scaling options
• There is no charge to use the Elastic Beanstalk service itself. You pay for only the resources used by the
underlying services that store and run your applications.

Elastic Beanstalk benefits


AWS re/start

Scaling and Name Resolution
ELASTIC LOAD BALANCING

Scaling overview

• Scaling is the ability to increase or decrease compute capacity to meet fluctuating demand.
• Scale out when demand increases.
• Scale in when capacity needs decrease.
• You can scale manually or automatically (auto scaling).

Amazon EC2 Auto Scaling helps ensure that you have the correct number of instances available to handle the
load for your application. You can specify capacity settings, such as the minimum, maximum, and desired
number of instances, to help ensure that your solution meets demand while staying within your limits.

Benefits of Auto Scaling

Better fault tolerance: Amazon EC2 Auto Scaling can detect when an instance is unhealthy, terminate it, and
launch an instance to replace it. You can also configure Amazon EC2 Auto Scaling to use multiple Availability
Zones. If one Availability Zone becomes unavailable, Amazon EC2 Auto Scaling can launch instances in another
one to compensate.
Better availability: Amazon EC2 Auto Scaling helps ensure that your application always has the right amount of
capacity to handle the current traffic demand.
Better performance: When traffic increases, having more instances gives you the ability to distribute and
share the work to maintain a good response time.
Better cost management: Amazon EC2 Auto Scaling can dynamically increase and decrease capacity as
needed. Because you pay for the EC2 instances you use, you save money by launching instances when they are
needed and terminating them when they aren't.

Components for scaling

Route 53 is a highly available and scalable cloud Domain Name System (DNS) web service. It is designed to give
developers and businesses a reliable way to route users to internet applications. It translates names (such as
www.example.com) into the numeric IP addresses (such as 192.0.2.1) that computers use to connect to each
other. Route 53 entries are often configured to point to an ELB load balancer.
ELB automatically distributes incoming traffic across multiple targets, such as EC2 instances, containers, and IP
addresses. ELB load balancers are often configured to point to Amazon EC2 Auto Scaling groups.

Each Amazon EC2 Auto Scaling group contains a collection of EC2 instances. These instances share similar
characteristics and are treated as a logical grouping for the purposes of scaling and management. They help
you maintain application availability and give you the ability to dynamically scale capacity up or down
automatically according to conditions that you define. Any instances launched or terminated within the Auto
Scaling group are automatically registered with the load balancer.
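
A hedged sketch of creating such a group with the AWS CLI (the group name, launch template, subnets, and target group ARN are hypothetical placeholders):

aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name example-asg \
  --launch-template LaunchTemplateName=example-template \
  --min-size 2 --max-size 6 --desired-capacity 2 \
  --vpc-zone-identifier "subnet-0123456789abcdef0,subnet-0fedcba9876543210" \
  --target-group-arns arn:aws:elasticloadbalancing:us-west-2:111122223333:targetgroup/example-tg/abcdef0123456789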

ELB service

ELB does the following:


• Automatically distributes your incoming traffic across multiple targets, such as EC2 instances, containers, and
IP addresses, in one or more Availability Zones
• Monitors the health of its registered targets and routes traffic to only the healthy targets
• Automatically scales your load balancer capacity in response to changes in incoming traffic

Use cases

ELB load balancers include the following key features:


High availability (HA): ELB load balancers can distribute traffic across multiple targets—including EC2
instances, containers, and IP addresses—in a single Availability Zone or multiple Availability Zones.
Health checks: You can configure ELB load balancers to detect unhealthy targets, stop sending traffic to these
targets, and then spread the load across the remaining healthy targets.
Security: You can create and manage security groups that are associated with load balancers. You can also
create an internal (non-internet-facing) load balancer.
TLS termination: The load balancers include integrated certificate management and SSL decryption. Thus, you
can centrally manage the SSL settings of the load balancer and offload CPU-intensive work from your
applications.
Layer 4 or layer 7 load balancing: You can load balance HTTP and HTTPS applications for features that are
specific to layer 7. Recall that layer 7 is the application layer in the Open Systems Interconnection (OSI) model.
You can also choose to use only layer 4 load balancing for applications that rely only on TCP. Recall that layer 4
is the transport layer in the OSI model.
Operational monitoring: ELB load balancers can work with Amazon CloudWatch metrics and request tracing.
You can use these resources to monitor application performance in real time.
Elastic load balancers

Application Load Balancer


An Application Load Balancer functions at the
application layer, the seventh layer of the OSI model.

• Path-based and host-based routing


• Native IPv6 support
• Dynamic ports
• Additional supported request protocols
• Deletion protection and request tracking
• Enhanced metrics and access logs
• Targeted health checks

Gateway Load Balancer


You should use a Gateway Load Balancer when deploying inline virtual appliances where network traffic is not
destined for the Gateway Load Balancer itself. A Gateway Load Balancer transparently passes all layer 3 traffic
through third-party virtual appliances and is invisible to the source and destination of the traffic.

• Provides layer 3 gateway and layer 4 load balancing


• Passes all layer 3 traffic through third-party virtual
appliances
• Supports IP protocols
• Provides deletion protection and request tracking
• Provides enhanced metrics and access logs
• Provides targeted health checks
Network Load Balancer
A Network Load Balancer functions at the fourth layer of the OSI model. It can handle millions of requests per
second. After the load balancer receives a connection request, it selects a target from the target group for the
default rule. It attempts to open a TCP connection to the selected target on the port specified in the listener
configuration. Network Load Balancers are optimized to handle sudden and volatile traffic patterns while using
a single static IP address per Availability Zone.
Because it handles millions of requests per second while maintaining ultra-low latencies, the Network Load
Balancer works well for applications that require extreme performance.

• Sudden and volatile traffic patterns


• Single static IP address per Availability Zone
• Good option for applications that require
extreme performance
• Visibility into HTTP responses
• Visibility into the number of healthy and
unhealthy hosts
• Metric filtering based on Availability Zones or
load balancer
ELASTIC LOAD BALANCING (ELB) LOAD BALANCERS AND LISTENERS

A load balancer works as the single point of contact for clients and sits in front of your servers to direct traffic.
It distributes the incoming application traffic across multiple targets, such as Amazon Elastic Compute
Cloud (Amazon EC2) instances. The load balancer monitors the capacity and health status of targets, which can
be located in multiple Availability Zones, and routes traffic only to targets that are able to serve it.

Before you can use a chosen load balancer and benefit from its features, you must add listeners and register
your targets (or target groups).

ELB components

Load balancers can have more than one listener. This example shows two listeners:
• Each listener checks for connection requests from clients, by using the protocol and port that were
configured.
• The listener forwards requests to one or more target groups, based on the defined rules.

Rules are attached to each listener, and each rule specifies a target group, condition, and priority:
• When the condition is met, the traffic is forwarded to the target group.
• You must define a default rule for each listener, and you can add rules that specify different target groups
based on the content of the request.
• This configuration is also known as content-based routing.

Each target group routes requests to one or more registered targets, such as EC2 instances, by using the
protocol and port number that you specify:
• You can register a target with multiple target groups.
• You can configure health checks for each target group.

Health checks, which are shown as attached to each target group, are performed on all targets that are
registered to a target group that is specified in a listener rule for your load balancer.

Notice that each listener contains a default rule, and one listener contains another rule that routes requests to
a different target group. As the diagram implies, you can register a target with multiple target groups.
Listeners

• A listener is a process that defines the port and protocol that the load balancer listens on.
• Each load balancer needs at least one listener to accept traffic.
• Up to 50 listeners can be created on a load balancer.
• Routing rules are defined on listeners.

Target groups

• A target group contains registered targets, which are the resources that receive the traffic, such as the following:
- Amazon Elastic Compute Cloud (Amazon EC2) instances
- Amazon Elastic Container Service (Amazon ECS) container instances
• A single target can be registered with multiple target groups.

Create a load balancer

1. Create a load balancer by using the AWS CLI.


Use the create-load-balancer command to create a load balancer. You must specify two subnets that
are not from the same Availability Zone.
2. Create a target group for the load balancer.
Use the create-target-group command to create a target group. Specify the same virtual private
cloud (VPC) that you used for your EC2 instances.
3. Register EC2 instances to the target group.
Use the register-targets command to register your instances with your target group.
4. Create a listener for the load balancer.
Use the create-listener command to create a listener for your load balancer.
5. Verify the health of the registered targets.
Optionally, verify the health of the registered targets for your target group by using the
describe-target-health command.
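
The following is a minimal sketch of those five steps with the AWS CLI. The names, subnet IDs, instance IDs,
VPC ID, and ARNs shown (my-load-balancer, subnet-..., i-..., vpc-..., <target-group-arn>) are placeholder
values, not values from this course.

# 1. Create the load balancer in two subnets from different Availability Zones.
aws elbv2 create-load-balancer --name my-load-balancer \
  --subnets subnet-0abc1234 subnet-0def5678

# 2. Create a target group in the same VPC as the EC2 instances.
aws elbv2 create-target-group --name my-targets \
  --protocol HTTP --port 80 --vpc-id vpc-0123456789abcdef0

# 3. Register the EC2 instances with the target group.
aws elbv2 register-targets --target-group-arn <target-group-arn> \
  --targets Id=i-0aaa1111 Id=i-0bbb2222

# 4. Create a listener that forwards HTTP traffic on port 80 to the target group.
aws elbv2 create-listener --load-balancer-arn <load-balancer-arn> \
  --protocol HTTP --port 80 \
  --default-actions Type=forward,TargetGroupArn=<target-group-arn>

# 5. (Optional) Verify the health of the registered targets.
aws elbv2 describe-target-health --target-group-arn <target-group-arn>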
AMAZON EC2 AUTO SCALING

What is Amazon EC2 Auto Scaling?


Amazon EC2 Auto Scaling is a service that helps ensure application availability by automatically launching or
terminating EC2 instances based on scaling options that you define.

The following are the available scaling options: • Manual scaling


• Scheduled scaling
• Dynamic scaling
• Predictive scaling

Auto scaling concepts

Capacity: Capacity limits represent the minimum and maximum group size that you want for your auto scaling
group. The group's desired capacity represents the initial capacity of the auto scaling group at the time of
creation. For example, in the diagram, the auto scaling group has a minimum size of one instance, a desired
capacity of two instances, and a maximum size of four instances. The scaling policies that you define adjust the
number of instances within your minimum and maximum ranges.
Scaling in and out: An increase in CPU utilization outside the desired range could cause the auto scaling group
to scale out (adding two instances to the auto scaling group in the example shown). Then when the CPU
utilization decreases, the auto scaling group would scale in, potentially returning to the minimum desired
capacity by terminating instances.
Instance health: The health status of an auto scaling instance indicates whether it is healthy or unhealthy. This
notification can come from sources such as Amazon EC2, Elastic Load Balancing (ELB), or custom health checks.
When Amazon EC2 Auto Scaling detects an unhealthy instance, it terminates the instance and launches a new
one.
Termination policy: Amazon EC2 Auto Scaling uses termination policies to determine which instances it
terminates first during scale-in events.
Launch template: A launch template specifies instance configuration information. It includes the ID of the
Amazon Machine Image (AMI), the instance type, a key pair, security groups, and other parameters used to
launch EC2 instances. When auto scaling groups scale out, the new instances are launched according to the
configuration information specified in the latest version of the launch template.
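
As a hedged illustration of these concepts, the following AWS CLI sketch creates a launch template and an
auto scaling group with the minimum, desired, and maximum values from the example above (1, 2, and 4). All
resource names and IDs (web-template, the AMI ID, subnet IDs, security group ID, and target group ARN) are
placeholder assumptions.

# Create a launch template that defines the AMI, instance type, and security group to use.
aws ec2 create-launch-template --launch-template-name web-template \
  --launch-template-data '{"ImageId":"ami-0123456789abcdef0","InstanceType":"t3.micro","SecurityGroupIds":["sg-0abc1234"]}'

# Create the auto scaling group with min=1, desired=2, max=4, spread across two subnets
# and registered with a load balancer target group that performs ELB health checks.
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name web-asg \
  --launch-template LaunchTemplateName=web-template,Version='$Latest' \
  --min-size 1 --desired-capacity 2 --max-size 4 \
  --vpc-zone-identifier "subnet-0abc1234,subnet-0def5678" \
  --target-group-arns <target-group-arn> \
  --health-check-type ELB --health-check-grace-period 300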

Auto scaling policy

Scheduled scaling
By scaling based on a schedule, you can scale your application in response to predictable load changes.
Dynamic scaling
Dynamic scaling scales the capacity of your auto scaling group as traffic changes occur.

Types of dynamic scaling policies include the following:


Target tracking scaling: Increase or decrease the current capacity of the group based on a target value for a
specific metric.
Step scaling: Increase or decrease the current capacity of the group based on a set of scaling adjustments,
which is based on the size of the alarm breach.
Simple scaling: Increase or decrease the current capacity of the group based on a single scaling adjustment.
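
For example, a target tracking policy that keeps average CPU utilization near 50 percent could be attached
with a command such as the following (the group name web-asg is a placeholder):

# Attach a target tracking scaling policy: keep average CPU utilization at about 50%.
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name web-asg \
  --policy-name cpu50-target-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":50.0}'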

Predictive scaling
Predictive scaling uses machine learning models to predict your expected traffic (and Amazon EC2 usage),
including daily and weekly patterns. These predictions use data that is collected from your actual Amazon EC2
usage and data points that are drawn from your own observations. The model needs historical data from at
least 1 day to start making predictions. The model is re-evaluated every 24 hours to create a forecast for the
next 48 hours.

• Increase the capacity of your auto scaling group in advance of daily and weekly patterns in traffic flows.
• Forecast load.
• Schedule minimum capacity.

Use predictive scaling for the following:
• Applications that have periodic traffic spikes
• Automatically setting desired metric values in advance when used in conjunction with a target tracking
dynamic scaling policy

Instance health
Amazon EC2 Auto Scaling periodically checks the health status of all instances within the auto scaling group to
make sure that they're running and in good condition. If Amazon EC2 Auto Scaling detects that an instance is
no longer in the running state, it treats this as an immediate failure, marks the instance as unhealthy, and
replaces it.

Types of health checks: • Amazon EC2 status checks and scheduled events (default)
• Elastic Load Balancing (ELB) health checks
• Custom health checks

Example problems that can cause an instance status check to fail:


• Incorrect networking or startup configuration
• Exhausted memory
• Corrupted file system
• Incompatible kernel

Termination policy
A termination policy specifies the criteria that Amazon EC2 Auto Scaling uses to choose an instance for
termination when scaling in. There are various predefined termination policies, including a default policy.

The default termination policy is designed to help ensure that your instances span Availability Zones evenly for
high availability. When Amazon EC2 Auto Scaling terminates instances, it first determines which Availability
Zones have the most instances, and it finds at least one instance that is not protected from scale in.
Amazon EC2 Auto Scaling provides other predefined termination policies, including the following:
OldestInstance: This policy terminates the oldest instance in the group. This option is useful when you are
upgrading the instances in the auto scaling group to a new EC2 instance type. You can gradually replace
instances of the old type with instances of the new type.
NewestInstance: This policy terminates the newest instance in the group. This policy is useful when you’re
testing a new launch template but do not want to keep it in production.
OldestLaunchTemplate: This policy terminates instances that have the oldest launch template. This choice is
good when you are updating a group and phasing out the instances from a previous template configuration.
ClosestToNextInstanceHour: This policy terminates instances that are closest to the next billing hour. Using
this policy is a good way to maximize the use of your instances and manage your Amazon EC2 usage costs.

Lifecycle hooks
Lifecycle hooks provide an opportunity to perform a user action before the completion of a scale-in or scale-
out event.

Launch Templates
A launch template specifies instance configuration information (including the ID of the AMI, the instance type,
a key pair, security groups, and other parameters) and gives you the option to have multiple versions. The
Amazon EC2 Auto Scaling group then maintains the right number of EC2 instances defined by the launch
template depending on your needs. Together, both the launch template and the auto scaling group policies
determine what to launch within the auto scaling group and how to manage them.
Best practices

Metrics and instance type configurations

Create a steady-state group


With Amazon EC2 Auto Scaling health checks, you can create a steady-state group to help ensure that a single
instance is always running. For example, suppose you have a NAT server that you do not want to be a single
point of failure in a standard public-private subnet architecture. A steady-state group is useful in this scenario.

Steady-state group:
• You set an Amazon EC2 Auto Scaling group with the same min, max, and desired values.
• An instance is recreated automatically if it becomes unhealthy or if an Availability Zone fails.
• There is still potential downtime while an instance recycles.

Use case: Maintain a steady-state NAT server in each Availability Zone.

Avoid thrashing
Thrashing is the condition in which there is excessive use of a computer’s virtual memory, and the computer is
no longer able to service the resource needs of applications that run on it. When you configure automatic
scaling, make sure that you avoid thrashing. Thrashing could occur if instances are removed and added—or
added and removed—in succession too quickly.
AMAZON ROUTE 53

Route 53 is a scalable Domain Name System (DNS) web service.

With this service you can do the following: • Register or transfer a domain name.
• Resolve domain names to IP addresses.
• Connect to infrastructure.
• Distribute traffic across Regions.
• Support high availability and lower latency.

Using Route 53 with ELB


Associating a DNS name with ELB

• By default, AWS assigns a hostname to your load balancer that resolves to a set of IP addresses.
• Assign your own hostname by using an alias resource record set.
• Create a Canonical Name Record (CNAME) that points to your load balancer.
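
As an illustrative sketch, an alias record that points www.example.com at a load balancer could be created
with a change batch such as the one below. The hosted zone ID, load balancer DNS name, and AliasTarget
hosted zone ID are placeholders; the AliasTarget HostedZoneId must be the canonical hosted zone ID of the
load balancer, not of your domain.

aws route53 change-resource-record-sets \
  --hosted-zone-id Z111111QQQQQQQ \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "www.example.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "my-load-balancer-1234567890.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": true
        }
      }
    }]
  }'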

Routing policies

1. Simple routing policy: Use for a single resource that performs a given function for your domain—for
example, a web server that serves content for the example.com website.
2. Weighted routing policy: Use to route traffic to multiple resources in proportions that you specify.
3. Latency routing policy: Use when you have resources in multiple AWS Regions and you want to route traffic
to the Region that provides the lowest latency.
4. Failover routing policy: Use when you want to configure active-passive failover.
5. Geolocation routing policy: Use when you want to route traffic based on the location of your users.
6. Geoproximity routing policy: Use to route traffic based on the location of your resources and, optionally,
shift traffic from resources in one location to resources in another location.
7. Multivalue answer routing policy: Use when you want Route 53 to respond to DNS queries with up to eight
healthy records that are selected at random.
8. IP-based routing policy: Use when you want to route traffic based on the location of your users and have
the IP addresses that the traffic originates from.

Latency-based routing (LBR)


LBR gives you the ability to use the DNS to route user requests to the AWS Region that will give your users the
fastest response.
Route 53 use case

A blue/green deployment is a deployment that reduces the risk of the site or application becoming unavailable
because you run two matching production environments. One environment is referred to as the blue
environment, and the other environment is referred to as the green environment.

The diagram shows an example of a blue/green deployment. Notice the two parallel environments, each with
its own ELB load balancer and Amazon EC2 Auto Scaling configuration. The Route 53 weighted routing feature
is then used to begin shifting users over from the existing (blue) environment to the new (green) environment.
This process might be done to migrate users to the new or upgraded green environment.

You can use services such as Amazon CloudWatch and Amazon CloudWatch Logs to monitor the green
environment. If problems are found anywhere in the new environment, Route 53 weighted routing can be
deployed to shift users back to the running blue servers.

When the new green environment is fully up and running without issues, the blue environment can gradually
be shut down. Because of the potential latency of DNS records, a full shutdown of the blue environment can
take anywhere from a day to a week.
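
A hedged sketch of the weighted shift described above: two weighted alias A records for the same name, one
per environment, can start at a 90/10 split and be adjusted as confidence in the green environment grows. The
zone IDs, record name, and load balancer names are placeholders.

aws route53 change-resource-record-sets \
  --hosted-zone-id Z111111QQQQQQQ \
  --change-batch '{
    "Changes": [
      {"Action": "UPSERT", "ResourceRecordSet": {
        "Name": "www.example.com", "Type": "A", "SetIdentifier": "blue", "Weight": 90,
        "AliasTarget": {"HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "blue-alb-1234567890.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": true}}},
      {"Action": "UPSERT", "ResourceRecordSet": {
        "Name": "www.example.com", "Type": "A", "SetIdentifier": "green", "Weight": 10,
        "AliasTarget": {"HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "green-alb-0987654321.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": true}}}
    ]
  }'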
AMAZON CLOUDFRONT

CloudFront is a web service that speeds up the distribution of static and dynamic web content
(such as .html, .css, .js, and image files) to users.
CloudFront delivers content through a worldwide network of data centers that are called edge
locations.

Edge locations and Regional edge caches

To deliver content to end users with lower latency, Amazon CloudFront uses a global network of more than
450 edge locations and 13 Regional edge caches in more than 90 cities across 48 countries.

CloudFront edge locations (also known as points of presence, or POPs) make sure that popular content can be
served quickly to viewers. CloudFront also has Regional edge caches that bring more content closer to viewers,
even when the content is not popular enough to stay at an edge location, to help improve performance for
that content.

Key features

Security
• Protects against network and application layer attacks
• Delivers content, APIs, or applications over HTTPS using the latest TLS version (TLSv1.3) to encrypt
and secure communication between viewer clients and CloudFront
• Supports multiple methods of access control
• Is compliant with major industry standards, including PCI-DSS, HIPAA, and ISO/IEC, to help ensure
secure delivery for sensitive data

Availability
• Automatically serves content from a backup origin when the primary origin is unavailable by using
its native origin failover capability.

Edge computing
• Offers programmable and secure edge CDN computing capabilities through CloudFront Functions
and Lambda@Edge

Real-time metrics and logging


• Is integrated with Amazon CloudWatch and automatically publishes six operational metrics per
distribution, which are displayed in a set of graphs in the CloudFront console

Continuous deployment
• Gives you the ability to deploy two separate but identical environments—called a blue/green
deployment—and support integration with the ability to roll out releases gradually without any
Domain Name System (DNS) changes

Cost-effectiveness
• Offers personalized pricing options, including pay-as-you-go, the CloudFront security savings bundle,
and custom pricing. With CloudFront, there are no upfront payments or fixed platform fees, no long-
term commitments, no premiums for dynamic content, and no requirements for professional
services to get started.
• Offers free data transfer between AWS Cloud services and CloudFront.
Companies are able to accomplish the following through CloudFront

How CloudFront delivers content to users

The diagram on the slide demonstrates what happens when users request objects after CloudFront has been
configured to deliver your content. Here is a description of each step:
1. A user accesses your website or application and sends a request for an object, such as an image file or an
.html file.
2. DNS routes the request to the CloudFront POP (edge location) that can best serve the request—typically the
nearest CloudFront POP in terms of latency—and routes the request to that edge location.
3. CloudFront checks its cache for the requested object. If the object is in the cache, CloudFront returns it to
the user. If the object is not in the cache, CloudFront does the following:
A. CloudFront compares the request with the specifications in your distribution and forwards the
request to your origin server for the corresponding object (for example, to your S3 bucket or your
HTTP server).
B. The origin server sends the object back to the edge location.
C. As soon as the first byte arrives from the origin, CloudFront begins to forward the object to the
user. CloudFront also adds the object to the cache for the next time someone requests it.
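
To start serving content this way, a minimal distribution in front of an existing S3 bucket might be created as
follows. The bucket name and root object are placeholders, and the --origin-domain-name shortcut uses
default cache behavior settings; if that shorthand is unavailable in your CLI version, a full
--distribution-config JSON document can be supplied instead.

aws cloudfront create-distribution \
  --origin-domain-name my-bucket.s3.amazonaws.com \
  --default-root-object index.html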

Cost estimation

Traffic distribution: Pricing varies across geographic Regions based on the edge location.
Requests: Number and type of requests
Geographic Region
Data transfer out: The amount of data transferred out of CloudFront edge locations.
AWS re/start

Serverless and
Containers
AWS LAMBDA

What is serverless computing?


With serverless computing, you can build and run applications and services without provisioning or managing
servers.
Amazon Web Services (AWS) offers many compute options. One option is Amazon Elastic Compute Cloud
(Amazon EC2), which provides virtual machines. A second option is a container solution, such as Amazon
Elastic Container Service (Amazon ECS). However, a third approach to compute is available and does not
require you to provision or manage servers. This third approach is referred to as serverless computing. With
serverless computing, you can build and run applications and services without thinking about servers.
Serverless applications do not require you to provision, scale, or manage any servers.

What is AWS Lambda?

• Is a fully managed service for serverless compute


• Provides event-driven invocation
• Offers subsecond metering
• Limits the runtime of a function to a maximum of 15 minutes
• Supports multiple programming languages
• The maximum memory allocation for a single Lambda function is 10 GB

With Lambda, you run your code only when needed, and the service scales automatically to thousands of
requests per second.

AWS Lambda usage steps

With Lambda, you can run code without provisioning or managing servers. The steps to use Lambda are as
follows:
1. Upload your code to Lambda, and Lambda takes care of everything that is required to run and scale your
code with high availability.
2. Set up your code to invoke from other AWS services, or invoke your code directly from any web or mobile
application, or HTTP endpoint.
3. AWS Lambda runs your code only when invoked. You pay only for the compute time that you consume. You
pay nothing when your code is not running.
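
A minimal sketch of steps 1 and 3 with the AWS CLI follows. The function name, IAM role ARN, and handler file
are placeholder assumptions; the zip file is expected to contain a lambda_function.py that defines a
lambda_handler function.

# 1. Package and upload the code (lambda_function.py must define lambda_handler).
zip function.zip lambda_function.py
aws lambda create-function --function-name hello-function \
  --runtime python3.12 \
  --handler lambda_function.lambda_handler \
  --role arn:aws:iam::111122223333:role/lambda-basic-execution \
  --zip-file fileb://function.zip

# 3. Invoke the function directly and read the response.
aws lambda invoke --function-name hello-function \
  --cli-binary-format raw-in-base64-out \
  --payload '{"name": "re/Start"}' response.json
cat response.json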
AWS Lambda use case

Other Lambda use cases include the following:


• Automated backups
• Processing objects that are uploaded to Amazon S3
• Event-driven analysis of logs
• Event-driven transformations of data
• Internet of Things (IoT)
• Operating serverless websites

Starting and stopping Amazon EC2 instances

AWS Lambda layers

By using layers, developers can do the following:


• Configure a Lambda function to use libraries that are not included in the deployment package.
• Keep the deployment package small.
• Avoid errors in code for package dependencies.
• Share libraries with other developers.
APIS AND REST

• An API provides programmatic access to an application and is often used by programs that communicate
with each other.
- The client application sends a request to the server application by using the server application’s API.
- The server application returns a response to the client application.

• Benefits of an API include the following:


- Provides access to an application’s functions without using a graphical user interface (GUI)
- Provides a consistent way to invoke an application’s functions

APIs make computer software accessible to developers through code so that developers can build software
programs that interact with other software programs. APIs work as an intermediary so that applications are
able to communicate with each other.

RESTful APIs

• A RESTful API is an interface that two computer systems use to exchange information securely over the
internet
- Is designed for loosely coupled network-based applications
- Communicates over HTTP
- Exposes resources at specific URIs

•Many web APIs are RESTful.

RESTful API design principles

Uniform Interface: A request should be made to a single endpoint or URI when it interacts with each distinct
resource that is part of the service.
Stateless: The server does not track which requests the connecting client has made over time. It also does not
keep track of which step the client might have completed in terms of a series of actions. Instead, any session
information about the client is known only to the client itself.
Cacheable: REST clients should be able to cache the responses that they receive from the REST server.
Layered System: RESTful services support layered systems, where the client might connect to an intermediate
server. The REST server can be distributed, which supports load balancing.
Code on Demand: The server could pass code (that can be run) to the client, such as some JavaScript. This
feature extends the functionality of the REST client.
RESTful components

Client: Clients are users who want to access information from the web. The client can be a person or a
software system that uses the API. For example, developers can write programs that access weather data from
a weather system. You can also access the same data from your browser when you visit the weather website
directly.
Resource: Resources are the information that different applications provide to their clients. Resources can be
images, videos, text, numbers, or any type of data. The machine that gives the resource to the client is also
called the server. Organizations use APIs to share resources and provide web services while maintaining
security, control, and authentication. In addition, APIs help them to determine which clients get access to
specific internal resources.
Request: Requests are sent by the client to a server. The request is formatted so that the server will
understand.
Response: The response is what the server sends back to reply to the request from the client. This information
includes a status message of success or failure, a message body containing the resource representation, and
metadata about the response.

REST request format

• Endpoint (as a URL)


• Method
- GET: To read a resource
- POST: To create a resource
- PUT: To update an existing resource
- DELETE: To delete a resource
• Header
• Body (Data)
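
For example, a client could issue requests such as the following against a hypothetical
https://api.example.com endpoint (the URL, resource path, token, and body are illustrative only):

# GET: read the resource with ID 42.
curl -X GET "https://api.example.com/v1/orders/42" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer <token>"

# POST: create a new resource; the body carries the data as JSON.
curl -X POST "https://api.example.com/v1/orders" \
  -H "Content-Type: application/json" \
  -d '{"item": "book", "quantity": 1}'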

HTTP status codes


An HTTP status code is included in REST API responses. It is helpful to know what the codes mean because the
code indicates whether the request was received successfully or if there were errors. Understanding what
status codes indicate—especially the error codes—can be helpful when you troubleshoot REST API requests.
AMAZON API GATEWAY

API Gateway is a service that developers can use to create and maintain APIs. For example, you can create a
REST API for an application that you run on AWS.

API Gateway is a fully managed service that handles the following:


• Scaling
• Access control
• Monitoring

Amazon API Gateway handles all the tasks that are involved in accepting and processing concurrent API calls at
scale. These tasks include traffic management, authorization and access control, monitoring, and API version
management. You pay for only the API calls that you receive and the amount of data that is transferred out.

API Gateway benefits

Efficient API development: Run multiple versions of the same API simultaneously with API Gateway, which
gives you the ability to quickly iterate, test, and release new versions.
Performance at any scale: Provide end users with the lowest possible latency for API requests and responses
by taking advantage of the AWS global network of edge locations using Amazon CloudFront.
Cost savings at scale: Decrease your costs as your API usage increases per Region across your AWS accounts
using the tiered pricing model for API requests.
Monitoring: Monitor performance metrics and information on API calls, data latency, and error rates from the
API Gateway dashboard.
Flexible security controls: Authorize access to your APIs with AWS Identity and Access Management (IAM) and
Amazon Cognito.
RESTful API options: HTTP APIs are the best way to build APIs for a majority of use cases because they can be
significantly cheaper than REST APIs.
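
As a hedged sketch, an HTTP API that proxies all requests to an existing Lambda function can be created with
the quick-create option below. The API name and Lambda ARN are placeholders.

# Quick-create an HTTP API whose default route invokes a Lambda function.
# (The Lambda function also needs a resource-based policy that allows API Gateway to invoke it.)
aws apigatewayv2 create-api \
  --name my-http-api \
  --protocol-type HTTP \
  --target arn:aws:lambda:us-east-1:111122223333:function:hello-function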

API Gateway architecture

After your API is deployed, API Gateway provides you with a dashboard to visually monitor calls to a service.
The API interfaces that you develop have a frontend and a backend. The client uses the frontend applications
to make requests. The parts of the API implementation that communicate with the other AWS services are
referred to as the backend.

In API Gateway, the frontend is encapsulated by method requests and method responses. The backend is
encapsulated by requests and responses that work with the other AWS services. These AWS services provide
the functionality that the API exposes, and they take action accordingly.

How API Gateway is used

The diagram illustrates how APIs can be built for various applications. For example, these applications include
web and mobile applications, Internet of Things (IoT) devices, and other applications that use API Gateway. In
API Gateway, you can create, publish, maintain, and monitor APIs. These APIs can integrate with other AWS
serverless applications.

API Gateway example - Serverless web application

The application uses Amazon Simple Storage Service (Amazon S3) to host its presentation code and Amazon
Cognito for authentication and authorization. The application also stores its data in a DynamoDB database.
The application's user interface invokes the RESTful API exposed by API Gateway. This API forwards the user's
request to a Lambda function, which performs the application's functions, accesses the database, and returns
a response.

1. Amazon S3 hosts static web resources—including HTML, CSS, JavaScript, and image files—that are loaded in
the user's browser.
2. Amazon Cognito provides user management and authentication functions to secure the backend API.
3. The browser runs JavaScript that sends and receives data by communicating with API Gateway through REST
web services. The data that is sent through API Gateway uses the backend API that was built with Lambda.
4. DynamoDB provides the persistence layer in this example. The Lambda function that the API uses can store
data in the DynamoDB database.
AWS STEP FUNCTIONS

Step Functions is a serverless orchestration service. You can use it to combine Lambda functions and other
AWS services to build business-critical applications. With Step Functions, you can quickly create distributed
applications that leverage AWS services in addition to your own microservices.

Orchestration centrally manages a workflow by breaking it into multiple steps, adding flow logic, and tracking
the inputs and outputs between the steps.

You can use AWS Step Functions to coordinate AWS services into serverless workflows. Workflows consist of a
series of steps. The output of one step is the input to the next step.
As your applications run, Step Functions maintains the application state, tracking exactly which workflow step
your application is in, and stores an event log of data that is passed between application components. That
means if the workflow is interrupted for any reason, your application can pick up right where it left off.

Using Step Functions


Step Functions gives you the ability to reuse components and use different services in your application.
Step Functions also does the following:
• Coordinates existing AWS Lambda functions and microservices into applications
• Keeps application logic separated from implementation

Core concepts
Step Functions is based on workflows (or state machines) and tasks.

The workflows that you build with Step Functions are called state machines, and each step of your workflow is
called a state. Tasks perform work, either by coordinating another AWS service or an application that you can
host basically anywhere. You can reuse components, edit the sequence of steps, or swap out the code called by
task states as your needs change.

Benefits
Features

Step Functions is a managed serverless service. Its main features include the following:
• Automatic scaling
• High availability
• Pay per use
• Security and compliance

Use cases
Step Functions is useful for creating end-to-end workflows to manage jobs with dependent components and
for dividing business processes into a series of steps.

Step Functions example

You can use Step Functions to implement a business process as a series of steps that make up a workflow. The
individual steps in the workflow can invoke a Lambda function that has some business logic. This slide shows
an example.

In this example of a banking system, a new bank account is created after validating a customer’s name and
address by using the account-processing-workflow AWS Step Functions workflow. The workflow begins with
two Lambda functions—CheckName and CheckAddress—running in parallel as task states. Once both are
complete, the workflow initiates the OpenNewAccount Lambda function. You can define retry and catch
clauses to handle errors from task states. You can use predefined system errors or handle custom errors
thrown by these Lambda functions in your workflow. Because your workflow code takes on error handling, the
Lambda functions can focus on the business logic and have less code.
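
A condensed sketch of that workflow in Amazon States Language, and the CLI call to create it, is shown below.
The Lambda ARNs and role ARN are placeholders, and the retry and catch clauses described above are omitted
for brevity.

# account-processing-workflow.json (Amazon States Language definition)
{
  "StartAt": "CheckApplicant",
  "States": {
    "CheckApplicant": {
      "Type": "Parallel",
      "Branches": [
        {"StartAt": "CheckName", "States": {"CheckName": {"Type": "Task",
          "Resource": "arn:aws:lambda:us-east-1:111122223333:function:CheckName", "End": true}}},
        {"StartAt": "CheckAddress", "States": {"CheckAddress": {"Type": "Task",
          "Resource": "arn:aws:lambda:us-east-1:111122223333:function:CheckAddress", "End": true}}}
      ],
      "Next": "OpenNewAccount"
    },
    "OpenNewAccount": {"Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:111122223333:function:OpenNewAccount", "End": true}
  }
}

# Create the state machine from the definition file.
aws stepfunctions create-state-machine \
  --name account-processing-workflow \
  --definition file://account-processing-workflow.json \
  --role-arn arn:aws:iam::111122223333:role/StepFunctionsExecutionRole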
CONTAINERS ON AWS

A container is an application and its dependencies, which can be run in resource-isolated processes. Containers
provide a common interface for migrating applications between environments. Containers are isolated but
share an OS and, where appropriate, bins/libraries.
Containers can run on any Linux system with appropriate kernel-feature support and the Docker daemon
present. This ability makes containers portable. Your laptop, your VM, your Amazon Elastic Compute Cloud
(Amazon EC2) instance, and your bare metal server are all potential hosts. The lack of a hypervisor
requirement also results in almost no noticeable performance overhead. The processes are communicating
directly to the kernel and are largely unaware of their container silo. Most containers boot in only a couple of
seconds.

Benefits of containers

Environmental consistency: the application’s code, configurations, and dependencies are packaged into a
single object.
Process isolation: they have no shared dependencies or incompatibilities because each container is isolated
from the other. Process isolation provides operational efficiency.
Operational efficiency: you can run multiple applications on the same instance.
Developer productivity: increase developer productivity by removing cross-service dependencies and
conflicts.
Version control: you can track versions of your application code and their dependencies. Docker container
images have a manifest file (Dockerfile).

Docker

• Docker is an application platform used to create, manage, and run containers.


• With Docker, developers and engineers can build, test, deploy, and run containers.

Benefits of Docker

Microservices architecture: Public documentation on Docker recommends that you run one service per
container.
Stateless: Container images consist of read-only layers. This means that after the container image has been
created, it does not change.
Portable: Your application is independent from the configurations of low-level resources, such as networking,
storage, and OS details. This feature provides portability. For example, if your application runs in a Docker
container, it will run anywhere.
Single, immutable artefact: Docker also assists with packaging your applications and dependencies in a single,
immutable artefact.
Reliable deployments: When a developer finishes writing and testing code, they can wrap it in a container and
publish it directly to the cloud, and it will instantly work because the environment is the same.

Components of Docker
AWS CONTAINER SERVICES

Registry: Amazon Elastic Container Registry (Amazon ECR), where you can store your container images.
Management: the deployment, scheduling, and scaling of containerized applications.
AWS services are Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Kubernetes Service
(Amazon EKS). Amazon ECS provisions new application container instances and compute resources. Use
Amazon EKS to deploy, manage, and scale containerized applications using Kubernetes on AWS.
Hosting: is where the containers run. You can currently run your containers on Amazon ECS using the Amazon
EC2 launch type (where you get to manage the underlying instances on which your containers run), or you can
choose to run your containers in a serverless manner with the AWS Fargate launch type.

Amazon ECR
A fully managed Docker container registry that developers can use to store, manage, and deploy Docker
container images.
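
A minimal, hedged sketch of publishing a local Docker image to Amazon ECR follows. The account ID, Region,
repository name, and image name are placeholders.

# Create a repository and authenticate Docker to the registry.
aws ecr create-repository --repository-name my-app
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 111122223333.dkr.ecr.us-east-1.amazonaws.com

# Build the image from a local Dockerfile, tag it for ECR, and push it.
docker build -t my-app .
docker tag my-app:latest 111122223333.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
docker push 111122223333.dkr.ecr.us-east-1.amazonaws.com/my-app:latest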

Amazon ECS
A highly scalable, high-performance container management service that supports Docker containers.
With Amazon ECS, you can run applications on a managed cluster of EC2 instances. It provides flexible
scheduling. Amazon ECS uses a built-in scheduler or it uses a third-party scheduler, such as Apache Mesos. You
can also perform task, service, or daemon scheduling.

Amazon EKS
A managed service that you can use to run Kubernetes on AWS without needing to install and operate your
own Kubernetes clusters.
With Amazon EKS, AWS manages upgrades and high availability services for you. Amazon EKS runs three
Kubernetes managers across three Availability Zones to provide high availability. Amazon EKS automatically
detects and replaces unhealthy managers and provides automated version upgrades and patching for the
managers.

AWS Fargate
A compute engine for Amazon ECS that you can use to run containers without needing to manage servers or
clusters.
Deploying to AWS

Deploying your managed container solutions on AWS involves selecting an orchestration tool and a launch
type.

Amazon ECS is a fully managed container orchestration service that provides the most secure, reliable, and
scalable way to run containerized applications.

Amazon EKS is a fully managed Kubernetes service that provides the most secure, reliable, and scalable way to
run containerized applications using Kubernetes.
AWS re/start

AWS Database
Services
INTRODUCTION TO DATABASES ON AWS

Advantages of databases on AWS

Choosing a database service


You can choose a database based on purpose or application needs. Database workload factors to consider
include the following: • Data structure
• Data size
• Computation requirements
• Cost
• Access patterns
• Performance

Types of AWS data storage services

SQL and NoSQL database comparison


Unmanaged versus managed services

Managed versus unmanaged responsibilities


You are responsible for the following tasks based on database type.

AWS database use cases


AWS database recommendations
AMAZON REDSHIFT

Data warehouses
A data warehouse is a central repository of information that can be analysed to make more informed
decisions. A data warehouse can contain multiple databases.

Benefits of a data warehouse

With a data warehouse, users can do the following:


• Make informed decisions.
• Consolidate data from many sources.
• Analyse historical data.
• Confirm data quality, consistency, and accuracy.
• Separate analytics processing from transactional databases, improving the performance of both systems.

Data warehouse architecture

A data warehouse architecture consists of tiers.

• The top tier is the frontend client that presents results through
reporting, analysis, and data mining tools.

• The middle tier consists of the analytics engine that is used to access
and analyse the data.

• The bottom tier of the architecture is the database server, where


data is loaded and stored.

Data warehouse use case


Typically, businesses use a combination of a database, a data lake, and a data warehouse to store and analyse
data.
Amazon Redshift overview

Amazon Redshift is a fully managed data warehouse service in the cloud and is scalable with virtually no
downtime.
• You can use it to run complex analytic queries against petabytes of structured data.
• It uses sophisticated query optimization, columnar storage on high-performance local disks, and parallel
query execution.
• Amazon Redshift monitors clusters automatically and nearly continuously and has encryption built in.

Amazon Redshift features

Amazon Redshift use cases

Enterprise data warehouse (EDW)


• Migrate at a pace that customers are comfortable with.
• Experiment without large upfront cost or commitment.
• Respond faster to business needs.

Big data
• Incur a low price point for small customers.
• Ease deployment and maintenance via managed service.
• Focus more on data and less on database management.

Software as a service (SaaS)


• Scale the data warehouse capacity as demand grows.
• Add analytics functionality to applications.
• Reduce hardware and software costs by an order of magnitude.
AMAZON AURORA

• Aurora is a highly available, resilient, and cost-effective managed relational database. Amazon Relational
Database Service (Amazon RDS) fully manages Aurora.
• The Aurora database engine is fully compatible with existing MySQL and PostgreSQL open-source databases
and regularly adds compatibility for new releases.
• Aurora can provide up to five times the throughput of standard MySQL and up to three times the throughput
of standard PostgreSQL that runs on the same hardware.
• It combines the performance and availability of high-end commercial databases with the simplicity and cost-
effectiveness of open source databases.

Key features and benefits

The following are why customers use Aurora: • Compatible


• Pay-as-you-go service
• Managed service
• Fault tolerant
• Resilient design

Aurora key components

Aurora consists of the following components:


• Instance - Primary instance
- Aurora replica instance
• Database (DB) cluster
• Cluster volume

Aurora DB cluster architecture

Use cases
AWS DATABASE MIGRATION SERVICE (AWS DMS)

AWS DMS is a service that migrates databases to Amazon Web Services (AWS) quickly and securely.

Features of AWS DMS

• Migrate data to and from most databases.


• Keep the source database operational during migration.
• Keep applications live or running during the migration.
• Perform basic schema conversion during migration.
• Replicate data near-continuously.
• Consolidate databases.
• Deploy to multiple Availability Zones for high availability and failover support.

Near-continuous data replication

Database consolidation
Database migration process

Components of AWS DMS

The AWS DMS architecture consists of the following components: • Replication instance
• Task
• Source
• Target

AWS DMS example


Types of database migrations

Homogeneous database migrations


In homogeneous database migrations, the source and target database engines are the same or are compatible.

Heterogeneous database migrations


Heterogeneous database migrations involve two steps: converting the schema by using the AWS Schema
Conversion Tool (AWS SCT) and migrating the data by using AWS DMS.
AWS SCT

• The AWS SCT converts your existing database schema and code objects from one database engine to
another.
• It is used for heterogeneous migrations.
• You can convert a relational schema or a data warehouse schema.
• The database objects that the AWS SCT converts include the source database schema, views, stored
procedures, and functions.
• The AWS SCT can also scan your application source code for embedded SQL statements and convert them so
that they are compatible with the target database.

Examples of conversions supported by the AWS SCT


AWS re/start

AWS Networking
Services
AWS CLOUD NETWORKING AND AMAZON VIRTUAL PRIVATE CLOUD

AWS Cloud networking

In its most basic form, a cloud-based network is a private IP address space where you can deploy computing
resources. In Amazon Web Services (AWS), a virtual private cloud (VPC) component provides this private
network space. A VPC enables you to define a virtual network in your own logically isolated area within the
AWS Cloud. Inside this virtual network, you can deploy AWS computing resources. These resources include, for
example, Amazon Elastic Compute Cloud (Amazon EC2) or Amazon Relational Database Service (Amazon RDS)
instances. You can also define how—and whether—your private network space connects to endpoints in your
network topology.

In the example, a VPC contains three EC2 instances and an RDS instance. It is connected to the internet and the
corporate data center. It is also connected to a secondary VPC.

AWS networking components


A VPC can span multiple Availability Zones, and its key component types include:
• Subnet – Subnets are logical network segments within your VPC. They enable you to subdivide your VPC
network into smaller networks inside a single Availability Zone. A subnet is public if its route table sends traffic
to an internet gateway; otherwise, it is private. A subnet is required to deploy an instance into a VPC.
• Security group – A security group is a set of firewall rules that secure instances. They allow or block inbound
and outbound traffic into an instance (stateful). If you do not specify a particular group at launch time, an
instance is automatically assigned to the default security group for the VPC. A security group is associated with
an instance.
• Primary network interface (elastic network interface) – An elastic network interface is a virtual network
interface (NIC) that connects an instance to a network. Each instance in a VPC has a default network interface,
the primary network interface, which cannot be detached from the instance.
• Router – A router is a component that routes traffic within the VPC.
• Internet gateway – An internet gateway is a VPC component that enables communication between instances
in a VPC and the internet.
• Virtual private gateway – A virtual private gateway is the component that is defined on the AWS side of a
virtual private network (VPN) connection. A VPN connection provides a secure and encrypted tunnel between
two network endpoints.
• Customer gateway – A customer gateway is a physical device or software application that is defined on the
client side of a VPN connection.
• Route table – A route table is a mechanism used for routing traffic that originates from an associated subnet
in a VPC. It contains a set of rules (also called routes) that determine where traffic is sent. Routes in a route
table consist of a destination and a target. The router reads the route like this: “Any traffic that goes to
destination should be routed through target.” A target can be a specific instance ID, an elastic network
interface ID, an internet gateway, or a virtual private gateway.

Amazon VPC and VPCs

A VPC is a virtual network that is provisioned in a logically isolated section of the AWS Cloud:
• Supports logical separation with subnets
• Offers fine-grained security
• Supports an optional hardware virtual private network (VPN)

Use Amazon VPC to provision a virtual private cloud (VPC)


• Logically isolated section of the AWS Cloud for running AWS resources
• Virtual network that you define and control
• Select your own IP address range, create subnets, and configure route tables and network gateways

Amazon VPC configuration: IP addressing

• Valid private IP address ranges are defined by Request for Comment (RFC) 1918.
• In a VPC, the CIDR block size can be between /16 and /28.
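
For example, a VPC with a /16 range and two /24 subnets might be created as follows. The CIDR blocks,
Availability Zones, and resource IDs are illustrative; the IDs in later commands come from the output of the
earlier ones.

# Create the VPC with a /16 private address range.
aws ec2 create-vpc --cidr-block 10.0.0.0/16

# Create two /24 subnets in the VPC, one per Availability Zone.
aws ec2 create-subnet --vpc-id vpc-0123456789abcdef0 \
  --cidr-block 10.0.1.0/24 --availability-zone us-east-1a
aws ec2 create-subnet --vpc-id vpc-0123456789abcdef0 \
  --cidr-block 10.0.2.0/24 --availability-zone us-east-1b

# Attach an internet gateway and route 0.0.0.0/0 to it so that a subnet becomes public.
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --vpc-id vpc-0123456789abcdef0 \
  --internet-gateway-id igw-0123456789abcdef0
aws ec2 create-route --route-table-id rtb-0123456789abcdef0 \
  --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0123456789abcdef0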
Amazon VPC reserved IP addresses
VPC CONNECTIVITY OPTIONS

VPC peering limitations

• No overlapping IP address ranges.


• No transitive peering, edge routing, or internet gateway access.
• No NAT routing between VPCs.
• No Domain Name System (DNS) lookup resolution of private IP addresses.
• No cross-referencing of peer security groups across Regions.

• A NAT device forwards traffic from an instance that is in a private subnet to the internet or other AWS
services and then sends the response back to the instance.
• VPC peering connects two VPCs so that you route traffic between them using private addresses.
• A Site-to-Site VPN connection establishes a secure connection between your on-premises equipment and
your VPCs.
• A VPC endpoint privately connects your VPC to supported AWS services and to services that are powered by
PrivateLink without leaving the AWS network.
• Transit Gateway establishes a network transit hub that you can use to interconnect your VPCs and on-
premises networks without using the public internet.
SECURING AND TROUBLESHOOTING YOUR NETWORK

Layered network defence


Secure your VPC at multiple levels.

Network ACLs

A network ACL allows or denies traffic in and out of subnets and has the following characteristics:
• It defines traffic rules in an inbound rules table and an outbound rules table.
• It is stateless. Even if rules allow traffic to flow in one direction, you must explicitly allow responses to flow in
the opposite direction.
• The default network ACL allows all inbound and outbound traffic.

Security groups

A security group allows traffic to or from an elastic network interface and has the following characteristics:
• It defines traffic rules in an inbound rules table and an outbound rules table.
• It is configured by default to do the following:
- Deny all inbound traffic.
- Allow all outbound traffic.
- Allow traffic between resources that are assigned to the same security group.
• It is stateful. If rules allow traffic to flow in one direction, responses can automatically flow in the opposite
direction.
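
As a short sketch, a web-tier security group that allows inbound HTTPS from anywhere could be created like
this (the group name, description, VPC ID, and group ID are placeholders):

# Create the security group in the VPC.
aws ec2 create-security-group --group-name web-sg \
  --description "Web tier security group" --vpc-id vpc-0123456789abcdef0

# Allow inbound HTTPS (TCP 443) from any IPv4 address; outbound traffic is allowed by default.
aws ec2 authorize-security-group-ingress --group-id sg-0abc1234 \
  --protocol tcp --port 443 --cidr 0.0.0.0/0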
Bastion host

A bastion host provides secure access from a public subnet to a private subnet and has the following
characteristics:
• It is an EC2 instance.
• It provides a jump point to gain access to instances or resources in a private subnet from the internet.
• It requires a key pair for itself and the private instances that it connects to.

Common troubleshooting tasks

• Verify that the instance is up and running. Check that it has passed both the System Status and Instance
Status checks.
• Verify that the security groups that are associated with the instance allow connections for the required
protocols and ports.
• Verify that the network ACLs that are associated with the subnet allow traffic from the necessary ports and
protocols.
• Verify that the route table that is associated with the subnet has destination rules that point to the
appropriate targets.

Troubleshooting instance connections

Check the following if you cannot connect to an instance through the internet:
• Verify that the public IP address or Domain Name System (DNS) name that you are using is correct.
• Verify that the instance has a public IP address or Elastic IP address.
• Verify that an internet gateway is attached to the instance’s VPC.
• Verify that the route table of the instance’s subnet has a route rule for the destination 0.0.0.0/0 through the
internet gateway.

Troubleshooting SSH connections

Check the following if you cannot connect to an instance through Secure Shell (SSH):
• Verify that the instance's IP address or hostname is correct.
• Verify the instance connection credentials: instance private key, or username and password.
• Run the AWSSupport-TroubleshootSSH automation document to help you find and resolve the problem.

Troubleshooting NAT

Check the following if your NAT configuration does not work:


• Verify that the route table has a route to the NAT instance or NAT gateway.
• If using a NAT instance, perform the following in addition:
- Verify that the source or destination check is disabled.
- Restart the NAT instance.

Troubleshooting VPC peering

Check the following if you cannot reach resources in a peered network:


• Make sure that the peering request was approved.
• Verify that the security group rules allow network traffic between the peered VPCs.
• Check whether the network ACLs incorrectly deny all external traffic.
AWS re/start

Storage and
Archiving
CLOUD STORAGE OVERVIEW

Cloud storage is a service that stores data on the internet through a cloud computing provider that manages
and operates data storage as a service.

Cloud storage formats

Use cases for cloud storage

AWS Cloud storage

The AWS storage services can be grouped into the following general categories:
• Object storage
• File storage
• Block storage
• Hybrid storage

Object, file, and block storage on AWS


Hybrid cloud storage and edge computing on AWS

AWS Cloud storage scenarios


AMAZON EBS

• Amazon EBS provides persistent block storage volumes.


• Each EBS volume is automatically replicated within its Availability Zone.
• With Amazon EBS, you can scale your usage up or down within minutes.

Amazon EBS features

Amazon EBS use cases

EBS volume use cases:


• Boot volumes and primary storage for Amazon Elastic Compute Cloud (Amazon EC2) instances
• Data storage with a file system
• Database hosts

Snapshot use cases:


• Create a backup of critical workloads.
• Recreate EBS volumes.
• Share and copy data.
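
The following hedged sketch creates a General Purpose SSD volume, attaches it to an instance, and snapshots
it for backup. The Availability Zone, volume size, and resource IDs are placeholders.

# Create a 100 GiB gp3 volume in the same Availability Zone as the instance.
aws ec2 create-volume --availability-zone us-east-1a \
  --size 100 --volume-type gp3

# Attach the volume to an EC2 instance as /dev/xvdf.
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
  --instance-id i-0123456789abcdef0 --device /dev/xvdf

# Create a point-in-time snapshot of the volume for backup.
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
  --description "Nightly backup of data volume"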

EBS volume types

Solid state drives (SSDs): • Provisioned IOPS SSD


• General Purpose SSD volumes

Hard disk drives (HDDs): • Throughput Optimized HDD


• Cold HDD

Volume type comparison


Use cases for EBS volume types: SSD

Provisioned IOPS • I/O-intensive workloads


• Relational databases
• NoSQL databases

General Purpose • Recommended option for most workloads


• System boot volumes
• Virtual desktops
• Low-latency interactive apps
• Development and test environments

Use cases for EBS volume types: HDD

Throughput-optimized • Streaming workloads that require consistent, fast throughput at a low price
• Big data
• Data warehouses
• Log processing
• Not a boot volume

Cold • Throughput-oriented storage for large volumes of data that are


infrequently accessed
• Scenarios where the lowest storage cost is important
• Not a boot volume

Amazon Data Lifecycle Manager

Amazon Data Lifecycle Manager does the following:


• Automates the creation, retention, and deletion of snapshots
• Uses tags to identify the EBS volumes to back up
• Uses a lifecycle policy to define the desired backup and retention actions
• Requires an AWS Identity and Access Management (IAM) role to allow the management actions
INSTANCE STORE

Instance stores provide temporary block-level storage for your EC2 instance. This storage is located on disks
that are physically attached to the host computer.

Amazon EC2 instance store

• Provides temporary, non-persistent storage for your EC2 instances


• Is physically attached to the host of the EC2 instance, which allows for fast low-latency storage
• Is made of volumes running on virtual devices providing ephemeral block storage
• Is dedicated to a particular instance

How instance stores work

• You use the block device mapping feature of the Amazon EC2 API or the AWS Management Console to
attach an instance store to an instance.
• Instance store data persists for only the lifetime of its associated instance.
• You cannot create or destroy instance store volumes independently from their instances.

• You can control the following: • Whether instance stores are exposed to the EC2 instance
• What device name is used

Instance store features

• Features are available for many instance types but not all instance types.
• The number, size, and type—such as hard disk drive (HDD) compared with solid state drive (SSD)—differ by
instance type.

•Note the following information about mounting an instance:


• An instance store must be mounted before you can access it.
• Mounting occurs automatically or manually on Linux depending on the
instance type.

Use cases

• Instance store volumes are used for temporary storage of information that is continually changing, such as
the following: • Buffers
• Caches
• Scratch data
• Other temporary content

• Instance store volumes are used for data that is replicated across a fleet of instances, such as a load-
balanced pool of web servers.
AMAZON EFS

Amazon EFS is scalable, fully managed, elastic Network File System (NFS) storage for use with AWS Cloud
services and on-premises resources.

Amazon EFS is a petabyte-scale, low-latency file system that does the following:
• Supports NFS
• Is compatible with multiple AWS services
• Is compatible with all Linux-based instances and servers
• Uses tags

Benefits

Performance attributes

Storage classes
Standard storage classes:
• EFS Standard
• EFS Standard-Infrequent Access (Standard-IA)

One Zone storage classes:
• EFS One Zone
• EFS One Zone-Infrequent Access (One Zone-IA)

Performance modes:
• General Purpose
• Max I/O

Throughput modes:
• Elastic Throughput
• Bursting Throughput
• Provisioned Throughput

Use cases

Amazon EFS is designed to provide performance for a broad spectrum of workloads and applications, including
the following:

• Home directories
• File system for enterprise applications
• Application testing and development
• Database backups
• Web serving and content management
• Media workflows
• Big data analytics
Amazon EFS architecture

How to use Amazon EFS

To set up Amazon EFS in your VPC, follow these steps:


1. Create your EFS file system.
2. Create your Amazon EC2 resources, and launch your EC2 instance.
3. Create your mount targets in the appropriate subnets.
4. Connect to your EC2 instance, and mount the EFS file system.
5. Clean up your resources, and protect your AWS account.
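
Steps 1 and 3 could look roughly like the following boto3 sketch; the subnet and security group IDs are placeholders, and the instance-side mount (step 4) is still performed on the instance itself (for example, with the amazon-efs-utils mount helper).

```python
import boto3

efs = boto3.client('efs')

# Step 1: create the EFS file system.
fs = efs.create_file_system(
    CreationToken='my-efs-demo',             # idempotency token
    PerformanceMode='generalPurpose',
    Encrypted=True,
    Tags=[{'Key': 'Name', 'Value': 'shared-home-dirs'}],
)

# Step 3: create a mount target in a subnet so EC2 instances in that AZ can mount it over NFS.
efs.create_mount_target(
    FileSystemId=fs['FileSystemId'],
    SubnetId='subnet-0123456789abcdef0',      # placeholder subnet ID
    SecurityGroups=['sg-0123456789abcdef0'],  # placeholder security group allowing NFS (TCP 2049)
)
```
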
STORAGE WITH AMAZON S3

Amazon S3 is an object storage service that provides secure, durable, and highly available data storage in the
AWS Cloud.
You can use Amazon S3 to store and retrieve any amount of data (objects) at any time from anywhere on the
web.

Amazon S3 features

• Storage classes
• Storage management
• Access management and security
• Data processing
• Storage logging and monitoring
• Analytics and insights
• Strong consistency

Amazon S3 storage classes

Buckets
A bucket is a container for objects that are stored in Amazon S3. Every object is contained in a bucket.

Objects
Objects are the fundamental entities that are stored in Amazon S3. Objects consist of object data and
metadata.

Object keys
The unique identifier for an object within a bucket

Regions
The geographical area where Amazon S3 will store the buckets that you create
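
The bucket, object, key, and Region concepts map directly onto the API. A minimal boto3 sketch (bucket name and Region are placeholders):

```python
import boto3

s3 = boto3.client('s3')

# Buckets are created in a specific Region (us-east-1 needs no LocationConstraint).
s3.create_bucket(
    Bucket='example-docs-bucket-12345',   # placeholder; bucket names must be globally unique
    CreateBucketConfiguration={'LocationConstraint': 'eu-west-2'},
)

# An object is data plus metadata; the key uniquely identifies it within the bucket.
s3.put_object(Bucket='example-docs-bucket-12345', Key='reports/2024/summary.txt', Body=b'hello')

# Retrieve the object by bucket + key.
body = s3.get_object(Bucket='example-docs-bucket-12345', Key='reports/2024/summary.txt')['Body'].read()
```
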
Additional Amazon S3 features

S3 Intelligent-Tiering

S3 Intelligent-Tiering is designed to optimize cost by automatically storing objects in three access tiers:
• Frequent Access
• Infrequent Access
• Archive Instant Access

• Optional tiers include the following:
  – Archive Access
  – Deep Archive Access

How it works

Amazon S3 Lifecycle policies


Access and security

Amazon S3 public access settings

Amazon S3 Block Public Access provides four public access settings:

• Block new public access control lists (ACLs) and uploading public objects.
• Remove public access granted through public ACLs.
• Block new public bucket policies.
• Block public and cross-account access to buckets that have public policies.
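
The four settings correspond to the four flags of the PutPublicAccessBlock API. A hedged boto3 sketch for a single bucket (the bucket name is a placeholder):

```python
import boto3

s3 = boto3.client('s3')

# Turn on all four Block Public Access settings for one bucket.
s3.put_public_access_block(
    Bucket='example-docs-bucket-12345',   # placeholder bucket name
    PublicAccessBlockConfiguration={
        'BlockPublicAcls': True,        # block new public ACLs and public object uploads
        'IgnorePublicAcls': True,       # ignore public access granted through existing public ACLs
        'BlockPublicPolicy': True,      # block new public bucket policies
        'RestrictPublicBuckets': True,  # block public and cross-account access via public policies
    },
)
```
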

Amazon S3 Object Lock


With S3 Object Lock, you can prevent an object from being deleted or overwritten for a fixed amount of time
or indefinitely.

You can manage object retention in two ways:
• Retention periods
• Legal holds

The following are the two retention modes:
• Compliance
• Governance

Event notification for Amazon S3


AMAZON S3 GLACIER

Amazon S3 Glacier is a storage service purpose-built for data archiving. It provides high performance, flexible
retrieval, and low-cost archive storage in the cloud.

Amazon S3 Glacier storage classes

Amazon S3 Glacier data model concept overview

Vault
A vault is a container for storing archives.
Unique URI form: https://region-specific-endpoint/account-id/vaults/vault-name

Archive
An archive is any data, such as a photo, video, or document.
Unique URI form: https://region-specific-endpoint/account-id/vaults/vault-name/archives/archive-id

Job
An Amazon S3 Glacier job can retrieve an archive or get an inventory of a vault.
Unique URI form: https://region-specific-endpoint/account-id/vaults/vault-name/jobs/job-id

Notification configuration
An Amazon S3 Glacier notification configuration can notify you when a job is completed.
Unique URI form: https://region-specific-endpoint/account-id/vaults/vault-name/notification-configuration
Amazon S3 access to archives
Amazon S3 Glacier provides three archive retrieval options:
• Expedited: 1–5 minutes • Standard: 3–5 hours • Bulk: 5–12 hours
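
Retrievals in the vault/archive/job model are asynchronous: you initiate a job with a retrieval tier, then fetch the output once the job completes. A rough boto3 sketch (the vault name and archive ID are placeholders):

```python
import boto3

glacier = boto3.client('glacier')

# Initiate an archive-retrieval job; '-' means the current account.
job = glacier.initiate_job(
    accountId='-',
    vaultName='my-archive-vault',              # placeholder vault name
    jobParameters={
        'Type': 'archive-retrieval',
        'ArchiveId': 'EXAMPLE-ARCHIVE-ID',     # placeholder archive ID
        'Tier': 'Standard',                    # Expedited | Standard | Bulk
    },
)

# Later (after the job completes, for example after an SNS notification), download the archive.
output = glacier.get_job_output(accountId='-', vaultName='my-archive-vault', jobId=job['jobId'])
```
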

Security features

Resource-based security policies


Both vault access policies and Vault Lock policies manage permissions. However, only vault access policies can
be modified at any time.

Comparing Amazon S3 and Amazon S3 Glacier

Security comparison

Amazon S3 – Encryption is an optional step.
Amazon S3 Glacier – Data is encrypted by default.

How to access Amazon S3 Glacier


AWS STORAGE GATEWAY

Storage Gateway is a hybrid storage service that enables on-premises applications to use AWS Cloud storage.
You can use Storage Gateway for backup and archiving, disaster recovery (DR), cloud data processing, storage
tiering, and migration.
Storage Gateway supports file, volume, and tape interfaces.

How Storage Gateway works

Storage Gateway features

Storage Gateway includes the following key features:


• Provides durable storage of on-premises data in the AWS Cloud
• Uses standard storage protocols
• Provides fully managed caching
• Transfers data in an optimized and secure manner
• Can be implemented on premises as a virtual machine (VM) or a hardware device

Storage Gateway types


S3 File Gateway
Store and access files as objects in Amazon S3.

Volume Gateway
Access block storage as volumes on Amazon S3.

Tape Gateway
Back up and archive data to virtual tape on Amazon S3.

Storage Gateway use cases

• Move backups and archives to the cloud.


• Reduce on-premises storage with cloud-backed file shares.
• Provide on-premises applications with low-latency access to data that is stored in AWS.
• Provide on-premises applications with seamless use of AWS storage.

Use Storage Gateway for hybrid scenarios where some storage is needed on premises but some storage can be
offloaded to cloud storage services (Amazon S3, Amazon S3 Glacier, or Amazon EBS).
AWS TRANSFER FAMILY AND OTHER MIGRATION SERVICES

What is the Transfer Family?

The Transfer Family is a secure transfer service that you can use to transfer files into and out of AWS storage
services.
The Transfer Family supports transferring data from or to the following AWS storage services:
• Amazon Simple Storage Service (Amazon S3) storage buckets
• Amazon Elastic File System (Amazon EFS) Network File System (NFS) file systems

AWS Transfer for SFTP

• Retains existing workflows
• Stores data in an S3 bucket
• Connects directly with your identity provider systems

How the Transfer Family works

Migration services

What is AWS DataSync?


DataSync is an online data transfer service that automates and accelerates moving data between on-premises
storage systems and AWS storage services. It also moves data between AWS storage services.

DataSync features
• Synchronizes between on premises and AWS
• Is efficient and fast
• Is a managed service
• Connects over the internet or AWS Direct Connect
• Includes AWS DataSync Agent (NFS protocol)

DataSync use cases


• Data migration
• Archiving cold data
• Data protection
• Data movement for timely in-cloud processing
How DataSync works

What is the AWS Snow Family?


Snowball is a large-scale data-transport solution that uses secure physical devices to transfer large amounts of
data into and out of the AWS Cloud.

Snowball features

Snowball use cases

• Sensors or machines
• Data collection in remote locations
• Media and entertainment content aggregation
AWS re/start

Jumpstart on AWS
AMAZON CLOUDWATCH

• Amazon CloudWatch monitors the state and utilization of most resources that you can manage under AWS.

• It enables you to:
  » Track resource and application performance
  » Collect and monitor log files from EC2 instances, AWS CloudTrail, Amazon Route 53, and other sources
  » Get notifications when an alarm goes off

• CloudWatch consists of three primary components:
  » Metrics
  » Alarms
  » Events

Monitoring resource performance

CloudWatch helps with performance monitoring. However, by itself, it will not add or remove EC2 instances.
Amazon EC2 Auto Scaling can help with this situation.
With Amazon EC2 Auto Scaling, you can maintain the health and availability of your fleet. You can also
dynamically scale your EC2 instances to meet demands during spikes and lulls.

CloudWatch has two different monitoring options:

• Basic Monitoring for Amazon EC2 instances: Seven pre-selected metrics at a 5-minute frequency and three
status check metrics at a 1-minute frequency, for no additional charge.
• Detailed Monitoring for Amazon EC2 instances: All metrics that are available to Basic Monitoring at a 1-
minute frequency, for an additional charge. Instances with detailed monitoring enabled provide data
aggregation by Amazon EC2, Amazon Machine Image (AMI) ID, and instance type.

The CloudWatch agent collects system-level metrics from:
– EC2 instances
– On-premises servers

Amazon CloudWatch actions


Amazon CloudWatch alarms
You can create a CloudWatch alarm that watches a single CloudWatch metric or the result of a math
expression that is based on multiple CloudWatch metrics. The alarm performs one or more actions based on
the value of the metric or expression relative to a threshold over several time periods.

• Test a selected metric against a specific threshold (greater than or equal to, less than or equal to)
• The ALARM state is not necessarily an emergency condition
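
For example, the following boto3 sketch creates an alarm that tests average CPU utilization against an 80 percent threshold over two 5-minute periods; the instance ID and SNS topic ARN are placeholders.

```python
import boto3

cloudwatch = boto3.client('cloudwatch')

cloudwatch.put_metric_alarm(
    AlarmName='HighCPU-WebServer',
    Namespace='AWS/EC2',
    MetricName='CPUUtilization',
    Dimensions=[{'Name': 'InstanceId', 'Value': 'i-0123456789abcdef0'}],  # placeholder instance ID
    Statistic='Average',
    Period=300,                      # 5-minute periods
    EvaluationPeriods=2,             # ...over two consecutive periods
    Threshold=80.0,
    ComparisonOperator='GreaterThanOrEqualToThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:111122223333:ops-alerts'],       # placeholder SNS topic ARN
)
```
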

CloudWatch monitoring example

Metric components
Metrics are the fundamental concept in CloudWatch. A metric represents a time-ordered set of data points
that are published to CloudWatch. Think of a metric as a variable to monitor, and the data points represent the
values of that variable over time.

Metrics are uniquely defined by a name, a namespace, and zero or more dimensions.

Namespace is a container for CloudWatch metrics. Metrics in different namespaces are isolated from each
other, so that metrics from different applications are not mistakenly aggregated into the same statistics.

Dimension is a name-value pair that uniquely identifies a metric. You can assign up to 10 dimensions to a
metric. Each metric has specific characteristics that describe it, and you can think of dimensions as categories
for those characteristics. Dimensions help you design a structure for your statistics plan. You can use
dimensions to filter the results that CloudWatch returns.
Period is the length of time that is associated with a specific CloudWatch statistic. Periods are defined in
numbers of seconds. You can adjust how the data is aggregated by varying the length of the period. A period
can be as short as 1 second or as long as 1 day (86,400 seconds).

Standard and custom metrics

Standard metrics:
• Grouped by service name
• Display graphically so that selected metrics can be compared
• Only appear if you have used the service in the past 15 months
• Reachable programmatically through the AWS Command Line Interface (AWS CLI) or application
programming interface (API)

Custom metrics:
• Grouped by user-defined namespaces
• Publish to CloudWatch by using the AWS CLI, an API, or a CloudWatch agent
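
Publishing a custom metric with the SDK might look like this sketch; the namespace, metric name, and dimension are invented for the example.

```python
import boto3

cloudwatch = boto3.client('cloudwatch')

# Publish one data point for a user-defined metric in a user-defined namespace.
cloudwatch.put_metric_data(
    Namespace='MyApplication',                  # custom (user-defined) namespace
    MetricData=[{
        'MetricName': 'QueueDepth',
        'Dimensions': [{'Name': 'Environment', 'Value': 'test'}],
        'Value': 42,
        'Unit': 'Count',
    }],
)
```
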

Monitoring and security

Use CloudWatch to monitor for suspicious activity, such as:


– Unusual, prolonged spikes in service usage, such as CPU, disk activity, or Amazon Relational Database
Service (Amazon RDS) usage
– Set alerts on billing metrics (you must enable this feature in account settings)

CloudWatch automatic dashboards


Amazon CloudWatch dashboards are customizable homepages in the CloudWatch console that you can use to
monitor your resources in a single view. You can create customized views of the metrics and alarms for your
AWS resources.
You can get aggregated views of the health and performance of all AWS resources through CloudWatch
automatic dashboards. This feature enables you to monitor and explore account-based and resource-based
views of metrics and alarms. You can drill down to figure out the root cause of performance issues.

Activate detailed instance monitoring


By default, EC2 instances are enabled for basic CloudWatch monitoring, with data available in 5-minute
increments as part of the AWS Free Tier. However, you can also enable detailed monitoring at an additional
cost. After detailed monitoring is enabled, the monitoring data becomes available in 1-minute increments.
AMAZON CLOUDWATCH –LOGS AND EVENTS

Amazon CloudWatch Events


Amazon CloudWatch Events delivers a near-real-time stream of system events that describe changes
in AWS resources. By using simple rules that you can configure, you can match events and route them to one
or more target functions or streams. CloudWatch Events becomes aware of operational changes as they occur.
It responds to these operational changes by sending messages, activating functions, making changes, and
capturing state information.

Events – An event indicates a change in your AWS environment. AWS resources can generate events when
their state changes. For example, Amazon Elastic Compute Cloud (Amazon EC2) generates an event when the
state of an EC2 instance changes from pending to running. You can generate custom application-level events
and publish them to CloudWatch Events. You can also set up scheduled events that are generated on a
periodic basis.

Targets – A target processes events. Example targets include EC2 instances, AWS Lambda functions, Amazon
Simple Notification Service (Amazon SNS) topics, and Amazon Simple Queue Service (Amazon SQS) queues.

Rules – A rule matches incoming events and routes them to targets for processing. A single rule can route to
multiple targets, all of which are processed in parallel. This enables different parts of an organization to look
for and process the events that are of interest to them.
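
A rule, its event pattern, and its target can be created with a few calls. A sketch using boto3 (the Lambda function ARN is a placeholder, and the function's resource policy must separately allow invocation):

```python
import json
import boto3

events = boto3.client('events')

# Rule: match EC2 instance state-change events where the new state is "running".
events.put_rule(
    Name='ec2-running-notifier',
    EventPattern=json.dumps({
        'source': ['aws.ec2'],
        'detail-type': ['EC2 Instance State-change Notification'],
        'detail': {'state': ['running']},
    }),
)

# Target: route matched events to a Lambda function for processing.
events.put_targets(
    Rule='ec2-running-notifier',
    Targets=[{'Id': 'notify-fn',
              'Arn': 'arn:aws:lambda:us-east-1:111122223333:function:notify'}],  # placeholder ARN
)
```
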

Amazon CloudWatch Logs


You can use Amazon CloudWatch Logs to monitor, store, and access your log files from EC2 instances, AWS
CloudTrail, Amazon Route 53, and other sources. You can then retrieve the associated log data from
CloudWatch Logs. You can monitor your logs, in near-real time, for specific phrases, values, or patterns.

Configure – Decide what information you need to capture in your logs, and where and how it will be stored.

Collect – Instances are provisioned and removed in a cloud environment. You need a strategy for periodically
uploading a server’s log files so that this valuable information is not lost when an instance is eventually
terminated.

Analyse – After all the data is collected, it is time to analyse it. Using log data gives you greater visibility into
the daily health of your systems. It can also provide information on upcoming trends in customer behaviour,
and insight into how customers currently use your system.

Amazon CloudWatch Logs functionality

• Automatically collecting logs—for example, from EC2 instances
• Aggregating data into log groups
• Being able to configure metric filters on a log group:
  – Look for specific string patterns
  – Have each match increment a custom CloudWatch metric
  – Use the metric to create CloudWatch alarms or send notifications
• Querying logs and creating visualizations with CloudWatch Logs Insights
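
A metric filter on a log group could be configured like this sketch; the log group name, pattern, and metric names are assumptions.

```python
import boto3

logs = boto3.client('logs')

# Count occurrences of the string "ERROR" in a log group as a custom CloudWatch metric.
logs.put_metric_filter(
    logGroupName='/my-app/production',        # placeholder log group
    filterName='ErrorCount',
    filterPattern='ERROR',                    # match log events containing "ERROR"
    metricTransformations=[{
        'metricName': 'ApplicationErrors',
        'metricNamespace': 'MyApplication',
        'metricValue': '1',                   # each match increments the metric by 1
    }],
)
```
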
AWS CLOUDTRAIL

AWS CloudTrail is a service that:

• Logs, continuously monitors, and retains account activity that is related to actions across your AWS
infrastructure
• Records application programming interface (API) calls for most AWS services
– AWS Management Console and AWS Command Line Interface (AWS CLI) activity are also recorded
• Is supported for a growing number of AWS services
• Automatically pushes logs to Amazon Simple Storage Service (Amazon S3) after it is configured
• Will not track events within an Amazon Elastic Compute Cloud (Amazon EC2) instance
– Example: Manual shutdown of an instance

CloudTrail can help you answer questions that require detailed analysis.

Configure a trail

By default, when you access the CloudTrail event history for the Region that you are viewing, CloudTrail shows
only the results from the last 90 days. These events are limited to management events with create, modify,
and delete API calls; and also account activity. For a complete record of account activity—including all
management events, data events, and read-only activity—you must configure a CloudTrail trail.

The steps to configure a trail are:

1. Configure a new or existing Amazon Simple Storage Service (Amazon S3) bucket for
uploading log files.
2. Define a trail to log desired events (all management events are logged by default).
3. Create an Amazon Simple Notification Service (Amazon SNS) topic to receive notifications.
4. Configure Amazon CloudWatch Logs to receive logs from CloudTrail (optional).
5. Turn on log file encryption and integrity validation for log files (optional).
6. Add tags to your trail (optional).
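
Steps 1 and 2 of this list can also be scripted. A minimal boto3 sketch (the bucket name is a placeholder and the bucket must already have a bucket policy that allows CloudTrail to write to it):

```python
import boto3

cloudtrail = boto3.client('cloudtrail')

# Create a trail that records management events in all Regions and delivers logs to S3.
cloudtrail.create_trail(
    Name='account-activity-trail',
    S3BucketName='my-cloudtrail-logs-bucket',   # placeholder; needs a CloudTrail bucket policy
    IsMultiRegionTrail=True,
)

# Start recording events for the new trail.
cloudtrail.start_logging(Name='account-activity-trail')
```
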
Monitoring and security

Examine CloudWatch Logs and CloudTrail to detect potential unauthorized use.

When you monitor the activity on your account and secure your resources and data, the features of
CloudWatch and CloudTrail are complementary. Using both services is a best practice. For example, you can
examine the logs from CloudWatch Logs and also examine CloudTrail entries to detect potential unauthorized
use.

Other example uses of these services include:


• Monitoring for failed AWS Management Console sign-in attempts, especially sign-in attempts from
suspicious IP addresses
• Detecting unauthorized access to services through API calls
• Identifying a suspicious launching of AWS resources
AWS SERVICE INTEGRATION WITH AMAZON ATHENA

Amazon Athena characteristics:

• Handles large-scale datasets with ease
• Serverless — avoids the need for extract, transform, and load (ETL) jobs
• Pay only for the queries that you run
• Athena automatically runs queries in parallel, so most results come back in seconds

To use Amazon Athena, point to your data in Amazon Simple Storage Service (Amazon S3), define the schema,
and start querying by using standard SQL. Most results are delivered within seconds. With Athena, you do not
need complex ETL jobs to prepare your data for analysis. Athena makes it easy for anyone with SQL skills to
quickly analyze large-scale datasets.
Athena works with various standard data formats, including comma-separated values (CSV), JavaScript Object
Notation (JSON), Optimized Row Columnar (ORC), Apache Avro, and Apache Parquet. Athena is ideal for quick,
ad hoc querying. However, it can also handle complex analysis, including large joins and arrays. Athena uses
Amazon S3 as its underlying data store, which makes your data highly available and durable.
Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.
You can quickly query your data without needing to set up and manage any servers or data warehouses.
Athena enables you to query all your data in Amazon S3 without needing to set up complex processes to
extract, transform, and load (ETL) the data.
Also, Athena offers fast, interactive query performance. Athena automatically runs queries in parallel, so most
results come back within seconds.

Getting started with Amazon Athena

1. Create an Amazon Simple Storage Service (Amazon S3) bucket and load data into it.
   – Alternatively, use an existing bucket with data already in it.

2. Define the schema (table definition) that describes the data structure.

3. Start querying data by using structured query language (SQL).
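
Step 3 via the SDK might look like the following sketch; the database, table, and S3 output location are placeholders, and queries run asynchronously, so results are fetched once the execution succeeds.

```python
import boto3

athena = boto3.client('athena')

# Submit a SQL query against data already catalogued in the 'default' database.
execution = athena.start_query_execution(
    QueryString='SELECT status, COUNT(*) FROM web_logs GROUP BY status',   # placeholder table
    QueryExecutionContext={'Database': 'default'},
    ResultConfiguration={'OutputLocation': 's3://my-athena-results/'},     # placeholder results bucket
)

# Poll for completion (simplified here), then read the result set.
query_id = execution['QueryExecutionId']
state = athena.get_query_execution(QueryExecutionId=query_id)['QueryExecution']['Status']['State']
if state == 'SUCCEEDED':
    rows = athena.get_query_results(QueryExecutionId=query_id)['ResultSet']['Rows']
```
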

AWS service integrations with Amazon Athena

Athena makes it easier to query logs from services, such as:


• AWS CloudTrail logs to enhance your analysis of AWS service activity.
• Application Load Balancer logs to see the source of traffic, latency, and bytes that are transferred to and
from ELB instances and backend applications.
• Amazon Virtual Private Cloud (Amazon VPC) Flow Logs to investigate network traffic patterns, and to
identify threats and risks across your Amazon VPC network.
INTRODUCTION TO AWS ORGANIZATIONS

Diagram of an organization or root in AWS Organizations

AWS Organizations is an account management service that enables you to consolidate multiple AWS accounts
into an organization that you create and centrally manage. AWS Organizations includes consolidated billing and
account management capabilities that help you to better meet the budgetary, security, and compliance needs
of your business.
The diagram shows a basic organization, or root. This example organization consists of seven accounts that are
organized into six organizational units (OUs). An OU is a container for accounts within a root. An OU can also
contain other OUs, which enables you to create a hierarchy that looks like an upside-down tree. The tree has a
root at the top and branches of OUs that reach down, ending in accounts that are the leaves of the tree.
When you attach a policy to one of the nodes in the hierarchy, it flows down and affects all the branches and
leaves. This organization has several policies that are attached to some of the OUs or directly to accounts.
An OU can have only one parent and, currently, each account can be a member of exactly one OU. An account
is a standard AWS account that contains your AWS resources. You can attach a policy to an account to apply
controls to only that one account.

Key features and benefits


Security with AWS Organizations

• Control access with AWS Identity and Access Management (IAM).
• IAM policies enable you to allow or deny access to AWS services for users, groups, and roles.
• Service control policies (SCPs) enable you to allow or deny access to AWS services for individual accounts or
groups of accounts in an organizational unit (OU).

Organizations setup

Accessing AWS Organizations

• The key components in an organization are:


– Management account (root)
– Organizational unit (OU)
– Member account
– Service control policy (SCP)
TAGGING

A tag:
• Is a key-value pair that can be attached to an AWS resource.
• Enables you to identify and categorize resources.
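
Applying tags with the SDK is a single call; the resource ID and tag values below are placeholders.

```python
import boto3

ec2 = boto3.client('ec2')

# Attach identifying and categorizing key-value pairs to an EC2 instance.
ec2.create_tags(
    Resources=['i-0123456789abcdef0'],                 # placeholder instance ID
    Tags=[
        {'Key': 'Environment', 'Value': 'Production'},
        {'Key': 'CostCenter', 'Value': '1234'},
        {'Key': 'Owner', 'Value': 'web-team'},
    ],
)
```
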

Tag characteristics

AWS Config and tagging

AWS Config provides a mechanism to enforce tagging on a resource:


• Use the required-tags managed rule.
• Specify the required tag key (and optionally the required value).
• AWS Config evaluates the rule and identifies non-compliant resources.

AWS Config provides AWS managed rules, which are predefined, customizable rules that AWS Config uses to
evaluate whether your AWS resources comply with common best practices. You can customize the behavior of
a managed rule to suit your needs. For example, you could use the required-tags managed rule to quickly
assess whether a specific tag is applied to your resources. This rule enables you to specify the key of the
required tag and, optionally, its value. After you activate the rule, AWS Config compares your resources to the
defined conditions and reports any non-compliant resources. The evaluation of a managed rule can occur
when a resource changes, or on a periodic basis.
Use an IAM policy to require the use of specific tags on a resource, and AWS Config to periodically verify that
all resources are tagged.
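
Activating the required-tags managed rule through the SDK might look like the following sketch; the rule name, required tag key, and resource scope are assumptions made for the example.

```python
import json
import boto3

config = boto3.client('config')

# Managed rule: flag EC2 instances that are missing the required 'Environment' tag.
config.put_config_rule(
    ConfigRule={
        'ConfigRuleName': 'required-tags-ec2',
        'Source': {'Owner': 'AWS', 'SourceIdentifier': 'REQUIRED_TAGS'},   # AWS managed rule
        'InputParameters': json.dumps({'tag1Key': 'Environment'}),         # required tag key (value optional)
        'Scope': {'ComplianceResourceTypes': ['AWS::EC2::Instance']},      # limit evaluation to EC2 instances
    },
)
```
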

Tagging best practices


COST MANAGEMENT AND BEST PRACTICES

Cost benefits of the AWS Cloud:

• Pay only for what you need, when you need it.
• Create scripts or templates to shut down environments.
• Can turn off unused resources
– Specific services after business hours and during holidays
– Development or test environments
– Disaster recovery (DR) environments
– Instances that are tagged as temporary

AWS cost management tools and services

AWS Cost Explorer


AWS Cost Explorer enables you to view your costs and usage, and to analyse them to identify trends. You can
filter and group data along various dimensions, such as service, instance type, and tag. Cost Explorer provides
you with two types of default reports:

• Cost and Usage reports – These reports enable you to understand your costs and usage for all services. For
example, the Monthly costs by service report (displayed in the screen capture) shows your costs for the last 3
months, grouped by service. The top five services are shown by themselves, and the rest are grouped into one
bar (labelled Others).

• Reserved Instance (RI) reports – These reports are specific to your Reserved Instances usage. They provide an
understanding of your comparative utilization costs for Reserved Instances versus On-Demand Instances.

You can view data for up to the last 13 months, forecast how much you are likely to spend for the next 3
months, and get recommendations for which Reserved Instances to purchase.

If you have many accounts and have enabled consolidated billing for AWS Organizations, you can use AWS
Cost Explorer to view costs across all your linked accounts. You can also monitor the individual daily and
monthly spend for each linked account.
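
The same cost-by-service view is available programmatically through the Cost Explorer API. A sketch (the date range is arbitrary):

```python
import boto3

ce = boto3.client('ce')

# Monthly unblended cost for the first quarter, grouped by service.
report = ce.get_cost_and_usage(
    TimePeriod={'Start': '2024-01-01', 'End': '2024-04-01'},   # arbitrary example range
    Granularity='MONTHLY',
    Metrics=['UnblendedCost'],
    GroupBy=[{'Type': 'DIMENSION', 'Key': 'SERVICE'}],
)

for month in report['ResultsByTime']:
    for group in month['Groups']:
        print(month['TimePeriod']['Start'], group['Keys'][0],
              group['Metrics']['UnblendedCost']['Amount'])
```
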
AWS Budgets
AWS Budgets enables you to set custom budgets that alert you when costs or usage exceed (or are forecasted
to exceed) your budgeted amount. AWS Budgets uses the cost visualization that is provided by Cost Explorer to
show you the status of your budgets and to provide forecasts of your estimated costs. You can also use
Budgets to create notifications if you go over your budgeted amounts, or when your estimated costs exceed
your budgets. Budgets can be tracked at the monthly, quarterly, or yearly level. You can customize the start
and end dates. Budget alerts can be sent through email or through an Amazon Simple Notification Service
(Amazon SNS) topic.

AWS Cost and Usage Reports


The AWS Cost and Usage Reports page is a single location for accessing comprehensive information about your
AWS costs and usage. You can use it to generate reports that contain line items for each unique combination
of AWS products, usage type, and operation that you use in your AWS account. You can customize the
generated reports to aggregate the information either by the hour or by the day. You can also publish your
AWS billing reports to an Amazon Simple Storage Service (Amazon S3) bucket, and AWS will update the reports
in your bucket once a day.

Amazon CloudWatch billing alarms

• Generate an alert when estimated charges exceed a specified threshold


• Enabled in the AWS Management Console
• Must be created in the us-east-1 Region
– Central storage for all billing metrics
• Based on metrics that include total and service-specific charges
• Send email notifications through an Amazon Simple Notification Service (Amazon SNS) topic

Designing for cost reduction

Finding and eliminating waste

• Using Amazon CloudWatch metrics to find long-running idle instances.
  – Sometimes, unneeded resources are still running.
• Using AWS Cost Explorer to find the costs that are associated with entire projects or initiatives.
Using a stopinator script

• Turn on and turn off selected AWS resources
• Is a best practice to reduce cost

Writing and using a stopinator script is a technique for automating the shutdown of instances. A stopinator is a
generic term for any script or application that is written against the AWS Cloud, and that looks for and stops
unused instances.

Serverless stopinator
You do not need to create or use an Amazon Elastic Compute Cloud (Amazon EC2) instance to run a stopinator.
A simple and efficient design is to use a combination of a Lambda function and an Amazon CloudWatch Events
event in a serverless solution. The logic to stop and start an instance is implemented as a Lambda function.
This function is then triggered by a CloudWatch Events event according to the desired schedule.
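
A minimal version of such a Lambda function might look like the sketch below, which stops running instances tagged Environment=temporary; the tag key and value are assumptions, and the function's execution role would need ec2:DescribeInstances and ec2:StopInstances permissions.

```python
import boto3

ec2 = boto3.client('ec2')

def lambda_handler(event, context):
    """Stopinator: stop running EC2 instances that are tagged as temporary."""
    response = ec2.describe_instances(
        Filters=[
            {'Name': 'tag:Environment', 'Values': ['temporary']},     # assumed tag convention
            {'Name': 'instance-state-name', 'Values': ['running']},
        ]
    )
    instance_ids = [
        instance['InstanceId']
        for reservation in response['Reservations']
        for instance in reservation['Instances']
    ]
    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)
    return {'stopped': instance_ids}
```

A CloudWatch Events scheduled rule, for example cron(0 20 ? * MON-FRI *), could then invoke the function at the end of each workday.
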

What is AWS Trusted Advisor?

AWS Trusted Advisor is an online resource to help you reduce cost, increase performance, and improve
security by optimizing your AWS environment.
AWS Trusted Advisor analyses your AWS environment and provides recommendations for best practices in five
categories:
• Cost optimization – Advice about how you can save money by eliminating unused and idle resources, or
making commitments to reserved capacity.
• Performance – Advice about how to improve the performance of your services by checking your service
limits, ensuring that you use provisioned throughput, and monitoring for overutilized instances.
• Security – Advice about how to improve the security of your applications by closing gaps, enabling various
AWS security features, and examining your permissions.
• Fault tolerance – Advice about how to increase the availability and redundancy of your AWS applications by
using automatic scaling, health checks, Multi-AZ deployment, and backup capabilities.
• Service limits – Advice about the services whose usage exceeds 80 percent of their service limit.
AWS Trusted Advisor cost optimization features
You can use AWS Trusted Advisor to identify idle resources, such as EC2 instances, underused load balancers
and volumes, and unused Elastic IP addresses. Trusted Advisor is also a good tool for cost optimization. It
provides checks and recommendations that enable you to achieve cost savings.

• Sample cost optimization checks –


– Idle resources, such as Amazon Elastic Compute Cloud (Amazon EC2) instances, Amazon Relational Database
Service (Amazon RDS) instances
– Underused load balancers and volumes
– Unused Elastic IP addresses
• Use cost-optimization checks to achieve a base level of cost savings
• Core checks and recommendations are available to all customers
• Additional checks and recommendations are available with Business Support or Enterprise Support plans

AWS Trusted Advisor recommendations


AWS Trusted Advisor analyses your AWS environment and provides recommendations for best practices.
Recommendations include links to take direct action. AWS Trusted Advisor’s real-time guidance helps you
provision your resources according to AWS best practices.
AWS SUPPORT SERVICES

AWS Support provides a mix of tools and technology, people, and programs. The AWS Support resources are
designed to proactively help you optimize performance, lower costs, and innovate faster.

Expertise and support


• Provide a unique combination of tools and expertise:
  – AWS Support
  – AWS Support Plans

Types of support
• Support is provided for:
  – Experimenting with AWS
  – Production use of AWS
  – Business-critical use of AWS

AWS Support plans

AWS Support offers four plans:


Basic Support – Resource centre access, Service Health dashboard, product FAQs, discussion forums, and
support for health checks
Developer Support – Support for early development on AWS
Business Support – Support for customers that run production workloads
Enterprise Support – Support for customers that run business and mission-critical workloads

Benefits of AWS Support services

Build faster – Use AWS experts to quickly build knowledge and expertise.
Mitigate risks – AWS Support can help you maintain the strictest security standards and proactively alert you
to issues that require attention.
Manage resources – Proactively monitor your environment and automate remediation.
Get expert help – Cloud support engineers work at the same standards for technical aptitude as the AWS
software development organization.

AWS Support guidance and assistance

Proactive guidance – Technical Account Manager (TAM)


Best practices – AWS Trusted Advisor
Account assistance – AWS Support Concierge
AWS Support technology and programs

Technology
• AWS Personal Health Dashboard provides alerts and remediation guidance if AWS experiences events that
might impact customers.
• AWS Trusted Advisor is an online resource that checks for opportunities to reduce monthly expenditures
and increase productivity.
• AWS Health API provides programmatic access to the AWS Health information that is in the Personal Health
Dashboard.

Programs
• AWS Infrastructure Event Management (IEM) provides guidance for architecture and scaling. It also offers
operational support during planned events, such as shopping holidays.
• Architectural reviews with AWS solutions architects are included with Enterprise Support.
– AWS Well-Architected helps cloud architects build secure, resilient, and efficient infrastructure for their
applications and workloads.
• Proactive services that are delivered by AWS Support experts are included with Enterprise Support.

The role of AWS Support

AWS Technical Support tiers cover development and production issues for AWS products and services.
How-to – Find resources to assist customers and answer their questions about AWS services and features
Best practices – Help customers successfully integrate, deploy, and manage applications in the cloud
Troubleshooting – Help customers with issues about application programming interfaces (APIs) and AWS
software development kits (SDKs)
Troubleshooting – Help customers with operational or systemic issues with AWS resources
Issues – Identify issues with the AWS Management Console or other AWS tools
Problems detected – Help customers with issues that were detected by Amazon Elastic Compute Cloud
(Amazon EC2) health checks

AWS Support Plans: Pricing and services


Plan service levels

AWS Support works with five different severity levels:


Critical – The customer’s business is at risk. Critical functions of their application are unavailable.
Urgent – The customer’s business is significantly impacted. Important functions of their application are
unavailable.
High – Important functions of the customer’s application are impaired or degraded.
Normal – Non-critical functions of the customer’s application are behaving abnormally, or the customer has a
time-sensitive development question.
Low – The customer has a general development question, or they want to request a feature.

AWS Trusted Advisor


AWS Trusted Advisor is an online resource that you can access from the Management Tools section of the AWS
Management Console. It helps users follow best practices that increase the performance and fault tolerance of
their AWS solutions. Trusted Advisor provides real-time guidance to help you reduce costs, increase
performance, and improve security by optimizing your AWS environment.

Trusted Advisor does not focus on only one service, and it is not only a security tool. For example, Trusted
Advisor can tell you how the infrastructure is performing and when security groups have been left open. It can
tell you whether you are using fault tolerance and if you are at risk with all the resources that you deployed in
an Availability Zone. It can also tell you if you have deployed resources that you are not using, but are still
being charged for.

Trusted Advisor offers two options:


• Core checks and Recommendations are available for all accounts.
• Full Trusted Advisor is available for Business Support and Enterprise Support offerings.

AWS whitepapers and documentation


AWS whitepapers are a collection of technical documents that outline many topics that are relevant to AWS,
like architecting best practices, security best practices, cloud computing economics, and serverless
architecture.
These technical documents cover a range of ideas, thoughts, and concepts that apply to cloud computing and
AWS services.

Collection of technical documents that outline AWS topics, including:


– Architecture best practices
– Security best practices
– Cloud computing economics
– Serverless architecture
CONFIGURATION MANAGEMENT IN THE CLOUD

Benefits of configuration management

• Increase efficiency
• Validate every change before release
• Reduce cost by removing unwanted resources
• Enforce security at every layer
• Deploy configuration changes to running instances
• Make configuration automated and repeatable

Configuration management tools

Technologies for configuring EC2 instances

User data – Enables you to author scripts that are run on instance launch.
Amazon Machine Images (AMIs) – By creating base images that are customized to the needs of your
organization, you can pre-deploy installations and configurations into the EC2 instances that are launched
from the AMI.
Configuration and deployment frameworks – Technologies such as Chef, Puppet, and Ansible enable you to
configure new instances by using templates.
AWS OpsWorks – A configuration management service that provides managed instances of Chef and Puppet.
AWS CloudFormation – An AWS service that enables you to configure architectures for repeatable
deployments.
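
The user data option mentioned in this list is supplied at launch. A boto3 sketch (the AMI ID is a placeholder, and boto3 is expected to handle the base64 encoding that the EC2 API requires):

```python
import boto3

ec2 = boto3.client('ec2')

# Shell script run once by cloud-init when the instance first boots.
user_data = """#!/bin/bash
yum update -y
yum install -y httpd
systemctl enable --now httpd
"""

ec2.run_instances(
    ImageId='ami-0123456789abcdef0',   # placeholder AMI ID (for example, Amazon Linux)
    InstanceType='t3.micro',
    MinCount=1,
    MaxCount=1,
    UserData=user_data,                # script executed at first boot
)
```
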

Benefits of using a configuration server

• Can greatly simplify common administrative tasks.
• Offers a configuration that is idempotent.
  – Resources are allocated only once.
  – Manual changes are detected and rolled back.
• Supply user data to the instance to kick off client configuration.
  – Install the client and any required configuration and security credentials.
  – Specify the templates to be run.
AMI BUILDING STRATEGY

Custom AMIs as a base configuration

1. Launch an EC2 instance from a standard AMI.
2. Preconfigure all the software that your organization requires on an Amazon EC2 instance.
3. Create a custom AMI from that instance.

The new custom AMI then becomes the AMI that is used to create all new instances in the organization.
To enforce the policy that all new instances are launched only from the new base AMI, do the following:
• Create processes that scan the running Amazon EC2 instances in your account.
• Terminate any instances that are not using the standard AMIs.

Another option is to configure instances at boot time. An example of configuring an instance at boot time is
the use of the user data option to run a script when you launch an EC2 instance.

Creating AMIs

Creation of AMIs results in the following:


• Resulting AMI is anchored to the current AWS Region.
• Instance is rebooted by default to ensure consistency.
• Amazon Elastic Block Store (Amazon EBS)-backed AMIs are created with all attached volumes.

To create an AMI, you can use any one of the following tools:
• AWS Management Console
• AWS Command Line Interface (AWS CLI)
• AWS application programming interface (API)

AMI creation details

• Costs are incurred for Amazon EBS snapshots of volumes stored in Amazon S3.
• Create Linux AMIs directly from an Amazon EC2 instance root volume snapshot using one of two tools:
– AWS Management Console
– AWS CLI command: aws ec2 register-image
AMAZON EC2 LAUNCH TEMPLATES

Create templates for EC2 instance launch requests

• Contain configuration information to launch an EC2 instance.
• Store launch parameters:
  – Amazon Machine Image (AMI) ID
  – Instance type
  – Subnet
  – Key pair
• Specify the launch template to use when you launch instances.

Versions of a launch template

• Each version can have different launch parameters.
• Use any version of the launch template.
• Set any version of the launch template as the default version.
• By default, the default version is the first version of the template.
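
Creating and using a launch template with the SDK might look like the sketch below; all IDs and names are placeholders.

```python
import boto3

ec2 = boto3.client('ec2')

# Version 1 of the template captures the launch parameters.
ec2.create_launch_template(
    LaunchTemplateName='web-server-template',
    VersionDescription='initial version',
    LaunchTemplateData={
        'ImageId': 'ami-0123456789abcdef0',            # placeholder AMI ID
        'InstanceType': 't3.micro',
        'KeyName': 'my-key-pair',                      # placeholder key pair
        'SecurityGroupIds': ['sg-0123456789abcdef0'],  # placeholder security group
    },
)

# Launch an instance from the default version of the template.
ec2.run_instances(
    LaunchTemplate={'LaunchTemplateName': 'web-server-template', 'Version': '$Default'},
    MinCount=1,
    MaxCount=1,
)
```
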
INFRASTRUCTURE AS CODE

• Use infrastructure as code (IaC) to automate resource provisioning in the cloud consistently and reliably.
• AWS CloudFormation is one of the AWS tools for managing resources with IaC.
• After deployment, you can manage the infrastructure using other tools, such as AWS Systems Manager or
AWS OpsWorks. In both cases, use code to ensure consistency and reliability.

INTRODUCTION TO JSON AND YAML

The language of infrastructure

• Infrastructure as code (IaC) is a method to declare the resources that are needed in the cloud by using text
files.
• JavaScript Object Notation (JSON) and YAML Ain’t Markup Language (YAML) syntaxes are used by AWS to
declare resources in the cloud.
• IaC enables the definition of simple to complex infrastructures in text.
• Understanding the syntaxes of JSON and YAML is required to build infrastructure as code.

JSON
– Syntax for storing and transporting data.
– Text-based format, so it is human-readable.
– Documents are easily written.
– Stores key-value pairs and arrays of data.
Key – Unique identifier for an item of data.
Value – Data that is identified or a pointer to the location of that data.

Advantages:
• Lightweight (minimal syntax and markup) — good for application programming interfaces (APIs).
• Easy for humans to read and write.
• Easy for machines to parse and generate.
Disadvantages:
• No native support for binary data (such as image files).

YAML
– Syntax for storing data.
– Text-based format, so it is human-readable.
– Documents are easily written.
– Store key-value pairs, lists, and associative arrays of data.
– Store complex data structures in a single YAML document.
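
As a small illustration, the same key-value data can be written in either syntax and parsed into identical structures; the Python sketch below assumes the third-party PyYAML package is installed.

```python
import json
import yaml  # PyYAML package (assumed installed)

# The same data expressed as JSON...
json_doc = '{"InstanceType": "t3.micro", "Tags": [{"Key": "Name", "Value": "web-1"}]}'

# ...and as YAML.
yaml_doc = """
InstanceType: t3.micro
Tags:
  - Key: Name
    Value: web-1
"""

# Both documents parse to the same dictionaries and lists.
assert json.loads(json_doc) == yaml.safe_load(yaml_doc)
```
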
AWS CLOUDFORMATION

Cloud deployment challenges

AWS CloudFormation

• Models and provisions cloud infrastructure resources
• Supports most AWS services
• Creates, updates, and deletes a set of resources as a single unit called a stack
• Detects changes, called “drift”, on a stack and its individual resources

AWS CloudFormation Terminology

A template is a specification of the AWS resources to be provisioned.

A stack is a collection of AWS resources that were created from a template. You might provision (create) a
stack many times.
When a stack is provisioned, the AWS resources that are specified by the stack template are created. Any
charges incurred from using these services will start accruing when they are created as part of the AWS
CloudFormation stack.
When a stack is deleted, the resources that are associated with that stack are deleted. The order of deletion is
determined by AWS CloudFormation. You do not have direct control over what gets deleted when.

Launch and delete stacks

• AWS CloudFormation templates can be launched as stacks through:
  – AWS Management Console
  – AWS CLI
  – AWS APIs
• If an error is encountered when you launch a template, all resources are rolled back by default.
• When stacks are deleted, their resources are deleted.
  – You can optionally enable termination protection on a stack.

• Parameters enable you to input custom values to your template each time you create or update a stack.
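
Putting these pieces together, a template with one parameter can be provisioned as a stack from the SDK. A sketch in which the template, stack name, and parameter value are all examples:

```python
import boto3

cloudformation = boto3.client('cloudformation')

# Minimal template: one parameter and one S3 bucket resource.
template_body = """
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  BucketName:
    Type: String
Resources:
  LogsBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Ref BucketName
"""

# Provision the stack; the bucket is created (and billed) as part of the stack.
cloudformation.create_stack(
    StackName='demo-logs-stack',
    TemplateBody=template_body,
    Parameters=[{'ParameterKey': 'BucketName', 'ParameterValue': 'demo-logs-bucket-12345'}],  # placeholder
)
```
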
AWS re/start

Additional Services
CLOUD ADOPTION FRAMEWORK (CAF)

The AWS CAF leverages AWS experience and best practices to help you digitally transform and accelerate your
business outcomes through the innovative use of AWS.

Benefits

AWS CAF Perspectives

AWS CAF groups its capabilities in perspectives:
• Business
• People
• Governance
• Platform
• Security
• Operations

How it works

Each organization’s cloud journey is unique.


Successful journeys share the following phases:
ADDITIONAL AWS TOPICS

Analytics services

AWS Data Exchange is the world’s most comprehensive service for third-party datasets. AWS Data Exchange is
the only data marketplace with more than 3,500 products from over 300 providers delivered—through files,
APIs, or Amazon Redshift queries—directly to the data lakes, applications, analytics, and machine learning (ML)
models that use it. With AWS Data Exchange, the user can streamline all third-party data consumption, from
existing subscriptions—which the user can migrate at no additional cost—to future data subscriptions in one
place. As an AWS service, AWS Data Exchange is secure and compliant, integrated with AWS and third-party
tools and services, and offers consolidated billing and subscription management.

Amazon EMR is a web service that efficiently processes vast amounts of data by using Apache Hadoop and
AWS services.

AWS Glue is a scalable, serverless data integration service to discover, prepare, and combine data for
analytics, ML, and application development.

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service to build and run
applications that use Apache Kafka to process streaming data without needing Apache Kafka infrastructure
management expertise. Apache Kafka is an open source platform for building real-time streaming data
pipelines and applications. However, Apache Kafka is difficult for users to architect, operate, and manage on
their own.

Amazon OpenSearch Service is a managed service to deploy, operate, and scale OpenSearch Service clusters in
the AWS Cloud. OpenSearch Service supports OpenSearch and legacy Elasticsearch OSS (up to 7.10, the final
open source version of the software).

Application integration

Amazon EventBridge is used to route events from sources such as homegrown applications, AWS services, and
third-party software to consumer applications across the organization. EventBridge provides a consistent way
to ingest, filter, transform, and deliver events so users can build new applications quickly. EventBridge event
buses are well suited for many-to-many routing of events between event-driven services. EventBridge Pipes is
intended for point-to-point integrations between these sources and targets, with support for advanced
transformations and enrichment.

AWS Step Functions is a serverless orchestration service for integrating with AWS Lambda functions and other
AWS services to build business-critical applications. Through the Step Functions graphical console, the user
sees their application’s workflow as a series of event-driven steps. Step Functions is based on state machines
and tasks. In Step Functions, a workflow is called a state machine, which is a series of event-driven steps. Each
step in a workflow is called a state. A Task state represents a unit of work that another AWS service, such as
Lambda, performs. A Task state can call any AWS service or API.
Business productivity

Amazon Connect is an omnichannel cloud contact center. The user can set up a contact center in a few steps,
add agents who are located anywhere, and start engaging with customers.

Amazon Simple Email Service (Amazon SES) is an email platform that provides a cost-effective way for users
to send and receive email messages by using their own email addresses and domains.

Compute

AWS Local Zones are a type of infrastructure deployment that places compute, storage, database, and other
select AWS services close to large population and industry centers.

AWS Outposts is a family of fully managed solutions delivering AWS infrastructure and services to virtually any
on-premises or edge location for a truly consistent hybrid experience. With Outposts solutions, the user can
extend and run AWS services on premises, and Outposts is available in a variety of form factors. With
Outposts, the user can run some AWS services locally and connect to a broad range of services available in the
local AWS Region. Users can also use Outposts to run applications and workloads on premises by using familiar
AWS services, tools, and APIs. Outposts supports workloads and devices that require low latency access to on-
premises systems, local data processing, data residency, and application migration with local system
interdependencies.

With AWS Wavelength, developers can build applications that deliver ultra-low latencies to mobile devices
and end users. AWS Wavelength deploys standard AWS compute and storage services to the edge of
communications service providers' 5G networks. The user can extend a virtual private cloud (VPC) to one or
more Wavelength Zones. The user can then use AWS resources such as Amazon Elastic Compute Cloud
(Amazon EC2) instances to run the applications that require ultra-low latency and a connection to AWS
services in the Region.

Containers

Amazon Elastic Container Registry (Amazon ECR) is an AWS managed container image registry service that is
secure, scalable, and reliable. It supports private repositories with resource-based permissions by using AWS
Identity and Access Management (IAM) so that specified users or EC2 instances can access their container
repositories and images. The user can use their preferred command line interface (CLI) to push, pull, and
manage Docker images, Open Container Initiative (OCI) images, and OCI-compatible artifacts. Amazon ECR also
supports public container image repositories. The AWS container services team maintains a public road map
on GitHub. It contains information about what the teams are working on and gives all AWS customers the
ability to provide direct feedback.

Customer engagement

AWS Activate for startups provides eligible startups with free tools, resources, and content designed to help
startups reach their goals.

Professionals use AWS IQ to find and engage experts on AWS. All experts on AWS IQ who respond to custom
requests are AWS Certified and must maintain a high success rate.
Databases

Amazon MemoryDB for Redis is a Redis-compatible, durable, in-memory database service that delivers ultra-
fast performance. It is purpose-built for modern applications with microservices architectures.

Amazon Neptune is a fast, reliable, fully managed graph database service used to build and run applications
that work with highly connected datasets.

Developer tools

AWS AppConfig is a capability of AWS Systems Manager to create, manage, and quickly deploy application
configurations. A configuration is a collection of settings that influence the behavior of an application. AWS
AppConfig can be used with applications hosted on EC2 instances, Lambda, containers, mobile applications, or
Internet of Things (IoT) devices. AWS AppConfig helps deploy application configuration in a managed and a
monitored way just like code deployments but without the need to deploy the code if a configuration value
changes. With AWS AppConfig, users can update configurations by entering changes through the API or the
AWS Management Console.

AWS CloudShell is a browser-based shell to securely manage, explore, and interact with AWS resources.
CloudShell is pre-authenticated with the user’s console credentials. Common development and operations
tools are pre-installed, so there’s no need to install or configure software on the local machine. With
CloudShell, users can quickly run scripts with the AWS Command Line Interface (AWS CLI), experiment with
AWS service APIs by using the AWS SDKs, or use a range of other tools to be more productive.

AWS CodeArtifact is a fully managed artifact repository service that organizations of any size can use to
securely store, publish, and share software packages used in their software development process. CodeArtifact
works with commonly used package managers and build tools such as Maven and Gradle (Java), npm and yarn
(JavaScript), pip and twine (Python), and NuGet (.NET).

AWS X-Ray is a service that collects data about requests that the user’s application serves, and provides tools
to view, filter, and gain insights into that data to identify issues and opportunities for optimization. For any
traced request to an application, users can see detailed information, not only about the request and response,
but also about calls that the application makes to downstream AWS resources, microservices, databases, and
web APIs.

End-user computing

Amazon AppStream 2.0 is an AWS End User Computing (EUC) service that can be configured for software as a
service (SaaS) application streaming or delivery of virtual desktops with selective persistence.

Amazon WorkSpaces is a fully managed desktop virtualization service for Windows, Linux, and Ubuntu that
gives the user the ability to access resources from any supported device.

Amazon WorkSpaces Web is a low cost, fully managed, Linux-based service that is designed to facilitate secure
browser access to internal websites and SaaS applications from existing web browsers without the
administrative burden of appliances, managed infrastructure, specialized client software, or virtual private
network (VPN) connections.
Frontend web and mobile

AWS Amplify is a complete solution for frontend web and mobile developers to build, ship, and host full-stack
applications on AWS with the flexibility to leverage the breadth of AWS services as use cases evolve. No cloud
expertise is needed.

AWS AppSync creates serverless GraphQL and Pub/Sub APIs that simplify application development through a
single endpoint to securely query, update, or publish data.

AWS Device Farm is an application testing service for users to improve the quality of their web applications
and mobile apps by testing them across an extensive range of desktop browsers and real mobile devices. With
Device Farm, users don’t have to provision and manage any testing infrastructure.

Internet of Things

AWS IoT Core connects billions of Internet of Things (IoT) devices and routes trillions of messages to AWS
services without managing infrastructure.

AWS IoT Greengrass is an open source edge runtime and cloud service for building, deploying, and managing
device software.

Machine learning

Amazon Comprehend is a natural-language processing (NLP) service that uses ML to uncover valuable insights
and connections in text.

Amazon Kendra is an intelligent enterprise search service that helps the user search across different content
repositories with built-in connectors.

Amazon Lex is a fully managed artificial intelligence (AI) service with advanced natural language models to
design, build, test, and deploy conversational interfaces in applications.

Amazon Polly uses deep learning technologies to synthesize natural-sounding human speech so that the user
can convert articles to speech. With dozens of lifelike voices across a broad set of languages, Amazon Polly
helps users build speech-activated applications.

Amazon Rekognition offers pre-trained and customizable computer vision (CV) capabilities to extract
information and insights from images and videos.

Amazon SageMaker is a fully managed ML service. With SageMaker, data scientists and developers can quickly
build and train ML models and then deploy them into a production-ready hosted environment.

Amazon Textract is an ML service that automatically extracts text, handwriting, and data from scanned
documents. It goes beyond optical character recognition (OCR) to identify, understand, and extract data from
forms and tables.

Amazon Transcribe provides transcription services for audio files and audio streams. It uses advanced ML
technologies to recognize spoken words and transcribe them into text.

Amazon Translate is a neural machine translation service that delivers fast, high-quality, affordable, and
customizable language translation.
Management and governance

AWS Compute Optimizer recommends optimal AWS compute resources for workloads. It can help reduce
costs and improve performance by using ML to analyze historical utilization metrics. Compute Optimizer helps
the user to choose the optimal resource configuration based on utilization data.

With AWS Control Tower, users can enforce and manage governance rules for security, operations, and
compliance at scale across all their organizations and accounts in the AWS Cloud.

The AWS Health Dashboard is the single place to learn about the availability and operations of AWS services.
The user can view the overall status of AWS services, and they can sign in to view personalized
communications about their particular AWS account or organization. The account view provides deeper
visibility into resource issues, upcoming changes, and important notifications.

AWS Launch Wizard offers a guided way of sizing, configuring, and deploying AWS resources for third-party
applications, such as Microsoft SQL Server Always On and HANA-based SAP systems, without the need to
manually identify and provision individual AWS resources.

AWS Resource Groups manages and automates tasks on large numbers of resources at one time. A user can
use resource groups to organize their AWS resources, and tags are key and value pairs that act as metadata for
organizing those resources.

With AWS Service Catalog, IT administrators can create, manage, and distribute portfolios of approved
products to end users, who can then access the products they need in a personalized portal. Typical products
include servers, databases, websites, or applications that are deployed by using AWS resources (for example,
an EC2 instance or an Amazon Relational Database Service [Amazon RDS] database).

Migration and transfer

The AWS Application Discovery Service helps systems integrators quickly and reliably plan application
migration projects by automatically identifying applications running in on-premises data centres, their
associated dependencies, and their performance profile.

AWS Application Migration Service is a highly automated lift-and-shift (rehost) solution that simplifies,
expedites, and reduces the cost of migrating applications to AWS. Companies can use this service to lift and
shift a large number of physical, virtual, or cloud servers without compatibility issues, performance disruption,
or long cutover windows.

AWS Migration Hub provides a single location to track migration tasks across multiple AWS tools and partner
solutions. With Migration Hub, users can choose the AWS and partner migration tools that best fit their needs
while providing visibility into the status of their migration projects.

AWS Transfer Family is a secure transfer service to transfer files into and out of AWS storage services. Transfer
Family is part of the AWS Cloud platform.
Security, identity, and compliance

AWS Audit Manager helps users continually audit their AWS usage to simplify how they manage risk and
compliance with regulations and industry standards. Audit Manager automates evidence collection so users
can assess whether their policies, procedures, and activities—also known as controls—are operating
effectively.

AWS Directory Service provides multiple ways to set up and run Microsoft Active Directory with other AWS
services, such as Amazon EC2, Amazon RDS for SQL Server, Amazon FSx for Windows File Server, and AWS IAM
Identity Center (successor to AWS Single Sign-On).

AWS Firewall Manager simplifies a user’s AWS WAF administration and maintenance tasks across multiple
accounts and resources. With Firewall Manager, users set up their firewall rules only once. The service
automatically applies these rules across accounts and resources, even as new resources are added.

With AWS IAM Identity Center (successor to AWS Single Sign-On), a user can manage sign-in security for their
workforce identities, also known as workforce users. IAM Identity Center provides one place where users can
create or connect workforce users and centrally manage their access across all their AWS accounts and
applications. Users can use multi-account permissions to assign their workforce users access to AWS accounts.

AWS Key Management Service (AWS KMS) is an encryption and key management service scaled for the cloud.
AWS KMS keys and functionality are used by other AWS services, and a user can use them to protect data in
their own applications that use AWS.
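
For example, data can be encrypted and decrypted with a KMS key through the SDK. The key alias below is a placeholder; in practice, KMS is usually used to protect data keys rather than large payloads.

# Minimal sketch: encrypt and decrypt a small payload with an AWS KMS key.
# The key alias is a placeholder value.
import boto3

kms = boto3.client("kms")

encrypted = kms.encrypt(
    KeyId="alias/example-app-key",     # placeholder key alias
    Plaintext=b"database password",
)

decrypted = kms.decrypt(CiphertextBlob=encrypted["CiphertextBlob"])
print(decrypted["Plaintext"])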

AWS Network Firewall is a stateful, managed, network firewall and intrusion detection and prevention service
for a user’s VPC that is created in Amazon Virtual Private Cloud (Amazon VPC). With Network Firewall, a user
can filter traffic at the perimeter of a VPC. This includes filtering traffic going to and coming from an internet
gateway, NAT gateway, or over VPN or AWS Direct Connect.

AWS Resource Access Manager (AWS RAM) helps users securely share their resources across AWS accounts,
within their organization or organizational units (OUs) in AWS Organizations, and with IAM roles and IAM users
for supported resource types. A user can use AWS RAM to share resources with other AWS accounts.
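
For example, a subnet could be shared with another account through a resource share. In the boto3 sketch below, the resource ARN and account ID are placeholders.

# Minimal sketch: share a subnet with another AWS account by using AWS RAM.
# The resource ARN and account ID are placeholder values.
import boto3

ram = boto3.client("ram")

share = ram.create_resource_share(
    name="shared-network-subnets",
    resourceArns=["arn:aws:ec2:eu-west-1:111111111111:subnet/subnet-0123456789abcdef0"],
    principals=["222222222222"],   # account (or OU/organization ARN) to share with
)
print(share["resourceShare"]["resourceShareArn"])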

AWS Secrets Manager helps a user to securely encrypt, store, and retrieve credentials for databases and other
services. Instead of hardcoding credentials in applications, a user can make calls to Secrets Manager to retrieve
credentials whenever needed. Secrets Manager helps protect access to IT resources and data by giving users
the ability to rotate and manage access to their secrets.
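
For example, instead of hardcoding a database password, an application can fetch it at run time. The secret name in this sketch is a placeholder.

# Minimal sketch: retrieve a database credential from AWS Secrets Manager at run time.
# The secret name is a placeholder.
import boto3

secrets = boto3.client("secretsmanager")

secret = secrets.get_secret_value(SecretId="prod/app/db-password")
db_password = secret["SecretString"]   # use this instead of a hardcoded value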

AWS Security Hub provides users with a comprehensive view of their security state in AWS and helps them
check their environment against security industry standards and best practices. Security Hub collects security
data from across AWS accounts, services, and supported third-party partner products and helps users analyze
their security trends and identify the highest priority security issues.
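
For example, the highest priority findings can be pulled through the API. The filter below is an illustrative sketch and assumes Security Hub is already enabled in the account and Region.

# Minimal sketch: list active CRITICAL findings from AWS Security Hub.
# Assumes Security Hub is enabled in the account and Region.
import boto3

securityhub = boto3.client("securityhub")

findings = securityhub.get_findings(
    Filters={
        "SeverityLabel": [{"Value": "CRITICAL", "Comparison": "EQUALS"}],
        "RecordState": [{"Value": "ACTIVE", "Comparison": "EQUALS"}],
    },
    MaxResults=10,
)
for finding in findings["Findings"]:
    print(finding["Title"])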

Storage

AWS Elastic Disaster Recovery minimizes downtime and data loss with fast, reliable recovery of on-premises
and cloud-based applications by using affordable storage, minimal compute, and point-in-time recovery.

Amazon FSx makes it cost-effective to launch, run, and scale feature-rich, high-performance file systems in the
cloud. It supports a wide range of workloads with its reliability, security, scalability, and broad set of
capabilities.
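
For example, an FSx for Lustre scratch file system could be created with a single API call. In the sketch below, the subnet ID is a placeholder and the capacity and deployment type are example values.

# Minimal sketch: create an Amazon FSx for Lustre scratch file system.
# The subnet ID is a placeholder; capacity and deployment type are example values.
import boto3

fsx = boto3.client("fsx")

fs = fsx.create_file_system(
    FileSystemType="LUSTRE",
    StorageCapacity=1200,                     # GiB (example value)
    SubnetIds=["subnet-0123456789abcdef0"],   # placeholder subnet
    LustreConfiguration={"DeploymentType": "SCRATCH_2"},
)
print(fs["FileSystem"]["FileSystemId"])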
