
Topics
 Elastic Load Balancing
 Amazon EC2 Auto Scaling
 Amazon Route 53

Labs
 Lab 3: Using Auto Scaling in AWS

 Buying hardware to meet demand creates waste
◦ Hardware is idle during off-peak periods and constrained during peak periods
 Scale on demand
◦ Scale out for spikes
◦ Scale in during off-peak
◦ Replace unhealthy instances
◦ Pay only for what you use

In a traditional data center environment, the scalability of your system is bound by your
hardware. Take the example of a tax preparation business in the United States. US
taxpayers must file their taxes by April 15. Online tax preparation companies know that
they will experience a steady flow of traffic starting near the middle of January, with
traffic peaking close to the April 15 deadline. In a data center, anticipating this four-
month period of heavy utilization requires spinning up enough physical servers to
handle the anticipated load. But what happens to those servers the rest of the year?
They sit idle in the data center.

In the cloud, because computing power is a programmatic resource, we can take a
more flexible approach to the issue of scaling. We can program our system to create
new Amazon EC2 instances in advance of known peak periods in a business cycle (such
as tax filing deadlines). However, we can go a step further using monitoring, and can
programmatically scale out our servers when we notice that critical resources, such as
average CPU utilization across our fleet, are becoming constrained. Furthermore, we
can automatically scale in the number of resources we use after peak times, when
demand for our business services returns to its baseline. In other words, we only pay
for the resources we need, when we need them.

What's required to implement such a system? Let's see how several AWS services can
be used together to create a scalable, on-demand architecture.

[Diagram: Amazon Route 53 routes traffic to Elastic Load Balancing, which distributes it across the EC2 instances in an Auto Scaling group.]

Elastic Load Balancing (ELB) automatically distributes incoming application traffic
across multiple targets, such as Amazon EC2 instances, containers, IP addresses, and
Lambda functions. It can handle the varying load of your application traffic in a single
Availability Zone or across multiple Availability Zones. ELB offers three types of load
balancers that all feature the high availability, automatic scaling, and robust security
necessary to make your applications fault-tolerant.

Amazon EC2 Auto Scaling helps you maintain application availability and allows you to
dynamically scale your capacity up or down automatically according to conditions you
define.

Amazon Route 53 is a highly available and scalable cloud Domain Name System (DNS)
web service.
Elastic Load Balancing: automatically distribute traffic across multiple targets
 High availability
 Health checks
 Security features
 TLS termination
 Layer 4 or Layer 7 load balancing
 Operational monitoring

High Availability
Elastic Load Balancing automatically distributes traffic across multiple targets –
Amazon EC2 instances, containers and IP addresses – in a single Availability Zone or
multiple Availability Zones.

Health Checks
Elastic Load Balancing can detect unhealthy targets, stop sending traffic to them, and
then spread the load across the remaining healthy targets.

Security Features
Use Amazon Virtual Private Cloud (Amazon VPC) to create and manage security
groups associated with load balancers to provide additional networking and security
options. You can also create an internal (non-internet-facing) load balancer.

TLS Termination
Elastic Load Balancing provides integrated certificate management and SSL
decryption, allowing you the flexibility to centrally manage the SSL settings of the
load balancer and offload CPU-intensive work from your application.

Layer 4 or Layer 7 Load Balancing
You can load balance HTTP/HTTPS applications for Layer 7-specific features, or use
strict Layer 4 load balancing for applications that rely purely on the TCP protocol.

Operational Monitoring
Elastic Load Balancing provides integration with Amazon CloudWatch metrics and
request tracing in order to monitor performance of your applications in real time.
Elastic Load Balancing offers three load balancer types:

Application Load Balancer (ALB): HTTP, HTTPS
• Flexible application management
• Advanced load balancing of HTTP and HTTPS traffic
• Operates at the request level (Layer 7)

Network Load Balancer (NLB): TCP
• Extreme performance and static IP for your application
• Load balancing of TCP traffic
• Operates at the connection level (Layer 4)

Classic Load Balancer (CLB): previous generation, for HTTP, HTTPS, and TCP
• For applications that use the EC2-Classic network
• Operates at both the request level and connection level

Elastic Load Balancing supports three types of load balancers: Application Load Balancers,
Network Load Balancers, and Classic Load Balancers. You can select a load balancer based on
your application needs.

An Application Load Balancer (ALB) functions at the application layer, the seventh layer of the
Open Systems Interconnection (OSI) model. Application Load Balancers support content-based
routing and applications that run in containers. They support a pair of industry-standard
protocols (WebSocket and HTTP/2) and also provide additional visibility into the health of the
target instances and containers. Websites and mobile apps, running in containers or on EC2
instances, benefit from Application Load Balancers. The ALB is ideal for advanced load
balancing of HTTP and HTTPS traffic; it provides advanced request routing that supports
modern application architectures, including microservices and container-based applications.

The Network Load Balancer (NLB) is designed to handle tens of millions of requests per
second while maintaining high throughput at ultra-low latency, with no effort on your part.
The NLB operates at the connection level (Layer 4), routing connections to targets (Amazon
EC2 instances, containers, and IP addresses) based on IP protocol data. It is API-compatible
with the Application Load Balancer, including full programmatic control of target groups and
targets. The NLB is ideal for load balancing TCP traffic; it is optimized to handle sudden and
volatile traffic patterns while using a single static IP address per Availability Zone.

The Classic Load Balancer (CLB) provides basic load balancing across multiple Amazon EC2
instances and operates at both the request level and the connection level. The CLB is
intended for applications that were built within the EC2-Classic network.

For a side-by-side feature comparison, see https://aws.amazon.com/elasticloadbalancing/details/#compare

Creating your load balancer

[Diagram: load balancer nodes in an ELB-managed VPC, attached to subnets in the customer VPC.]

In the region in which you are launching, ELB launches load balancer nodes in a VPC
separate from your VPC. ELB creates two load balancer nodes at launch:

• If you provide only one subnet, ELB launches both nodes in the Availability Zone of that subnet.
• If you provide two subnets in different Availability Zones, ELB launches one node in each Availability Zone.

Associating load balancer nodes

[Diagram: the endpoint name.region.elb.amazonaws.com resolves to the ENI IPs 54.234.123.234 and 54.234.123.235.]

ELB takes an ENI from each of your subnets and connects it to the load balancer node
in the same Availability Zone.

ELB associates each node's IP with the ELB hostname.

Load Distribution

Load is distributed to each load balancer node via DNS round robin. Load balancer
nodes distribute HTTP traffic to instances using the "least outstanding requests"
algorithm.

Cross-Zone Load Balancing

Subnets in different Availability Zones can become out of balance due to an instance
failure or IP caching. Cross-zone load balancing helps keep the fleet in balance by
routing requests across Availability Zones.

[Diagram: a load balancer with two listeners. Each listener has rules that forward requests to target groups for example.com, example.com/blog, and example.com/mobile, each with its own health check.]

A load balancer serves as the single point of contact for clients. The load balancer
distributes incoming application traffic across multiple targets, such as EC2 instances,
in multiple Availability Zones. This increases the availability of your application. You
add one or more listeners to your load balancer.

A listener checks for connection requests from clients, using the protocol and port
that you configure, and forwards requests to one or more target groups, based on the
rules that you define. Each rule specifies a target group, condition, and priority. When
the condition is met, the traffic is forwarded to the target group. You must define a
default rule for each listener, and you can add rules that specify different target
groups based on the content of the request (also known as content-based routing).

Each target group routes requests to one or more registered targets, such as EC2
instances, using the protocol and port number that you specify. You can register a
target with multiple target groups. You can configure health checks on a per target
group basis. Health checks are performed on all targets registered to a target group
that is specified in a listener rule for your load balancer.
The above diagram illustrates the basic components. Notice that each listener
contains a default rule, and one listener contains another rule that routes requests to
a different target group. One target is registered with two target groups.

For more information, see https://docs.aws.amazon.com/elasticloadbalancing/latest/application/introduction.html.

aws elbv2 create-load-balancer --name my-load-balancer --subnets subnet-12345678 subnet-23456789 --security-groups sg-12345678

aws elbv2 create-target-group --name my-targets --protocol HTTP --port 80 --vpc-id vpc-12345678

aws elbv2 register-targets --target-group-arn targetgroup-arn --targets Id=i-12345678 Id=i-23456789

aws elbv2 create-listener --load-balancer-arn loadbalancer-arn --protocol HTTP --port 80 --default-actions Type=forward,TargetGroupArn=targetgroup-arn

aws elbv2 describe-target-health --target-group-arn targetgroup-arn

To create your first load balancer, complete the following steps.

• Use the create-load-balancer command to create a load balancer. You must specify
two subnets that are not from the same Availability Zone.
• Use the create-target-group command to create a target group, specifying the same
VPC that you used for your EC2 instances.
• Use the register-targets command to register your instances with your target group.
• Use the create-listener command to create a listener for your load balancer with a
default rule that forwards requests to your target group.
• (Optional) Verify the health of the registered targets for your target group using the
describe-target-health command.

 Automatically launch or terminate Amazon EC2 instances based on:
◦ Health status checks
◦ User-defined policies driven by Amazon CloudWatch
◦ Schedules
◦ Other criteria (i.e., programmatically)
◦ Manually, using set-desired-capacity
 Scale out to meet demand, scale in to reduce costs

AWS Auto Scaling monitors your applications and automatically adjusts capacity to
maintain steady, predictable performance at the lowest possible cost. Using AWS
Auto Scaling, it's easy to set up application scaling for multiple resources across
multiple services in minutes. The service provides a simple, powerful user interface
that lets you build scaling plans for resources including Amazon EC2 instances and
Spot Fleets, Amazon ECS tasks, Amazon DynamoDB tables and indexes, and Amazon
Aurora Replicas. AWS Auto Scaling makes scaling simple with recommendations that
allow you to optimize performance, costs, or balance between them. If you're already
using Amazon EC2 Auto Scaling to dynamically scale your Amazon EC2 instances, you
can combine it with AWS Auto Scaling to scale additional resources for other AWS
services. With AWS Auto Scaling, your applications always have the right resources at
the right time.

It’s easy to get started with AWS Auto Scaling using the AWS Management Console,
Command Line Interface (CLI), or SDK. AWS Auto Scaling is available at no additional
charge. You pay only for the AWS resources needed to run your applications and
Amazon CloudWatch monitoring fees.

For more information, see https://aws.amazon.com/autoscaling

Amazon CloudWatch will be discussed in more detail in a later module.

Launch Configuration/Launch Template
Instance configuration to be launched:
• AMI
• Instance type
• Security group
• Instance key pair
• Storage
• IAM roles
• User data
Only one active launch configuration at a time.

EC2 Auto Scaling Group
Logical group of EC2 instances. Automatically scale between:
• Min
• Desired (optional)
• Max
Integration with Elastic Load Balancing (optional).
Health checks to maintain group size.
Distribute and balance instances across Availability Zones.

EC2 Auto Scaling Policy
Parameters for performing an Amazon EC2 auto scaling action.
How to trigger policies?
• Amazon CloudWatch-driven
• Instance failure (health check)
• Scheduled
• Manually
Scale out/in and by how much:
• ChangeInCapacity (+/- #)
• ExactCapacity (#)
• ChangeInPercent (+/- %)
• Cooldown period (simple scaling)
• Warmup period (step scaling)

Creating a launch configuration works much like creating an individual instance: you
must specify the same characteristics—IAM roles, security groups, storage, instance
type, user data, key pairs, etc. However, you do not specify the VPC or subnet in which
your instances will launch. That will be specified by the Auto Scaling group that uses
your launch configuration.

Launch configurations do allow you to specify one networking option: whether or not
to automatically assign a public IP address to each new instance that is created from
the launch configuration. Note that it is not necessary to select this option if your
instances will be launched in a private subnet behind a public Elastic Load Balancing
load balancer.

There are two basic ways to trigger changes to an EC2 Auto Scaling group. First, you can
define a scaling policy that either scales out or scales in based on a CloudWatch Alarm.
You can define a CloudWatch Alarm—e.g., "Average CPU Utilization > 50%"—that calls
a scaling policy. The policy will specify to either add or remove a fixed number of
instances or to adjust the number of running instances as a percentage of the desired
capacity of the EC2 Auto Scaling group.

Second, you can define a scheduled action. Scheduled actions set a new desired
capacity value at a specific time. You can specify a scheduled action to trigger on a
specific date and time or specify a recurring action that is executed at specific times
throughout a week, month, or year. Scheduled actions are an excellent way to pre-
warm capacity in response to anticipated traffic spikes.

Surge Queue Length is the number of requests queued by the load balancer while
waiting for a back-end instance to accept connections and process the request.
To manually change an EC2 Auto Scaling group's size, use the set-desired-capacity
operation in the following manner:

aws autoscaling set-desired-capacity --auto-scaling-group-name my-asg --desired-capacity 2 --honor-cooldown
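
To make the CloudWatch-driven case concrete, here is a sketch of wiring an alarm to a simple scaling policy (the group name, thresholds, and the policy-arn placeholder are illustrative, not from this course's lab):

# Create a simple scaling policy that adds 2 instances, with a 5-minute cooldown
aws autoscaling put-scaling-policy --auto-scaling-group-name my-asg \
  --policy-name scale-out-on-cpu --adjustment-type ChangeInCapacity \
  --scaling-adjustment 2 --cooldown 300

# put-scaling-policy returns a PolicyARN; pass it as the alarm action
aws cloudwatch put-metric-alarm --alarm-name my-asg-high-cpu \
  --namespace AWS/EC2 --metric-name CPUUtilization --statistic Average \
  --dimensions Name=AutoScalingGroupName,Value=my-asg \
  --period 300 --evaluation-periods 2 \
  --threshold 50 --comparison-operator GreaterThanThreshold \
  --alarm-actions policy-arn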

 Maintains health state for instances
 Terminates instances marked Unhealthy
 By default, uses Amazon EC2 instance status checks
 If behind a load balancer, either can be configured:
◦ The load balancer's instance checks
◦ Amazon EC2 instance checks
 External scripts can trigger instance recycling via the
aws autoscaling set-instance-health command

EC2 Auto Scaling maintains health state for instances and terminates instances
marked Unhealthy.
By default, it uses Amazon EC2 instance status checks.
If an Auto Scaling group is behind a load balancer, either the load balancer's instance
checks or the Amazon EC2 instance checks are used.
External scripts can trigger the recycling of an instance with the aws autoscaling set-
instance-health command.
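
For example, an external monitoring script might mark an instance for replacement like this (a sketch; the instance ID is a placeholder):

# Mark the instance Unhealthy; EC2 Auto Scaling terminates and replaces it
aws autoscaling set-instance-health --instance-id i-1234567890abcdef0 --health-status Unhealthy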

For more information about instance status checks, see http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-system-instance-status-check.html.

 Determines which instance is terminated during a scale-in
 Factors affecting termination order:
◦ Availability Zone with the largest number of instances
◦ Multiple policies are executed in the order listed

Use the EC2 Auto Scaling API TerminateInstanceInAutoScalingGroup to atomically
terminate a specified instance and decrement the desired capacity. For example, to
always terminate the instance with the fewest user sessions, determine the ID of the
instance with the fewest user sessions and then call
TerminateInstanceInAutoScalingGroup, passing in the instance ID as a parameter.

Note:
• If you call TerminateInstance and then SetDesiredCapacity, you risk having the Auto
Scaling group relaunch the "failed" instance.
• If you call SetDesiredCapacity and then TerminateInstance, you risk having Auto
Scaling terminate an instance other than the one with the fewest user sessions,
followed by the TerminateInstance call terminating the intended instance, in which
case the Auto Scaling group will relaunch another instance.

For more information, see http://docs.aws.amazon.com/autoscaling/latest/userguide/as-instance-termination.html.

Termination Policy: Description

OldestInstance: Selects the longest-running instance

NewestInstance: Selects the shortest-running instance

OldestLaunchConfiguration: Terminates the instance with the oldest launch configuration (default)

ClosestToNextInstanceHour: Terminates the instance closest to the next billable hour (default)

 Set up an EC2 Auto Scaling group with the same min, max, and desired values
(Min = 1, Max = 1, Desired = 1)
 The instance is recreated automatically if it becomes unhealthy or if its
Availability Zone fails
 There is still potential downtime while the instance recycles
 Use case: maintain a steady-state NAT instance in each Availability Zone

EC2 Auto Scaling health checks allow us to create a "steady state" group that ensures
that a single instance is always running. This is useful for situations such as a Network
Address Translation (NAT) server, which is a single point of failure in a standard
public/private subnet architecture.

To create a steady state group for an instance, first create a launch configuration that
creates the instance. Then, create an EC2 Auto Scaling group with a minimum,
maximum, and desired size of 1. Whenever such an instance is marked as unhealthy
(e.g., an instance check fails, or an external script marks the instance as unhealthy
with a call to EC2 Auto Scaling set-instance-health), the EC2 Auto Scaling group will
terminate the existing instance and create a new one from the group's launch
configuration.
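
As a sketch, such a steady state group might be created like this (the launch configuration name, group name, and subnet ID are placeholders):

# Min, max, and desired capacity all set to 1 keeps exactly one instance running
aws autoscaling create-auto-scaling-group --auto-scaling-group-name nat-asg \
  --launch-configuration-name nat-lc \
  --min-size 1 --max-size 1 --desired-capacity 1 \
  --vpc-zone-identifier subnet-12345678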

Note that in cases such as deploying a NAT instance, the NAT is still a single point of
failure, and you can still experience significant downtime while a failed NAT instance
is recycling. We cover more advanced strategies for high-availability NAT architecture
in our Advanced Architecting course.

Scaling based on a schedule allows you to scale your application in response to
predictable load changes.

At a specific date/time, perform a scaling action.

Scaling based on a schedule allows you to scale your application in response to
predictable load changes. For example, every week the traffic to your web application
starts to increase on Wednesday, remains high on Thursday, and starts to decrease on
Friday. You can plan your scaling activities based on the predictable traffic patterns of
your web application.

To configure your Auto Scaling group to scale based on a schedule, you create a
scheduled action, which tells Amazon EC2 Auto Scaling to perform a scaling action at
specified times. To create a scheduled scaling action, you specify the start time when
the scaling action should take effect, and the new minimum, maximum, and desired
sizes for the scaling action. At the specified time, Amazon EC2 Auto Scaling updates
the group with the values for minimum, maximum, and desired size specified by the
scaling action.
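
As a sketch, a recurring scheduled action for the Wednesday-to-Friday pattern above might look like this (the group name, sizes, and cron expression are placeholders; recurrence times are in UTC):

# Every Wednesday at 08:00 UTC, raise the group to a 5/20/10 min/max/desired
aws autoscaling put-scheduled-update-group-action --auto-scaling-group-name my-asg \
  --scheduled-action-name wednesday-scale-out \
  --recurrence "0 8 * * 3" \
  --min-size 5 --max-size 20 --desired-capacity 10

A second scheduled action on Friday would set the sizes back to their baseline.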

 Target tracking scaling
 Step scaling
 Simple scaling

Scaling Policy Types

Amazon EC2 Auto Scaling supports the following types of scaling policies:

• Target tracking scaling: Increase or decrease the current capacity of the group
based on a target value for a specific metric. This is similar to the way your
thermostat maintains the temperature of your home: you select a temperature and
the thermostat does the rest. In the case of Amazon EC2, you set the metric for
Auto Scaling to monitor, and Auto Scaling does the rest.
• Step scaling: Increase or decrease the current capacity of the group based on a
set of scaling adjustments, known as step adjustments, that vary based on the size
of the alarm breach. Amazon EC2 Auto Scaling does not support cooldown periods
for step scaling policies; you can't specify a cooldown period for these policies,
and the default cooldown period for the group doesn't apply.
• Simple scaling: Increase or decrease the current capacity of the group based on a
single scaling adjustment.

If you are scaling based on a utilization metric that increases or decreases
proportionally to the number of instances in an Auto Scaling group, AWS
recommends that you use target tracking scaling policies. Otherwise, AWS
recommends that you use step scaling policies.
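
As a sketch, a target tracking policy that holds average CPU utilization near 50% might be created like this (the group name and target value are placeholders):

# Auto Scaling creates and manages the scale-out and scale-in alarms itself
aws autoscaling put-scaling-policy --auto-scaling-group-name my-asg \
  --policy-name cpu-target-50 --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{"PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"}, "TargetValue": 50.0}'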

Predictive scaling:
 Forecast load
 Schedule minimum capacity

Use with:
 Dynamic scaling (target tracking)
 Applications that have periodic spikes

AWS also provides Predictive Scaling. You can use Predictive Scaling to scale your
Amazon EC2 capacity in advance of traffic changes. Auto Scaling enhanced with
Predictive Scaling delivers faster, simpler, and more accurate capacity provisioning,
resulting in lower cost and more responsive applications.

Predictive Scaling predicts future traffic based on daily and weekly trends, including
regularly-occurring spikes, and provisions the right number of EC2 instances in
advance of anticipated changes. Provisioning the capacity just in time for an
impending load change makes Auto Scaling faster than ever before. Predictive
Scaling’s machine learning algorithms detect changes in daily and weekly patterns,
automatically adjusting their forecasts. This removes the need for manual adjustment
of Auto Scaling parameters over time, making Auto Scaling simpler to configure and
consume.

Predictive Scaling can be configured through the AWS Auto Scaling console, AWS
Auto Scaling APIs via SDK/CLI, and CloudFormation. To get started, navigate to the
AWS Auto Scaling page and create a scaling plan for Amazon EC2 resources that
includes Predictive Scaling. Once enabled, you can visualize your forecasted traffic
and the generated scaling actions within a few seconds.
You can use predictive scaling, dynamic scaling, or both. Predictive scaling works by
forecasting load and scheduling minimum capacity; dynamic scaling uses target
tracking to adjust a designated CloudWatch metric to a specific target. The two
models work well together because of the scheduled minimum capacity already set
by predictive scaling.

Predictive scaling is a great match for web sites and applications that undergo
periodic traffic spikes. It is not designed to help in situations where spikes in load are
not cyclic or predictable.

Step Scaling Example

An EC2 Auto Scaling group with Min = 5 and Max = 20 instances uses four alarms on
average CPU utilization:

• Scale-in: remove 1 instance when 40% > average CPU > 20%
• Scale-in: remove 2 instances when 20% > average CPU > 0%
• Scale-out: add 1 instance when 60% < average CPU < 80%
• Scale-out: add 2 instances when 80% < average CPU < 100%

[Chart: average CPU utilization over time against the scale-out and scale-in alarm thresholds, with the group's instance count between Min 5 and Max 20.]
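The scale-out side of this example might be expressed as a step scaling policy like this (a sketch; names are placeholders, and the step bounds are relative to a 60% alarm threshold):

# Bounds are offsets from the alarm threshold: [60%,80%) adds 1, [80%,+inf) adds 2
aws autoscaling put-scaling-policy --auto-scaling-group-name my-asg \
  --policy-name cpu-step-scale-out --policy-type StepScaling \
  --adjustment-type ChangeInCapacity --metric-aggregation-type Average \
  --estimated-instance-warmup 300 \
  --step-adjustments MetricIntervalLowerBound=0,MetricIntervalUpperBound=20,ScalingAdjustment=1 MetricIntervalLowerBound=20,ScalingAdjustment=2

A CloudWatch alarm on average CPU utilization above 60% would invoke this policy; a matching scale-in policy would use negative adjustments. The slides that follow walk through this scenario.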

As usage increases, CPU utilization goes up.

When CPU utilization is more than 60% and less than 80%:
• The scale-out alarm is triggered
• The "add 1 instance" policy is applied
• A new instance is launched but is not added to the aggregated group metrics
until after the instance warm-up period expires

Instance warm-up period

While waiting for new instances:
• The scale-out alarm is triggered again, because CPU utilization remains high,
and another alarm period elapses
• Since current capacity is still 10 during the instance warm-up period, and desired
capacity is already 11, no additional instances are launched

As usage increases further, CPU utilization goes up.

When CPU utilization is more than 80% and less than 100%:
• The scale-out alarm is triggered
• Because the alarm occurred during an instance warm-up period, the "add 2
instances" step launches two instances, less the one instance added during the
first alarm
• Again, the new instances are not added to the aggregated group metrics until
they warm up

As capacity matches usage, CPU utilization stabilizes.

When CPU utilization is more than 40% and less than 60%:
• No alarms are triggered
• After the instance warm-up period expires, the new instances are added to the
aggregated group metrics

As usage decreases, CPU utilization goes down.

When CPU utilization is more than 20% and less than 40%:
• The scale-in alarm is triggered
• The "remove 1 instance" step policy is applied
• An instance is removed from the EC2 Auto Scaling group and from the
aggregated group metrics

As usage decreases further, CPU utilization goes down further.

When CPU utilization is more than 0% and less than 20%:
• The scale-in alarm is triggered
• The "remove 2 instances" step policy is applied
• Two instances are removed from the EC2 Auto Scaling group and from the
aggregated group metrics

As capacity matches usage, CPU utilization stabilizes.

When CPU utilization is more than 40% and less than 60%:
• No alarm is triggered
• No step policies are applied
• No instances are added or removed

Alarm Sustain Period
Example: CPU utilization > 90% for 10 minutes.

Cooldown Period
Upon executing a scale in/out, suspend further scaling activities for a cooldown
period of time. (Used for simple scaling policies.)
Example: suspend scaling for 5 minutes.

Instance Warm-Up Period
The number of seconds that it takes for a newly launched instance to warm up.
(Used for step scaling policies.)
Example: warm-up takes 5 minutes.

Thrashing occurs when your scaling settings cause capacity to be removed and then
quickly re-added.

Alarms invoke actions for sustained state changes only. CloudWatch alarms will not
invoke actions simply because they are in a particular state; the state must have
changed and been maintained for a specified number of periods.
For more information about general Amazon CloudWatch concepts, see http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/cloudwatch_concepts.html.

The Auto Scaling cooldown period is a configurable setting for your EC2 Auto Scaling
group that helps to ensure that EC2 Auto Scaling doesn't launch or terminate
additional instances before the previous scaling activity takes effect. After the EC2
Auto Scaling group dynamically scales using a simple scaling policy, EC2 Auto Scaling
waits for the cooldown period to complete before resuming scaling activities. When
you manually scale your EC2 Auto Scaling group, the default is not to wait for the
cooldown period, but you can override the default and honor the cooldown period.
Note that if an instance becomes unhealthy, EC2 Auto Scaling does not wait for the
cooldown period to complete before replacing the unhealthy instance. EC2 Auto
Scaling supports both default cooldown periods and scaling-specific cooldown
periods. Amazon EC2 Auto Scaling supports cooldown periods when using simple
scaling policies, but not when using target tracking policies, step scaling policies, or
scheduled scaling.

For more information about EC2 Auto Scaling cooldowns, see http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/Cooldown.html

Instance Warmup
With step scaling policies, you can specify the number of seconds that it takes for a
newly launched instance to warm up. Until its specified warm-up time has expired, an
instance is not counted toward the aggregated metrics of the Auto Scaling group.
While scaling out, AWS also does not consider instances that are warming up as part
of the current capacity of the group. Therefore, multiple alarm breaches that fall in
the range of the same step adjustment result in a single scaling activity. This ensures
that we don't add more instances than you need.

For more information about EC2 Auto Scaling warm-up, see https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scaling-simple-step.html#as-step-scaling-warmup

Lifecycle hooks: events and actions

Scale Out
• Before (Auto Scaling action): 1. Launch the instance. 2. Send notification.
• User/programmed action: 3. Act upon the instance (e.g., install software).
• After: 4. Add it to the group.

Scale In
• Before (Auto Scaling action): 1. Remove the instance from the Auto Scaling group. 2. Send notification.
• User/programmed action: 3. Act upon the instance (e.g., retrieve logs).
• After: 4. Terminate it.

In some cases, you may need to intervene before an EC2 Auto Scaling action adds to
or subtracts from your EC2 Auto Scaling group. EC2 Auto Scaling group lifecycle hooks
make this easy.

For example, during a scale-out event, EC2 Auto Scaling will launch an instance and
send out a pre-configured notification to a person or application and take no further
action. The receiver of the notification can then perform an action on the instance,
such as install software on it, before adding it to the EC2 Auto Scaling group.

Alternatively, during a scale-in event, EC2 Auto Scaling lifecycle hooks can be used to
remove the instance from service and again send out a notification. Upon receipt of
the notification, the agent can perform an action on the instance, such as retrieve
logs, before terminating the instance.
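
As a sketch, a launch hook and its completion call might look like this (the hook name, timeout, and instance ID are placeholders):

# Pause newly launched instances before they enter service
aws autoscaling put-lifecycle-hook --lifecycle-hook-name install-software \
  --auto-scaling-group-name my-asg \
  --lifecycle-transition autoscaling:EC2_INSTANCE_LAUNCHING \
  --heartbeat-timeout 300

# Once the action (e.g., software install) finishes, let the instance proceed
aws autoscaling complete-lifecycle-action --lifecycle-hook-name install-software \
  --auto-scaling-group-name my-asg --instance-id i-1234567890abcdef0 \
  --lifecycle-action-result CONTINUE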

For more information about the EC2 Auto Scaling lifecycle, see
http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/AutoScalingGroupL
ifecycle.html.

 Used to distribute traffic across regions
◦ Can support high availability and lower latency
 A scalable Domain Name System (DNS) web service
◦ Register or transfer a domain name
◦ Resolves domain names to IP addresses
◦ Routes requests based on latency, health checks, weights, etc.

Elastic Load Balancing and Auto Scaling allow you to achieve highly flexible, scalable,
and resilient architectural designs. But what if you want to distribute traffic across
regions? There are various reasons you would want to distribute traffic across regions,
including disaster recovery (for widespread outages) and reduced latency (providing
services closer to where users are located).

Amazon Route 53 is a highly available and scalable cloud Domain Name System (DNS)
web service. It is designed to give developers and businesses an extremely reliable and
cost-effective way to route end users to internet applications by translating names like
www.example.com into the numeric IP addresses that computers use to connect to
each other, e.g., 192.0.2.1

Amazon Route 53 effectively connects user requests to infrastructure running in AWS,
such as Amazon EC2 instances, ELB load balancers, or Amazon S3 buckets, and can also
be used to route users to infrastructure outside of AWS. You can use Amazon Route 53
to configure DNS health checks to route traffic to healthy endpoints or to
independently monitor the health of your application and its endpoints. Amazon Route
53 makes it possible for you to manage traffic globally through a variety of routing
types, including latency-based routing, geo DNS, and weighted round robin—all of
which can be combined with DNS failover in order to enable a variety of low-latency,
fault-tolerant architectures. Amazon Route 53 also offers domain name registration;
you can purchase and manage domain names such as example.com, and Amazon
Route 53 will automatically configure DNS settings for your domains.

 Default ELB DNS name
◦ AWS assigns a hostname to your load balancer that resolves to a set of IP addresses
◦ Example: web-app.us-west-2.elb.amazonaws.com
 Associate your own DNS name
◦ Assign your own hostname via an alias resource record set, or create a CNAME
record pointing to your load balancer
◦ Example: example.com → ALIAS → web-app.us-west-2.elb.amazonaws.com
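As a sketch, the alias record might be created like this (the hosted zone IDs and names are placeholders; a load balancer's own hosted zone ID is returned by describe-load-balancers):

aws route53 change-resource-record-sets --hosted-zone-id Z1D633PJN98FT9 \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "example.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z1H1FL5HABSF5",
          "DNSName": "web-app.us-west-2.elb.amazonaws.com",
          "EvaluateTargetHealth": false
        }
      }
    }]
  }'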

Amazon Route 53 routing policies:

 Simple routing policy
 Weighted routing policy
 Latency routing policy
 Failover routing policy
 Multivalue answer routing policy
 Geolocation routing policy (DNS query location)
 Geoproximity routing policy (traffic flow to an AWS Region or latitude/longitude)

Simple routing policy
Use for a single resource that performs a given function for your domain, for
example, a web server that serves content for the example.com website.

Weighted routing policy
Use to route traffic to multiple resources in proportions that you specify.

Latency routing policy
Use when you have resources in multiple AWS Regions and you want to route traffic
to the region that provides the best latency.

Failover routing policy
Use when you want to configure active-passive failover.

Geolocation routing policy
Use when you want to route traffic based on the location of your users.

Geoproximity routing policy
Use when you want to route traffic based on the location of your resources and,
optionally, shift traffic from resources in one location to resources in another.

Multivalue answer routing policy
Use when you want Route 53 to respond to DNS queries with up to eight healthy
records selected at random.

[Diagram: a user's DNS query is routed to either some-elb-name.us-west-2.elb.amazonaws.com or some-elb-name.ap-southeast-2.elb.amazonaws.com, depending on latency.]

Assume that you want to distribute your architecture across several regions around the
world and provide the fastest response time. Often, but not always, the region closest
to the user provides the fastest response times.

You can use Amazon Route 53 to perform what is known as latency-based routing
(LBR), which allows you to use DNS to route user requests to the AWS Region that will
give your users the fastest response.

For example, assume that you have load balancers in the US West (Oregon) Region and
in the Asia Pacific (Sydney) Region, and you've created a latency resource record set in
Amazon Route 53 for each load balancer. A user in Barcelona enters the name of your
domain in a browser, and DNS routes the request to an Amazon Route 53 name server.
Amazon Route 53 refers to its data on latency between the different regions and routes
the request appropriately.

In most cases, this means your request is routed to the nearest geographical location:
Australia for a user in New Zealand, or Oregon for a user in Canada.
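
As a sketch, the Oregon latency record might look like this (the zone ID and TTL are placeholders; each record needs a unique SetIdentifier, and a matching record would be created for ap-southeast-2):

aws route53 change-resource-record-sets --hosted-zone-id Z1D633PJN98FT9 \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "www.example.com",
        "Type": "CNAME",
        "SetIdentifier": "oregon",
        "Region": "us-west-2",
        "TTL": 60,
        "ResourceRecords": [{"Value": "some-elb-name.us-west-2.elb.amazonaws.com"}]
      }
    }]
  }'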
Note that you now have all of the components of a scalable architecture, which
provides you with resiliency and scalability at different levels:
• Auto Scaling provides scalability of resources across subnets and Availability
Zones within an Amazon VPC.
• An Elastic Load Balancing load balancer handles addressing and health checks
across one or more Auto Scaling groups, routing requests to healthy instances.
• Amazon Route 53 can route traffic to the closest Elastic Load Balancing load
balancer and re-routes traffic to a different Amazon VPC or an entirely separate
AWS Region in the event of a slowdown or catastrophe in another region.

For more information about LBR, see http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html#routing-policy-latency.

[Diagram: users reach www.example.com through Amazon Route 53 weighted routing, which gradually shifts traffic from the load balancer of the existing (blue) system to the load balancer of the new (green) system.]

Here we see an example of a "blue/green" deployment. In this model, a parallel
environment, with its own load balancer and Auto Scaling configurations, is brought
up next to the existing environment. A feature of Amazon Route 53 called weighted
routing can be used to begin shifting users over from the existing (blue) environment
to the new (green) environment.

Technologies such as CloudWatch and CloudWatch Logs can be used to monitor the
green environment. If problems are found anywhere in the new environment,
weighted routing can be deployed to shift users back to the running blue servers.
When the new green environment is fully up and running without issues, the blue
environment can gradually be shut down. Due to the potential latency of DNS
records, a full shutdown of the blue environment can take anywhere from a day to a
week.
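
As a sketch, the blue record of a 90/10 weighted pair might look like this (the zone ID and record values are placeholders; a matching "green" record would carry Weight 10, and the weights are adjusted over time):

aws route53 change-resource-record-sets --hosted-zone-id Z1D633PJN98FT9 \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "www.example.com",
        "Type": "CNAME",
        "SetIdentifier": "blue",
        "Weight": 90,
        "TTL": 60,
        "ResourceRecords": [{"Value": "blue-elb.us-west-2.elb.amazonaws.com"}]
      }
    }]
  }'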

 Find problems earlier rather than later; simulate traffic with load testing
 Numerous open source tools are available:
◦ The Grinder
◦ Apache JMeter
◦ Bees with Machine Guns: an Amazon EC2-specific simple load testing script

The Grinder: http://grinder.sourceforge.net/
JMeter: http://jmeter.apache.org/
Bees with Machine Guns: https://github.com/newsapps/beeswithmachineguns

 Configure your compute environment using:
◦ Elastic Load Balancing
◦ Amazon Route 53
 Create and scale your environment using Amazon EC2 Auto Scaling
