AWS Game Tech - Intro Guide to Scalable Game Development

This guide provides an overview of scalable game development on Amazon Web Services (AWS), emphasizing the importance of a flexible and cost-effective cloud infrastructure for game studios of all sizes. It outlines key features and considerations for launching a game backend, including server architecture, scaling capabilities, and the use of various AWS services such as Elastic Beanstalk and Amazon EC2. The document also discusses the significance of player engagement features and analytics in modern game design.

Uploaded by

alex.unitydev

Introductory Guide to Scalable Game Development on AWS

No matter the size of your studio, there are challenges involved with launching a successful game, especially in the current games landscape.

Not only do you want your game to be compelling, but you also want to give players the wide range of online features they expect. This includes friend lists, leaderboards, weekly challenges, various multiplayer modes, ongoing content releases, and more.

To successfully execute a game launch, you need to create momentum. Favorable app store ratings and reviews on popular e-retail channels are critical for promoting awareness and boosting sales, just like with the first weekend of a movie release. To increase favorable ratings, it's important to deliver features that excite players. Supporting these features requires a server backend. The server backend can consist of the actual game servers for multiplayer games or servers that power game services like chat and matchmaking. In the event that your game goes viral and suddenly explodes from 100 to 100,000 players, you'll need a server backend that can scale up at a moment's notice. At the same time, you want a cost-effective solution, so you don't overpay for unused server capacity.

Amazon Web Services (AWS) is a flexible, cost-effective, easy-to-use cloud service. By running your game on AWS, you can use on-demand capacity to scale up and down with your players instead of guessing at server demands and potentially over- or under-purchasing hardware. Some of the world's leading mobile, AAA, and indie developers, including Rovio, Epic Games, and Gearbox Software, have recognized the advantages of AWS and are successfully running their games on the AWS Cloud.

This guide is broken into sections that cover the different features of modern games, including friend lists, leaderboards, game servers, messaging, and user-generated content. You can start small using the AWS components and services you need. Then, you can revisit this guide to evaluate additional AWS features as your game evolves.

Quick jump

1.0 Before you start
    Game design decisions
    Game client considerations

2.0 Launching a game backend on AWS
    Initial game backend
    High availability, scalability, and security
    Binary game data with Amazon S3
    Customized AWS Elastic Beanstalk environment
    Reference architecture for a full game backend

3.0 Scaling game servers on AWS
    Games as REST APIs
    HTTP load balancing
    Amazon EC2 Auto Scaling
    Game servers
    Matchmaking
    Targeted and group messages
    Final thoughts on game servers

4.0 Scaling data storage for games on AWS
    Relational vs. NoSQL databases
    MySQL
    Amazon Aurora
    Redis
    MongoDB
    Amazon DynamoDB
    Other NoSQL options
    Caching
    Binary game content with Amazon S3
    Content delivery and Amazon CloudFront
    Content upload with Amazon S3

5.0 Scaling game services as asynchronous jobs
    Leaderboards and avatars
    Amazon SQS
    FIFO queues
    Other queue options

6.0 Getting started

1.0 Before you start


Game design decisions

Modern social, mobile, and AAA games tend to share the following common tenets that affect server architecture.

DEVICE-TO-DEVICE PLAY
Players expect their saved games, profiles, and other data to be stored online, allowing them to easily move from device to device. This operation typically involves synchronizing and merging local data as you move from one device to another, so a local data storage solution isn't always the right fit.

LEADERBOARDS AND RANKINGS
Players continue to look for a competitive experience similar to classic arcade games. However, the focus is increasingly on friends' leaderboards rather than just a single global high score list. This requires a leaderboard that can sort in multiple dimensions while maintaining good performance.

FREE-TO-PLAY MODEL
One of the biggest shifts over the past few years has been the widespread move to the free-to-play model. In this model, games are free to download and play, and games earn money through advertising and in-app purchases (IAP) for items such as weapons, outfits, power-ups, and boost points. Free-to-play games are funded by a small group of players who purchase these items, while the vast majority of users play for free. This means your game backend needs to be as cost-effective as possible, and it must be able to scale up and down as needed. Even for premier AAA games, a large percentage of revenue comes from content updates and in-game purchases.

ANALYTICS
Maximizing long-tail revenue requires games to collect and analyze a large number of metrics regarding gameplay patterns, favorite items, purchase preferences, and more. To ensure the success of in-game purchases, it's important that new game features target those areas of the game where players are spending their time and money. Analytics can also provide insights on how to improve gameplay and drive more player engagement.

CONTENT UPDATES
Games that achieve the highest player retention tend to have a continuous release cycle of new items, levels, challenges, and achievements. The trend of games becoming more of a service than a single product reinforces the need for constant post-launch changes and frequent updates with new data and game assets. However, it's important to find a balance in how much new content you launch and when, so as not to overwhelm players. You can cut costs and increase download speed by using a content delivery network (CDN) to distribute this game content.

SYNCHRONOUS GAMEPLAY
Synchronous multiplayer features enable players to enjoy real-time interactions with other players. However, moderation is key here, as real-time interactions require a constant connection to the server, which can affect your game's performance.

ASYNCHRONOUS GAMEPLAY
While larger games generally include a real-time online multiplayer mode, developers of all kinds of games are realizing the importance of keeping players engaged with asynchronous features. An example of asynchronous play is competing against your friends by tracking points, unlocks, badges, or similar achievements. This gives players the feel of a connected game experience, even if they aren't online all the time or are using slower networks (like 3G or 4G) for mobile games.

PUSH NOTIFICATIONS
A common method for bringing players back to a game is to send targeted push notifications to their mobile devices. For example, a user might get a notification that their friend beat their score or that a new challenge or level is available, drawing them back into the core game experience.

UNPREDICTABLE CLIENTS
Modern games run on a wide variety of platforms, including mobile devices, consoles, PCs, and browsers. A player roaming on their portable device can compete against a console user on Wi-Fi, and both would expect a consistent experience. That's why it's necessary to use stateless protocols (for example, HTTP) and asynchronous calls as much as possible.

Each of these tenets has an impact on your server features and technology. For example, if you have a simple top 10 leaderboard, you might be able to store it in a single MySQL or Amazon Aurora database table. If, instead, you have complex leaderboards with multiple sort dimensions, it might be necessary to use a NoSQL technology, such as Amazon ElastiCache or Amazon DynamoDB (which are discussed later in this guide).
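
As a rough illustration of the simple case mentioned in this section (a top 10 leaderboard held in a single relational table), the sketch below uses Python's built-in sqlite3 module as a local stand-in for MySQL or Amazon Aurora. The table name, column names, and the rule of keeping only each player's best score are assumptions made for the example, not part of the guide.

```python
import sqlite3


def create_leaderboard(conn: sqlite3.Connection) -> None:
    """Create a single-table leaderboard (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS leaderboard ("
        "player_id TEXT PRIMARY KEY, score INTEGER NOT NULL)"
    )
    # An index on score keeps the top-N query cheap as the table grows.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_leaderboard_score ON leaderboard (score)"
    )


def submit_score(conn: sqlite3.Connection, player_id: str, score: int) -> None:
    """Record a score, keeping only the player's best result."""
    conn.execute(
        "INSERT OR IGNORE INTO leaderboard (player_id, score) VALUES (?, ?)",
        (player_id, score),
    )
    conn.execute(
        "UPDATE leaderboard SET score = ? WHERE player_id = ? AND score < ?",
        (score, player_id, score),
    )


def top_scores(conn: sqlite3.Connection, limit: int = 10) -> list:
    """Return the highest scores, best first; ties are ordered by player_id."""
    rows = conn.execute(
        "SELECT player_id, score FROM leaderboard "
        "ORDER BY score DESC, player_id ASC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()
```

A single sort key like this fits comfortably in one table. Friends' leaderboards or several sort dimensions are where the NoSQL options mentioned above start to look attractive.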

Game client considerations

This guide focuses on the architecture you can deploy on AWS. However, your game client implementation can also impact your game's scalability. And because frequent network requests from the client use more bandwidth and require more server resources, the client implementation also affects the cost of running your game backend. Here are a few important guidelines to follow:

• All network calls should be asynchronous and non-blocking. This means that when a network request is initiated, the game client continues on without waiting for a response from the server. When the server responds, this triggers an event on the client, which is handled by a callback of some kind in the client code. On iOS, AFNetworking is one popular approach. Browser games should use a call such as jQuery.ajax() or the equivalent. And C++ clients should consider libcurl, std::async, or similar libraries. Popular game engines usually include an asynchronous method for network and web requests. For example, Unity offers UnityWebRequest and Unreal Engine has HttpRequest.

• Use JSON to transport data. It's compact, cross-platform, fast to parse, has tons of library support, and contains data type information. If you have large payloads, use gzip format, as the majority of web servers and mobile clients have native support for gzip. Don't waste time with over-optimization: any payload in the range of hundreds of kilobytes should be adequate. Developers also use Apache Avro and MessagePack depending on their use case, comfort level with the formats, and library availability. (Note: An exception to this is multiplayer gameplay packets, which typically use the UDP protocol, but this is a separate topic.)

• Use HTTP/1.1 with keepalives, and reuse HTTP connections between requests. This minimizes the overhead your game incurs when making network requests. Each time you have to open a new HTTP socket, it requires a TCP three-way handshake, which can add upwards of 50 milliseconds. Additionally, repeatedly opening and closing TCP connections will accumulate large numbers of sockets in the TIME_WAIT state on your server, which consumes valuable server resources.

• Always POST any sensitive data from the client to the server over SSL. This includes login, stats, saved data, unlocks, and purchases. Because modern computers are efficient at handling SSL and the overhead is low, the same applies for any GET, PUT, and DELETE requests. AWS offers Elastic Load Balancing to handle the SSL workload, completely offloading it from your servers. Multiplayer traffic is generally transmitted over UDP, but it's encrypted and decrypted at each end by the developer.

• Never store security-critical data (like AWS access keys or other tokens) on the client device as part of your game data or user data. Possessors of access key IDs and secret access keys can make direct HTTP calls using the APIs for individual AWS services or programmatic calls to AWS from the AWS Command Line Interface (AWS CLI), AWS Tools for Windows PowerShell, or the AWS SDKs. If a person roots or jailbreaks their device, there's a risk that they can gain access to your server code, user data, and even your AWS billing account. With PC games, your keys likely exist in memory when the game client is running. Pulling those keys out isn't difficult for someone with the technical know-how. It's safe to assume that anything you store on a game client will be compromised. If you want your game client to directly access AWS services, consider using Amazon Cognito federated identities, which allow your application to obtain temporary, limited-privilege credentials.

• As a precaution, never trust what a game client sends you. It's an untrusted source, making it important to always validate what you receive. It can be something as trivial as a device clock set to a past time, but it can also be malicious traffic, such as SQL injection or XSS.

Many of these concerns are not specific to AWS and are typical client/server safety issues, but keeping them in mind will help you design a game that performs well and is reasonably secure.
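
Two of the guidelines in this section (gzip-compressed JSON payloads, and never trusting what the client sends) can be sketched together using only the Python standard library. The field names `player_id` and `score` and the `MAX_SCORE` limit are hypothetical, chosen just for illustration; a real game would validate against its own rules.

```python
import gzip
import json
import zlib

MAX_SCORE = 1_000_000  # hypothetical upper bound for a plausible score


def encode_payload(data: dict) -> bytes:
    """Client side: serialize to compact JSON, then gzip-compress."""
    text = json.dumps(data, separators=(",", ":"))
    return gzip.compress(text.encode("utf-8"))


def decode_and_validate(body: bytes) -> dict:
    """Server side: decompress, parse, and validate an untrusted payload.

    Raises ValueError for anything malformed or implausible.
    """
    try:
        data = json.loads(gzip.decompress(body).decode("utf-8"))
    except (OSError, EOFError, zlib.error, ValueError):
        raise ValueError("malformed payload") from None

    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")

    player_id = data.get("player_id")
    if not isinstance(player_id, str) or not player_id:
        raise ValueError("player_id must be a non-empty string")

    score = data.get("score")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(score, int) or isinstance(score, bool):
        raise ValueError("score must be an integer")
    if not 0 <= score <= MAX_SCORE:
        raise ValueError("score out of range")

    # Return only the fields we checked; ignore anything else the client sent.
    return {"player_id": player_id, "score": score}
```

Returning a fresh dictionary with only the validated fields is a deliberate choice here: unexpected extra keys from the client never reach the rest of the server code.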
2.0 Launching a game backend on AWS

To stay competitive, game studios need to get to market faster, keep costs low, and continue to innovate gameplay. The cloud supports these goals with on-demand services, removing the need to maintain your own server hardware. With AWS, studios of any size can scale and access tools that were previously only available to larger companies.

Let's look at a strategy for getting an initial game backend up and running on AWS as quickly as possible. We'll make use of a few key AWS services, with the ability to add more as our game evolves.

2.1 Initial game backend

To ensure your game can scale out as it grows in popularity, use stateless protocols as much as possible.

Creating an HTTP/JSON API for the bulk of your game features allows you to dynamically add instances and easily recover from transient network issues. Our game backend (Figure 1) consists of a server that talks in HTTP/JSON, stores data in MySQL, and uses Amazon Simple Storage Service (Amazon S3) for binary content. This type of backend is easy to develop and scales effectively.

A common pattern for game developers is to run a web server locally on a laptop or desktop for development and push the server code to the cloud when it's time to deploy. This pattern is best suited for stateless workloads (such as leaderboards, player data management, and more). It isn't the best option for stateful workloads like game servers. If you follow this pattern, AWS Elastic Beanstalk can significantly simplify the process of deploying your code to AWS.

AWS Elastic Beanstalk is a deployment management service that sits on top of other AWS services, including Amazon Elastic Compute Cloud (Amazon EC2), Elastic Load Balancing, and Amazon Relational Database Service (Amazon RDS). Amazon EC2 is a web service that provides secure, resizable compute capacity in the cloud. It's designed to make at-scale cloud computing easier for developers. The Amazon EC2 simple web service interface allows you to obtain and configure computing capacity with minimal friction. It reduces the time required to obtain and boot new server instances to merely minutes, allowing you to quickly scale up or down as your computing requirements change.

Figure 1: A high-level overview of an initial game backend running on AWS
(Diagram labels: Elastic Load Balancing; HTTP/JSON servers in Availability Zone A and Availability Zone B; MySQL: game data/state; Amazon S3: binary content.)
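
To make the idea of a stateless HTTP/JSON server more concrete, here is a minimal sketch written as a plain WSGI callable using only the Python standard library. The `/health` route and the response bodies are assumptions for illustration. A real game backend would add routes for login, leaderboards, and so on, and would read and write state in a database rather than in process memory.

```python
import json


def application(environ, start_response):
    """Minimal stateless HTTP/JSON handler (illustrative sketch).

    Nothing is kept in process memory between requests, so any number of
    identical server instances can sit behind a load balancer.
    """
    path = environ.get("PATH_INFO", "/")
    method = environ.get("REQUEST_METHOD", "GET")

    if method == "GET" and path == "/health":  # hypothetical route
        status, payload = "200 OK", {"status": "ok"}
    else:
        status, payload = "404 Not Found", {"error": "not found"}

    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

For local development, a callable like this can be served with `wsgiref.simple_server.make_server("127.0.0.1", 8000, application).serve_forever()` from the standard library, which matches the run-locally-then-deploy pattern described in this section.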

Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances. It enables you to achieve fault tolerance in your applications. Elastic Load Balancing offers two types of load balancers that feature high availability, automatic scaling, and robust security:

CLASSIC LOAD BALANCER
This load balancer routes traffic based on application- or network-level information. It's ideal for simple load balancing of traffic across multiple Amazon EC2 instances.

APPLICATION LOAD BALANCER
This load balancer routes traffic based on advanced application-level information that includes the content of the request. It's ideal for applications that need advanced routing capabilities, microservices, and container-based architectures.

Amazon RDS makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while automating time-consuming administration tasks, such as hardware provisioning, database setup, patching, and backups. Amazon RDS supports many familiar database engines, including Amazon Aurora, PostgreSQL, MySQL, and more.

You can push a zip, web application archive (WAR), or git repository of server code to AWS Elastic Beanstalk, which launches Amazon EC2 server instances, attaches a load balancer, sets up Amazon CloudWatch monitoring alerts, and deploys your application to the cloud. In short, AWS Elastic Beanstalk can automatically set up most of the architecture shown in Figure 1. This is covered in detail in the AWS Elastic Beanstalk Developer Guide.

To see AWS Elastic Beanstalk in action, sign in to the AWS Management Console and follow the Getting started using Elastic Beanstalk tutorial to create a new environment with the programming language of your choice. This will launch the sample application and boot a default configuration. You can use this environment to get a feel for the AWS Elastic Beanstalk control panel, how to update code, and how to modify environment settings. If you're new to AWS, you can use the AWS Free Tier to set up these sample environments. (Note: The sample production environment described in this guide will incur costs because it includes AWS resources that aren't covered under the AWS Free Tier.)

With the sample application up, we can create a new AWS Elastic Beanstalk application for our game and two new environments, one for development and one for production, that we'll customize for our game. Use the following table to determine which settings to change based on the environment type. For detailed instructions, see Managing and configuring Elastic Beanstalk applications. Then, follow the instructions for Creating an Elastic Beanstalk environment in the AWS Elastic Beanstalk Developer Guide.

In the following table, replace My game and mygame values with the name of your game.

Configuration setting         Development value   Production value
Application name              My game             My game
Environment name              mygame-dev          mygame-prod
Instance type                 t2.micro            m5.large
Create RDS DB instance        Yes                 Yes
DB engine                     MySQL               Not recommended
Instance class                db.t2.micro         N/A
Allocated storage             5 GB                N/A
Deletion policy               Delete              Create snapshot
Multiple Availability Zones   No                  Yes

Important
We don't recommend using AWS Elastic Beanstalk to manage your database in a production environment, as this ties the lifecycle of the database instance (DB instance) to the lifecycle of your application's environment.

Instead, we recommend running a DB instance in Amazon Aurora and configuring your application to connect to it on launch. You can also store connection information in Amazon S3 and configure AWS Elastic Beanstalk to retrieve that information during deployment with .ebextensions. You can add AWS Elastic Beanstalk configuration files (.ebextensions) to your web application's source code to configure your environment and customize the AWS resources it contains. Configuration files are YAML-formatted documents with a .config file extension that you place in a folder named .ebextensions and deploy in your application source bundle.

For more information, see Advanced environment customization with configuration files (.ebextensions) in the AWS Elastic Beanstalk Developer Guide.

By using two environments, you can enable a simple, effective workflow. As you integrate new game backend features, you push your updated code to the development environment. This triggers AWS Elastic Beanstalk to restart the environment and create a new version. In your game client code, create two configurations, one that points to development and one that points to production. Use the development configuration to test your game, and use the production configuration when you want to create a new game version to publish to the appropriate app stores.

When your new game client is ready for release, choose the correct server code version from the development environment. Then, deploy it to the production environment. By default, deployments incur a brief period of downtime while your app is being updated and restarted. To avoid downtime for production deployments, you can follow a pattern known as swapping URLs or blue/green deployment. In this pattern, you deploy to a standby production environment and update DNS to point to the new environment. For more details on this approach, see Blue/Green deployments with Elastic Beanstalk in the AWS Elastic Beanstalk Developer Guide.
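
The two client configurations described in this section (one pointing to development, one to production) could be sketched as follows. Python is used only for illustration, since a real client would express this in its engine's own language. The URLs are placeholders under example.com, and the `BackendConfig` and `select_config` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendConfig:
    """Where a given client build sends its requests."""
    name: str
    base_url: str


# Placeholder endpoints; substitute the URLs of your own environments.
CONFIGS = {
    "development": BackendConfig("development", "https://mygame-dev.example.com"),
    "production": BackendConfig("production", "https://mygame-prod.example.com"),
}


def select_config(build_type: str) -> BackendConfig:
    """Pick the backend for this build; fail loudly on an unknown build type."""
    try:
        return CONFIGS[build_type]
    except KeyError:
        raise ValueError(f"unknown build type: {build_type!r}") from None
```

Failing on an unknown build type, rather than silently defaulting, helps avoid shipping a store build that still talks to the development environment.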

High availability, scalability, and security

For the production environment, you should ensure that your game backend is deployed in a fault-tolerant manner. Amazon EC2 is hosted in multiple AWS Regions worldwide. Choose a Region that's near the bulk of your game's customers to provide a low-latency experience. For more information and a list of the latest AWS Regions, see the AWS Global Infrastructure web page.

Within each Region are multiple, isolated locations known as Availability Zones, which you can think of as logical data centers. Each of the Availability Zones within a given Region is isolated physically but connected via high-speed networking, so they can be used together (Figure 2).

Balancing servers across two or more Availability Zones within a Region is a simple way to increase your game's availability. You can maintain a good balance between reliability and cost by pairing server instances, database instances, and cache instances together.

AWS Elastic Beanstalk can automatically deploy across multiple Availability Zones. To use multiple Availability Zones with AWS Elastic Beanstalk, see Auto Scaling group for your Elastic Beanstalk environment in the AWS Elastic Beanstalk Developer Guide. For additional scalability, you can use Auto Scaling to add and remove instances from these Availability Zones. For best results, consider modifying the Auto Scaling trigger to specify a metric (such as CPU usage) and threshold based on your application's performance profile. If the specified threshold is reached, AWS Elastic Beanstalk automatically launches additional instances.

A single Availability Zone is usually adequate for development and test environments. This helps you keep costs low, assuming you can tolerate a bit of downtime in the event of a failure. However, if your development environment is used by QA testers late at night to validate builds, you probably want to treat this more like a production environment. In that case, you should use multiple Availability Zones.

Finally, set up the load balancer to handle Secure Sockets Layer (SSL) termination, so SSL encryption and decryption are offloaded from your backend servers. This is covered in Configuring HTTPS for your Elastic Beanstalk environment in the AWS Elastic Beanstalk Developer Guide. For security reasons, we strongly recommend using SSL for your game backend.

Figure 2: Diagram showing three Availability Zones (A, B, and C) connected within a Region
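
The metric-and-threshold trigger described in this section can be pictured with a small, self-contained sketch. This is not how AWS Elastic Beanstalk or Auto Scaling is implemented. It only illustrates the decision logic, and every threshold, size limit, and name below is an assumption chosen for the example.

```python
def desired_instances(
    current: int,
    cpu_percent: float,
    *,
    upper: float = 70.0,  # hypothetical scale-out threshold
    lower: float = 30.0,  # hypothetical scale-in threshold
    min_size: int = 2,    # e.g. one instance per Availability Zone
    max_size: int = 10,
    step: int = 1,
) -> int:
    """Return the instance count a simple threshold trigger would aim for."""
    if cpu_percent >= upper:
        target = current + step
    elif cpu_percent <= lower:
        target = current - step
    else:
        target = current
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, target))
```

For example, `desired_instances(2, 85.0)` asks for a third instance, while `desired_instances(2, 10.0)` stays at the minimum of two. Keeping a gap between the upper and lower thresholds avoids adding and removing instances in rapid succession when load hovers near a single cutoff.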

Binary game data with Amazon S3

The next step is to create an S3 bucket for each AWS Elastic Beanstalk server environment that you previously created. This S3 bucket stores your binary game content, such as patches, levels, and assets. Amazon S3 uses an HTTP-based API for uploading and downloading data. This means your game client can download game assets with the same HTTP library it uses to talk to your game server. With Amazon S3, you pay for the amount of data you store and the bandwidth for clients to download it. For more information, see Amazon S3 pricing.

To get started, create an S3 bucket in the same Region as your servers. For example, if you deployed AWS Elastic Beanstalk to the US West (Oregon) Region, choose this same Region for Amazon S3. For simplicity, and because S3 requires unique bucket names, name the bucket using a convention similar to the one you used for your AWS Elastic Beanstalk environment (for example, mygame-dev or mygame-prod) along with other unique identification (for example, com.mycompany.mygame-dev). For step-by-step directions, see Creating a bucket in the Amazon S3 Developer Guide. Remember to create a separate S3 bucket for each of your AWS Elastic Beanstalk environments (such as development and production).

By default, S3 buckets are private and require users to authenticate to download content for security purposes. For game content, a good way to manage authentication is to use signed URLs, which enable you to pass Amazon S3 credentials as part of the URL. In this scenario, your game server code redirects users to an Amazon S3 signed URL that you can set to expire after a specified period of time. For instructions on how to create a signed URL, see Authenticating Requests (AWS Signature Version 4) in the Amazon S3 API Reference.

If you're using one of the official AWS SDKs with your game server, there's a good chance the SDK has built-in methods for generating a presigned URL. A presigned URL gives you access to the object identified in the URL, provided that the creator of the presigned URL has permissions to access that object. Because generating a presigned URL is handled completely offline (no API calls are involved), it's a rapid operation.

As your game grows, you can use Amazon CloudFront, a CDN, to provide better performance and save money on data transfer costs. For more information, see What is Amazon CloudFront? in the Amazon CloudFront Developer Guide.

Customized AWS Elastic Beanstalk environment

As your game increases in popularity, your core game backend will need to scale and respond to demand. Using HTTP for the bulk of your calls enables you to easily scale up and down in response to changing usage patterns. Storing binary data in Amazon S3 saves costs compared to serving files from Amazon EC2. And Amazon S3 takes care of data availability and durability. Amazon RDS offers a managed MySQL database that can grow over time with Amazon RDS features, such as read replicas.

If your game needs additional functionality, you can easily expand beyond AWS Elastic Beanstalk to other AWS services without having to start over. To learn how AWS Elastic Beanstalk supports configuring other AWS services, see Adding and customizing Elastic Beanstalk environment resources in the AWS Elastic Beanstalk Developer Guide. For example, you can add a caching tier using Amazon ElastiCache, a managed cache service that supports both Memcached and Redis. For details about adding an ElastiCache cluster, see Example: ElastiCache in the AWS Elastic Beanstalk Developer Guide.

Of course, you can always just launch other AWS services yourself and configure your app to use those services. For example, you can augment or even replace your RDS MySQL DB instance with Amazon Aurora Serverless (an on-demand, auto-scaling SQL database) or Amazon DynamoDB (the AWS-managed NoSQL offering). Even though we're using AWS Elastic Beanstalk to get started, you may choose to access other AWS services as your game grows.
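
To show why an expiring signed URL for binary game content can be created offline, the sketch below implements a deliberately simplified scheme with the Python standard library: an HMAC over the object key and an expiry time. This is not AWS Signature Version 4 and will not work against Amazon S3. The secret, the query parameter names, and both function names are assumptions for illustration only.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"server-side-secret"  # hypothetical; stays on the server only


def _signature(object_key: str, expires_at: int) -> str:
    message = f"{object_key}\n{expires_at}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, object_key: str, expires_at: int) -> str:
    """Build an expiring URL. No network call is needed, only the secret."""
    query = urlencode(
        {"expires": expires_at, "signature": _signature(object_key, expires_at)}
    )
    return f"{base_url}/{object_key}?{query}"


def verify_url(url: str, now: int) -> bool:
    """Check the signature and expiry, as the storage side would."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    expected = _signature(parsed.path.lstrip("/"), expires_at)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(signature, expected)
```

In practice you would let an AWS SDK do this for you. With the AWS SDK for Python (boto3), for instance, the S3 client's `generate_presigned_url` method takes the operation name, its parameters, and an `ExpiresIn` value in seconds.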
Reference architecture for a full game backend

With your core game backend up and running, the next step is to examine other AWS services that could be useful for your game. Before continuing, let's review the following reference architecture for a horizontally scalable game backend (Figure 3). This diagram depicts a game backend that supports a broad set of game features, including login, leaderboards, challenges, chat, binary game data, user-generated content, analytics, and online multiplayer capabilities. Not all games have all of these components, but this diagram shows how they would all fit together. We'll cover each component in depth in the remaining sections of this guide.

Figure 3: A fully production-ready game backend running on AWS
(Diagram labels: client; Elastic Load Balancing (HTTP/S); stateful TCP sockets to stateful game servers; HTTP/JSON servers in Auto Scaling groups; ElastiCache for Redis; RDS MySQL; SQS for job queues; job workers in an Auto Scaling group; SNS for push messages; Amazon S3 for binary game assets; CloudFront CDN; Availability Zone A and Availability Zone B in a single Region (Oregon, Singapore, etc.). Numbered callouts 1 through 8 are explained in the key that follows.)


Figure 3 may seem overwhelming at first, but it's really just an evolution of the initial game backend launched using AWS Elastic Beanstalk. Look at a single Availability Zone in Figure 3 and compare it to the core game backend we launched with AWS Elastic Beanstalk. You can see how scaling your game builds on the initial backend pieces with the addition of caching, database replicas, and background jobs.

This key explains the numbers in the diagram:

1. The diagram shows two Availability Zones that are set up with identical functionality for redundancy. Due to space constraints, not all components are shown in both Availability Zones, but both would function the same way. These Availability Zones could be the same as the two Availability Zones you initially chose using AWS Elastic Beanstalk.

2. The HTTP/JSON servers and primary/standby databases can be the same ones you launched using AWS Elastic Beanstalk. Continue to build out as much of your game functionality in the HTTP/JSON layer as possible. You can use HTTP Auto Scaling to automatically add and remove Amazon EC2 HTTP instances in response to user demand. For more information on HTTP Auto Scaling, read the Games as REST APIs section of this guide.

4. If your game has features that require stateful sockets, such as chat or multiplayer gameplay, game servers typically run code specifically for those features. These servers run on Amazon EC2 instances separate from your HTTP instances. For more information on stateful game servers, read the Scaling game servers on AWS section of this guide.

5. As your game grows and your database load increases, the next step is to add caching. This is typically done using Amazon ElastiCache, the AWS-managed caching service. Caching frequently accessed items in ElastiCache offloads read queries from your database. For more information on caching data, read the Caching section of this guide.

7. As your database load continues to grow, you can add Amazon RDS read replicas to help scale out your database reads even further. Because you can read from the replica and you only access the master database to write, this also helps reduce the load on your main database. For more information on read replicas, read the Relational vs. NoSQL databases section of this guide.

7B. At some point, you may decide to introduce a NoSQL service, such as Amazon DynamoDB, to supplement your main database and support functionality (like leaderboards). Or, you may choose to take advantage of NoSQL features, such as atomic counters. For more information on NoSQL options, read the Relational vs. NoSQL databases section of this guide. (Note: This isn't shown in Figure 3.)

6.
The next step is to consider moving some of
your server tasks to asynchronous jobs. You can 8
If your game includes push notifications, you can
3
You can use the same S3 bucket you initially use Amazon Simple Queue Service (SQS) to use Amazon Simple Notification Service (SNS).
created for binary data. Amazon S3 is built coordinate this work. Amazon SQS eliminates Amazon SNS supports mobile push notifications
to be highly scalable and needs some tuning dependencies on the other components to simplify the process of sending push messages
over time. As your game assets and user traffic in a loosely coupled system. Two or more across multiple mobile platforms. Your Amazon
continue to expand, you can add Amazon components exist and interoperate to achieve EC2 instances can also receive Amazon SNS
CloudFront in front of S3 to boost download a specific purpose, each with little or no messages. This enables you to do things like
performance and save costs. knowledge of other participating components. broadcast messages to all players who are
For example, if your game allows players to currently connected to your game servers.
upload and share assets like photos or custom
characters, you should execute time-intensive
tasks, such as image resizing in a background
job. This will result in quicker response times
for your game while decreasing the load on
your HTTP server instances.
3.0 SCALING GAME SERVERS ON AWS

Flexibility is key. And AWS gives you that flexibility with game server compute options, including the ability to build your server, integrate existing tools, and even move to a fully managed service.

To give your players the best experience possible, even during peak hours, you need computing resources that can quickly ramp up and down to accommodate fluctuating player usage. AWS Game Tech offers a variety of global game server solutions to meet your computing needs. You can run your own orchestration for game servers that use Amazon EC2 or Amazon EC2 Spot Instances. You can deploy multiplayer game servers using container-based orchestration with managed services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Or, you can use a managed service with Amazon GameLift, which helps you deploy, operate, and scale dedicated game servers for session-based multiplayer games.

Amazon GameLift includes Amazon GameLift FleetIQ, which provides a logic layer for using low-cost Spot Instances for game hosting while managing hosting tasks. If you want to use your existing game server management systems, Amazon GameLift FleetIQ can be used independently of Amazon GameLift. Give players a great experience and save costs no matter which compute option you choose.
3.1 GAMES AS REST APIS

To make use of horizontal scalability, implement most of your game’s features using an HTTP/JSON API, which typically follows the REST architectural pattern.

Game clients, whether on mobile devices, tablets, PCs, or consoles, send HTTP requests to your servers for data, such as logins, sessions, friends, leaderboards, and trophies. Clients don’t maintain long-lived server connections. This makes it easy to scale horizontally by adding HTTP server instances. Clients can recover from network issues by simply retrying the HTTP request.

When properly designed, a REST API can scale to hundreds of thousands of concurrent players. RESTful servers are simple to deploy on AWS. And they benefit from the wide variety of HTTP development, debugging, and analysis tools available on AWS.

Nevertheless, some modes of gameplay, like real-time online multiplayer games, chat, and game invites, benefit from a stateful two-way socket that can receive server-initiated messages. If your game doesn’t have these features, you can implement all your functionality using a REST API. We’ll discuss stateful servers later in this guide. First, let’s focus on our REST layer.

Deploying a REST layer to Amazon EC2 typically consists of an HTTP server, such as Nginx or Apache, plus a language-specific application server. The following table lists some of the popular packages game developers use to build REST APIs:

Language    Package
Node.js     Express, Restify, Sails
Python      Eve, Flask, Bottle
Java        Spring, Jersey
Go          Gorilla Mux, Gin
PHP         Slim, Silex
Ruby        Rails, Sinatra, Grape

This is just a sampling. You can build a REST API in any web-friendly programming language. Amazon EC2 gives you complete root access to the instance, so you can deploy any of these packages. There are some restrictions on supported packages for AWS Elastic Beanstalk. For details, see the AWS Elastic Beanstalk FAQs.

Amazon API Gateway is a managed service for creating, publishing, maintaining, monitoring, and securing REST, HTTP, and WebSocket APIs at any scale. WebSocket APIs enable real-time communication between the server and client, making this an excellent choice for multiplayer games. For more information on how to use Amazon API Gateway to create WebSocket APIs, see our Amazon API Gateway guide.

RESTful servers benefit from medium-sized instances because more can be deployed horizontally at the same price point. General-purpose, medium-sized instances (for example, M5) or compute-optimized instances (for example, C5) are a good match for RESTful servers.

HTTP load balancing

Because HTTP connections are stateless, load balancing RESTful servers is straightforward.

AWS offers Elastic Load Balancing, which is the easiest approach to HTTP load balancing for games on Amazon EC2. AWS Elastic Beanstalk automatically deploys Elastic Load Balancing to load balance your Amazon EC2 instances. If you use AWS Elastic Beanstalk to get started, you’ll already have Elastic Load Balancing running.

Follow these guidelines to get the most out of Elastic Load Balancing:

• Always configure Elastic Load Balancing to balance between at least two Availability Zones for redundancy and fault tolerance. Elastic Load Balancing balances traffic between the EC2 instances in the Availability Zones you specify. If you want an equal distribution of traffic on servers, enable cross-zone load balancing, even if there are an unequal number of servers per Availability Zone. This ensures optimal usage of servers in your fleet.

• Configure Elastic Load Balancing to handle SSL encryption and decryption. This offloads SSL from your HTTP servers, meaning there’s more CPU for your application code. For more information, see Create a Classic Load Balancer with an HTTPS Listener in the Classic Load Balancers Guide.

• Elastic Load Balancing automatically removes any failed EC2 instances from the load balancing pool. To ensure the health of your HTTP EC2 instances is closely monitored, configure your load balancer with a custom health check URL. Then, write server code that responds to that URL and performs a check on your application’s health. For example, you can set up a simple health check that verifies you have database connectivity. The health check returns the HTTP status “200 OK” if your instance passes the check or “500 Server Error” if your instance is unhealthy.

• Each load balancer you deploy must have a unique Domain Name System (DNS) name. To set up a custom DNS name for your game, you can use a DNS alias (CNAME) to point your game’s domain name to the load balancer. For detailed instructions, see Configure a custom domain name for your Classic Load Balancer in the Classic Load Balancers Guide. Note that when your load balancer scales up or down, the IP addresses that the load balancer uses change. So, it’s important to use a DNS CNAME alias and to avoid referencing the load balancer’s current IP addresses in your DNS domain.

For more information, see What is Elastic Load Balancing?
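The custom health check described in the guidelines above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `/health` path, the `serve` helper, and the use of an in-memory `sqlite3` connection as a stand-in for your real database are all hypothetical choices, not part of Elastic Load Balancing itself.

```python
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer


def database_is_reachable(connect=lambda: sqlite3.connect(":memory:")):
    """Return True if a trivial query succeeds.

    `connect` is a hypothetical hook; swap in your real database driver.
    """
    try:
        conn = connect()
        try:
            conn.execute("SELECT 1")
        finally:
            conn.close()
        return True
    except Exception:
        return False


def health_response(db_ok):
    """Map the check result to the status code the load balancer sees."""
    return (200, "OK") if db_ok else (500, "Server Error")


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":  # assumed health check URL
            self.send_error(404)
            return
        status, body = health_response(database_is_reachable())
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the health endpoint; call this from your own entry point."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

In a real deployment, the health route would normally live inside your existing application server rather than in a separate process, so that a failing application also fails the check.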

Application Load Balancer

Our Application Load Balancer is a second-generation load balancer that provides more granular control over traffic routing at the HTTP/HTTPS layer. The following features that come with the Application Load Balancer can be highly beneficial for a gaming workload:

EXPLICIT SUPPORT FOR AMAZON ECS
The Application Load Balancer can be configured to load balance containers across multiple ports on a single EC2 instance. Dynamic ports can be specified in an ECS task definition, giving the container an unused port when scheduled on EC2 instances.

HTTP/2 SUPPORT
HTTP/2 (a revised edition of the older HTTP/1.1 protocol), together with the Application Load Balancer, delivers additional network performance as a binary protocol instead of a textual one. Binary protocols can improve stability, as they’re inherently more efficient to process and are much less prone to errors than textual protocols. And HTTP/2 supports multiplexing, which enables the reuse of TCP connections for downloading content from multiple origins. It also cuts down on network overhead.

NATIVE IPV6 SUPPORT
With the near exhaustion of IPv4 addresses, many application providers are changing to a model that rejects applications without IPv6 support. The Application Load Balancer natively supports IPv6 endpoints and routing to virtual private cloud (VPC) IPv6 addresses. Many platforms require IPv6 as a failback option.

WEBSOCKETS SUPPORT
Like HTTP/2, the Application Load Balancer supports WebSocket protocol, enabling you to set up a longstanding TCP connection between a client and server. This is a much more efficient method than standard HTTP connections that are usually held open with a sort of heartbeat that contributes to network traffic. WebSocket is a great use case for delivering dynamic data (like updated leaderboards) while minimizing traffic and power usage on mobile devices. Elastic Load Balancing enables the support of WebSockets by changing the listener from HTTP to TCP. In TCP mode, Elastic Load Balancing allows the Upgrade header through when a connection is established. Then, the load balancer terminates any connection that’s idle for more than 60 seconds (for example, when a packet isn’t sent within that timeframe). This means the client has to reestablish the connection. WebSocket negotiations fail if the load balancer sends an upgrade request and establishes a WebSocket connection to other backend instances.

If you need specific features or metrics that Elastic Load Balancing doesn’t provide, you can deploy your own load balancer to Amazon EC2. Popular choices for games include HAProxy and F5’s BIG-IP Virtual Edition, both of which can run on EC2.

If you decide to deploy your own load balancer, keep in mind that there are several aspects you need to handle on your own. First, if your load surpasses what your load balancer instances can handle, you need to launch additional EC2 instances. Second, new auto-scaled application instances aren’t automatically registered with your load balancer instances. So, you need to write a script that updates the load balancer configuration files and restarts the load balancers.

If you’re interested in HAProxy as a managed service, consider AWS OpsWorks, which uses Chef Automate to manage EC2 instances and can deploy HAProxy as an alternative to Elastic Load Balancing.
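If you run your own HAProxy instances, the configuration-update script mentioned in this section could start from something like the following sketch. It only renders a `backend` section from a list of instance IP addresses; the backend name, port, `/health` check path, and the `web1`, `web2` server labels are hypothetical, and discovering the instance list (for example, from the EC2 API) and reloading HAProxy are left out.

```python
def render_haproxy_backend(name, instance_ips, port=8080):
    """Render an HAProxy `backend` section for the given instance IPs.

    The IPs are sorted so that the same set of instances always produces
    the same text, which makes it easy to detect whether a reload is needed.
    """
    lines = [
        f"backend {name}",
        "    balance roundrobin",
        "    option httpchk GET /health",  # assumed health check path
    ]
    for index, ip in enumerate(sorted(instance_ips), start=1):
        lines.append(f"    server web{index} {ip}:{port} check")
    return "\n".join(lines) + "\n"


def needs_reload(current_config, new_config):
    """Only restart or reload the load balancer when the rendered text changed."""
    return current_config != new_config
```

A wrapper script would write the rendered text into the HAProxy configuration file and reload the service only when `needs_reload` returns True, which avoids dropping connections on every poll.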

Amazon EC2 Auto Scaling

The ability to dynamically grow and shrink your server resources in response to user patterns is a primary benefit of running on AWS.

Auto Scaling enables you to scale the number of EC2 instances in one or more Availability Zones based on system metrics like CPU utilization or network throughput. For an overview of Auto Scaling functionality, see What is Amazon EC2 Auto Scaling? in the Amazon EC2 Auto Scaling User Guide. Then, walk through Getting started with Amazon EC2 Auto Scaling.

You can use Auto Scaling with any type of EC2 instance, including HTTP, a game server, or a background worker. HTTP servers are the easiest to scale because they sit behind a load balancer that distributes requests across server instances. Auto Scaling dynamically handles the registration or deregistration of HTTP-based instances from Elastic Load Balancing. This means traffic will be routed to a new instance as soon as it’s available.

To use Auto Scaling effectively, choose good metrics to trigger scale-up and scale-down activities. Use the following guidelines to determine your metrics:

MONITOR CPUUTILIZATION
This is a good Amazon CloudWatch metric. Web servers tend to be CPU limited, whereas memory remains fairly constant when the server processes are running. A higher percentage of CPUUtilization tends to indicate the server is becoming overloaded with requests. For finer granularity, pair CPUUtilization with NetworkIn or NetworkOut.

BENCHMARK YOUR SERVERS
This helps you determine good values to scale on. For HTTP servers, you can use a tool like the Apache HTTP server benchmarking tool (ab) or httperf to measure server response times. Increase the load on your servers while monitoring CPU or other metrics. Then, make note of the point at which your server response times degrade, and see how it correlates to your system metrics.

USE TWO AVAILABILITY ZONES
You should also choose a minimum of two servers when configuring your Auto Scaling group. This will ensure your game server instances are properly distributed across multiple Availability Zones for high availability. Elastic Load Balancing takes care of balancing the load between multiple Availability Zones.

For details on configuring scaling policies, see Dynamic scaling for Amazon EC2 Auto Scaling in the Amazon EC2 Auto Scaling User Guide.
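As a rough illustration of the benchmarking guideline above, the sketch below takes load-test samples of CPU utilization and response time, finds the CPU level at which latency first exceeds a budget, and suggests a scale-out threshold somewhat below it. The function name, the sample format, and the 80 percent headroom factor are assumptions made for this example, not AWS recommendations.

```python
def pick_scale_out_threshold(samples, latency_budget_ms, headroom=0.8):
    """Suggest a CPUUtilization threshold from benchmark samples.

    samples: iterable of (cpu_percent, response_time_ms) pairs collected
             while increasing load on a single server.
    Returns the CPU level where latency first exceeds the budget, scaled
    down by `headroom`, or None if latency never exceeded the budget.
    """
    for cpu, latency in sorted(samples):
        if latency > latency_budget_ms:
            return round(cpu * headroom, 1)
    return None


# Example with made-up numbers from a hypothetical load test.
benchmark = [(20, 40), (40, 45), (60, 60), (75, 140), (90, 400)]
suggested = pick_scale_out_threshold(benchmark, latency_budget_ms=100)
```

Scaling out before the degradation point matters because new instances take time to launch and register with the load balancer.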

Installing application code

When you use Auto Scaling with AWS Elastic Beanstalk, it takes care of installing
your application code on new EC2 instances as they scale up. This is one of the
advantages of the managed container that AWS Elastic Beanstalk provides. However,
this approach is only for application servers, not game servers.

If you’re using Auto Scaling without AWS Elastic Beanstalk, you need to get your
application code onto your EC2 instances to implement automatic scaling. If you’re
not already using Chef or Puppet, consider using one of these tools to deploy
application code on your instances. AWS OpsWorks Auto Scaling uses Chef to
configure instances and offers a variant of Auto Scaling that provides both time-
based and load-based automatic scaling. With AWS OpsWorks, you can set up
custom start-up and shut-down steps for your instances as they scale. This is a
great alternative to managing automatic scaling when you’re already using Chef
or if you’re interested in using Chef to manage your AWS resources. For more
information, see Managing Load with Time-based and Load-based Instances in the
AWS OpsWorks User Guide.

If you’re not using any of these packages, you can use the Ubuntu cloud-init
package as a simple way to pass shell commands directly to EC2 instances. With
cloud-init, you can run a simple shell script that fetches the latest application code
and starts up the appropriate services. This solution is supported by the official
Amazon Linux AMI and the Canonical Ubuntu AMIs. For more details on these
approaches, see the AWS Architecture Center.
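To make the cloud-init approach more concrete, here is a minimal sketch that builds a user data shell script to pass to new instances at launch. Everything specific in it is an assumption for illustration: the bucket and key names, the `/opt/game-api` install path, and the service name are hypothetical, and the script presumes the AWS CLI is installed on the AMI and that the instance role can read the bucket.

```python
import shlex


def build_user_data(bucket, key, service):
    """Return a shell script that cloud-init can run on first boot.

    The script downloads an application archive from S3, unpacks it,
    and restarts the service that runs it.
    """
    archive = f"s3://{bucket}/{key}"
    return "\n".join([
        "#!/bin/bash",
        "set -euo pipefail",
        f"aws s3 cp {shlex.quote(archive)} /tmp/app.tar.gz",
        "mkdir -p /opt/game-api",          # hypothetical install path
        "tar -xzf /tmp/app.tar.gz -C /opt/game-api",
        f"systemctl restart {shlex.quote(service)}",
    ]) + "\n"


# Hypothetical values; pass the result as user data in your launch template.
script = build_user_data("example-game-builds", "api/latest.tar.gz", "game-api")
```

Because the script always fetches the latest archive, instances launched by Auto Scaling pick up the current build without baking a new AMI for every release.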
3.2 GAME SERVERS

There are some gameplay scenarios that work well with an event-driven RESTful model. For example, turn-based play and appointment games that don’t require constant real-time updates can be built as stateless game servers using the techniques mentioned in the previous section.

However, sometimes a game server’s approach needs to be the opposite of a RESTful approach. Clients establish a stateful two-way connection to the game server via UDP, TCP, or WebSockets, enabling both the client and server to initiate messages. If the network connection is interrupted, the client must perform reconnect logic and possibly logic to reset its state. Because clients can’t simply be round-robin load balanced across a pool of servers, stateful game servers introduce challenges for automatic scaling.

Historically, many games have used stateful connections and long-running server processes for game functionality, especially in the case of larger AAA and massively multiplayer online (MMO) games. If you have a game that’s architected in this manner, you can run it on AWS. For new games, however, we encourage you to use HTTP as much as possible for stateless functions. And we only recommend using stateful sockets (like UDP) for aspects of your game that really need it, such as online multiplayer.

The following table lists several packages that allow you to build event-driven servers:

Language    Package
Python      gevent, Twisted
Node.js     Core, Socket.io, Async
Erlang      Core
Java        JBoss, Netty
Ruby        EventMachine
Go          Socket.io

C++ isn’t listed in the table you see here because it tends to be the language of choice for multiplayer game servers. Many commercial game engines, such as Amazon Lumberyard and Unreal Engine, are written in C++. This enables you to take existing game code from the client and reuse it on the server. It’s particularly valuable when running physics or other frameworks on the server, such as Havok, which frequently only support C++.

Regardless of programming language, stateful socket servers generally benefit from as large an instance as possible because they’re more sensitive to issues like network latency. Consider your game server’s bandwidth requirements when determining the best Amazon EC2 instance type. The largest instances in the compute-optimized instance family (for example, C5) are often the best options. This new generation of instances uses enhanced networking via single-root I/O virtualization (SR-IOV). SR-IOV provides high packets per second, low latency, and low jitter, making this an ideal solution for game servers.

Matchmaking

Matchmaking is a feature that draws players in. The following steps outline the typical matchmaking process:

1. Ask the user about the type of game they would like to join (one-on-one or teams, for example).
2. Look at what game modes are currently being played online.
3. Factor in variables like the user’s geolocation (for latency) or ping time, language, and overall ranking.
4. Place the user on a game server that contains a matching game.

Game servers require long-lived processes, and they can’t be round-robin load balanced like with an HTTP request. After a player is on a given server, they remain on that server until the game is over, which could be minutes or hours.

In a modern cloud architecture, you should minimize your usage of long-running game server processes to the gameplay elements that require it. For example, imagine an open-world or MMO game. Some of the functionality, such as running around the world and interacting with other players, requires long-running game server processes. However, the rest of the API operations, like listing friends, altering inventory, updating stats, and finding games to play, can be easily mapped to a REST web API.

In this approach, game clients first connect to your REST API and request a stateful game server. Next, the REST API performs matchmaking logic and gives clients an IP address and server port to connect to. The game client then connects directly to that game server’s IP address. This hybrid approach gives you the best performance for your socket servers because clients can directly connect to the EC2 instances. And you still get the benefits of using HTTP-based calls for your main entry point.

For most matchmaking needs, Amazon GameLift provides a matchmaking system called FlexMatch. You can control GameLift FlexMatch via your REST API and make calls to the Amazon GameLift API to initiate matching and return results. You can learn more about FlexMatch in the Amazon GameLift Developer Guide. If this solution doesn’t suit your matchmaking needs, you can find more information about implementing matchmaking in a serverless custom environment in Fitting the Pattern: Serverless Custom Matchmaking with Amazon GameLift on the Amazon Game Tech Blog.

See how Metalhead Software uses AWS to power its matchmaking systems.
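The matchmaking logic that a REST API performs before handing a client an IP address and port can be sketched as follows. This is a simplified illustration, not how FlexMatch works: the `GameServer` fields, the rank-gap limit, and the preference for same-region servers are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class GameServer:
    ip: str
    port: int
    mode: str          # for example "1v1" or "teams"
    region: str        # used here as a rough proxy for latency
    average_rank: int
    open_slots: int


def find_server(servers, mode, region, rank, max_rank_gap=200):
    """Pick a server running the requested mode with room for one more player.

    Servers in the player's region are preferred; among those, the server
    whose average rank is closest to the player's rank wins.
    Returns None when nothing suitable is online.
    """
    candidates = [
        s for s in servers
        if s.mode == mode
        and s.open_slots > 0
        and abs(s.average_rank - rank) <= max_rank_gap
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda s: (s.region != region, abs(s.average_rank - rank)),
    )


def connection_info(server):
    """The payload a REST endpoint might return to the game client."""
    return {"ip": server.ip, "port": server.port}
```

When `find_server` returns None, a real backend would typically start a new game session or queue the player rather than fail the request.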

Targeted and group messages

There are two main categories of messages in gaming: messages targeted at a specific user (like private chat or trade requests) and group messages (such as chat or gameplay packets).

A common strategy for sending and receiving messages is to use a socket server with a stateful connection. If your player base is small enough that everyone can connect to a single server, you can route messages between players by selecting different sockets. However, in most cases, you need multiple servers, meaning those servers also need a way to route messages.

Amazon SNS can help route messages between EC2 server instances. For example, let’s assume player 1 on server A wants to send a message to player 2 on server C (Figure 4). In this scenario, server A can look at locally connected players. When server A can’t find player 2, it can forward the message to an SNS topic to propagate the message to other servers.

[Figure 4: SNS-backed player-to-player communication between two servers. Socket server instances A, B, and C are connected through an SNS topic between servers; player 1 and player 2 connect to different socket servers.]

In this scenario, Amazon SNS fills a role similar to a message queue like RabbitMQ or Apache ActiveMQ. Instead of using Amazon SNS, you can run RabbitMQ, Apache ActiveMQ, or a similar package on Amazon EC2. The advantage of Amazon SNS is that you don’t have to spend time administering and maintaining queue servers and software on your own. For more information about Amazon SNS, see What is Amazon SNS? and Creating an Amazon SNS topic in the Amazon Simple Notification Service Developer Guide.

Amazon SQS and Amazon SNS are AWS messaging services that provide different benefits to developers. A common pattern is to use Amazon SNS to publish messages to Amazon SQS queues to reliably and asynchronously send messages to one or many system components. See the Amazon SQS section later in this guide to learn more about Amazon SQS use cases for games.

Mobile push notifications

Unlike the previous use case, which is designed to handle near-real-time in-game messaging, mobile push is the best choice for sending a message to draw a user back in when they’re out of a game. An example might be a user-specific event (such as a friend beating your high score) or a broader game event (like a Double-XP Weekend).

Although Amazon SNS supports the ability to send push notifications directly to mobile clients, Amazon Pinpoint is a better option. Amazon Pinpoint provides more than just mobile push notifications. It’s a player-pleasing, multi-channel notification solution that includes email, voice messages, and SMS messaging.
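The routing decision described for Figure 4 (deliver locally if the recipient is connected to this server, otherwise forward through a topic) can be sketched as below. The classes are hypothetical and use an in-memory topic as a stand-in so the example runs without AWS; in a real deployment the `publish` callable would wrap the Amazon SNS publish operation (for example, the `publish` method of a boto3 SNS client, called with `TopicArn` and `Message`), and each server would receive topic messages through its own subscription.

```python
import json


class InMemoryTopic:
    """Stand-in for an SNS topic: fans each message out to all subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, raw_message):
        for callback in list(self.subscribers):
            callback(raw_message)


class SocketServer:
    def __init__(self, name, publish):
        self.name = name
        self.publish = publish
        # player_id -> delivered messages (a stand-in for a live socket)
        self.local_players = {}

    def connect(self, player_id):
        self.local_players[player_id] = []

    def send(self, sender, recipient, text):
        """Deliver locally when possible, otherwise forward to the topic."""
        if recipient in self.local_players:
            self.local_players[recipient].append((sender, text))
            return "local"
        envelope = {"from": sender, "to": recipient,
                    "text": text, "origin": self.name}
        self.publish(json.dumps(envelope))
        return "forwarded"

    def on_topic_message(self, raw_message):
        """Handle a message forwarded by another server."""
        message = json.loads(raw_message)
        if message["origin"] == self.name:
            return False  # ignore our own forwarded messages
        if message["to"] not in self.local_players:
            return False  # recipient is not connected to this server
        self.local_players[message["to"]].append(
            (message["from"], message["text"]))
        return True
```

Every server receives every forwarded message and drops the ones it can’t deliver, which keeps the servers unaware of each other at the cost of some extra traffic.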

Final thoughts on game servers

It’s easy to become obsessed with finding the perfect programming framework or
pattern. Both RESTful and stateful game servers have their place. And any of the
languages discussed in this guide work well when programmed thoughtfully. When
making your choice, consider your overall game data architecture—where data lives,
how to query it, and how to efficiently update it.
4.0 SCALING DATA STORAGE FOR GAMES ON AWS

Supporting a global audience of millions of online players with request rates that easily reach millions per second means you need the ability to accommodate significant spikes in traffic.
4.1 RELATIONAL VS. NOSQL DATABASES

With modern game applications that scale horizontally and globally with your players, the traditional approach of using a single, large relational database becomes less tenable.

It’s important to spend time thinking about your overall game data architecture: where data lives, how to query it, and how to efficiently update it. A number of new databases have become popular that eschew traditional atomicity, consistency, isolation, and durability (ACID) concepts in favor of lightweight access, distributed storage, and eventual consistency. These NoSQL databases can be especially beneficial for games, where data structures tend to be lists and sets (like friends, levels, and items) as opposed to complex relational data.

As a general rule, the biggest bottleneck for online games tends to be database performance. A typical web-based app has a high number of reads and few writes (think of reading blogs, watching videos, and so forth). Games are quite the opposite, with reads and writes frequently hitting the database due to constant state changes in the game.

There are many database options out there for both relational and NoSQL flavors, but the ones used most frequently for games on AWS are Amazon Aurora, Amazon ElastiCache for Redis, Amazon DynamoDB, Amazon RDS for MySQL, and Amazon DocumentDB (with MongoDB compatibility).

First, we’ll cover MySQL because it’s both popular and applicable to gaming. Combinations such as MySQL and Redis or MySQL and Amazon DynamoDB are especially successful on AWS. All database alternatives described in this section support atomic operations, such as increment and decrement, which are crucial for gaming.
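To show why atomic increment and decrement matter for constantly changing game state, the sketch below updates a score in a single SQL statement instead of reading the value, changing it in application code, and writing it back. It uses Python’s built-in `sqlite3` module purely as a runnable stand-in for a relational database such as MySQL; the `players` table and its columns are hypothetical.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with one hypothetical player row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE players (id INTEGER PRIMARY KEY, score INTEGER NOT NULL)")
    conn.execute("INSERT INTO players (id, score) VALUES (1, 100)")
    conn.commit()
    return conn


def add_score(conn, player_id, delta):
    """Atomically add `delta` (which may be negative) to a player's score.

    The database applies `score = score + ?` itself, so two servers
    updating the same player at once can't overwrite each other's change.
    Returns False when the player doesn't exist.
    """
    with conn:  # commit on success, roll back on error
        cursor = conn.execute(
            "UPDATE players SET score = score + ? WHERE id = ?",
            (delta, player_id),
        )
    return cursor.rowcount == 1


def get_score(conn, player_id):
    row = conn.execute(
        "SELECT score FROM players WHERE id = ?", (player_id,)).fetchone()
    return None if row is None else row[0]
```

The same idea applies to NoSQL counters: the store performs the arithmetic, so the application never acts on a stale copy of the value.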

MySQL

MySQL is the most widely adopted open-source relational database. With more than 20 years of community-backed development and support, MySQL is a reliable, stable, and secure SQL-based database management system.

As an ACID-compliant relational database, MySQL has the following advantages:

TRANSACTIONS
MySQL provides support for grouping multiple changes into a single atomic transaction that must be committed or rolled back. NoSQL stores typically lack multi-step transactional functionality.

ADVANCED QUERYING
MySQL speaks SQL, which provides the flexibility to perform complex queries that evolve over time. NoSQL databases typically only support key-value access or access by a single secondary index, meaning you must make careful data design decisions up front.

SINGLE SOURCE OF TRUTH
MySQL guarantees internal data consistency. Part of what makes many NoSQL solutions faster is distributed storage and eventual consistency. Eventual consistency means you can write a key on one node, fetch that key on another node, and have it not appear there immediately.

EXTENSIVE TOOLS
MySQL has been around since the 1990s, and there are extensive debugging and data analysis tools available for it. In addition, SQL is a general-purpose language that’s widely understood.

These advantages continue to make MySQL attractive, especially for aspects of gaming like account records, in-app purchases (IAPs), and similar functionality where transactions and data consistency are paramount. Even gaming companies using NoSQL offerings, such as Redis and Amazon DynamoDB, frequently put transactional data (like accounts and purchases) in MySQL.

If you’re using MySQL on AWS, we recommend choosing Amazon RDS to host MySQL. This can save you valuable deployment and support cycles. Amazon RDS for MySQL automates the time-consuming aspects of database management, like launching Amazon EC2 instances, configuring MySQL, attaching Amazon Elastic Block Store (Amazon EBS) volumes, setting up replication, running nightly backups, and so on. In addition, Amazon RDS offers advanced features, including synchronous Multi-AZ replication for high availability, automated primary/standby failover, and read replicas for increased performance. To get started with Amazon RDS, see Getting Started with RDS in the Amazon RDS User Guide.
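The transaction advantage described in this section is easiest to see with a purchase that touches two tables. The sketch below debits a player’s coins and grants an item as one unit: if either step fails, neither change is kept. It uses Python’s built-in `sqlite3` module only as a runnable stand-in for MySQL, and the `accounts` and `inventory` tables are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create hypothetical accounts and inventory tables in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (player_id INTEGER PRIMARY KEY, coins INTEGER NOT NULL)")
    conn.execute(
        "CREATE TABLE inventory (player_id INTEGER, item TEXT, "
        "PRIMARY KEY (player_id, item))")
    conn.execute("INSERT INTO accounts (player_id, coins) VALUES (1, 100)")
    conn.commit()
    return conn


def purchase_item(conn, player_id, item, price):
    """Debit coins and grant the item in a single transaction.

    Returns True if both changes were committed, False if the purchase
    was rejected and everything was rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            cursor = conn.execute(
                "UPDATE accounts SET coins = coins - ? "
                "WHERE player_id = ? AND coins >= ?",
                (price, player_id, price),
            )
            if cursor.rowcount != 1:
                raise ValueError("unknown player or not enough coins")
            # Fails with IntegrityError if the player already owns the item,
            # which also undoes the debit above.
            conn.execute(
                "INSERT INTO inventory (player_id, item) VALUES (?, ?)",
                (player_id, item),
            )
        return True
    except (ValueError, sqlite3.Error):
        return False


def coins(conn, player_id):
    return conn.execute(
        "SELECT coins FROM accounts WHERE player_id = ?", (player_id,)
    ).fetchone()[0]
```

The duplicate-item case is the interesting one: the debit succeeds first, the insert fails, and the rollback restores the player’s balance.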

The following are some configuration options that we recommend you implement when you create your RDS MySQL DB instances:

DB INSTANCE CLASS
• You should use a micro instance for development/test environments and medium or larger instances for production environments.

MULTI-AZ DEPLOYMENT
• This isn’t needed for development/test environments, but it’s recommended for production environments to enable synchronous Multi-AZ replication and failover. For best performance, always launch production on an RDS DB instance that’s separate from any of your Amazon RDS development/test DB instances.

AUTO MINOR VERSION UPGRADE
• This is recommended for hands-off upgrades.

ALLOCATED STORAGE
• We recommend 5 GB of storage in development/test environments and 100 GB minimum in production environments to enable provisioned IOPS.

PROVISIONED IOPS
• This is recommended for production environments. Provisioned IOPS guarantees a certain level of disk performance, which is important for large write loads. For more information, see Provisioned IOPS Storage in the Amazon RDS User Guide.

BACKUP SNAPSHOTS
• It’s best to schedule Amazon RDS backup snapshots and upgrades during your low player count times, such as early in the morning. If possible, avoid running background jobs or nightly reports during this window to prevent a query backlog.

SLOW SQL QUERIES
To find and analyze slow SQL queries in production, you should enable the MySQL slow query log in Amazon RDS (as shown in the following list). These settings are configured using Amazon RDS DB parameter groups. (Note: There’s a minor performance penalty for the slow query log.)

• Set slow_query_log = 1 to enable. In Amazon RDS, slow queries are written to the mysql.slow_log table.
• Consider decreasing the default long_query_time value to 5, 3, or even 1. (The default is 10.) Only queries that take longer than the number of seconds set in long_query_time are included in the log.
• Make sure to periodically rotate the slow query log as described in Common DBA Tasks for MySQL DB Instances in the Amazon RDS User Guide.

As your game grows and your write load increases, resize your RDS DB instances to scale up. Resizing an RDS DB instance requires some downtime. However, if you deploy the instance in Multi-AZ mode (as you would for production), downtime is limited to the time it takes to initiate a failover (typically a few minutes). For more information, see Modifying an Amazon RDS DB Instance in the Amazon RDS User Guide. In addition, you can add one or more Amazon RDS read replicas to offload reads from your primary RDS instance, leaving more cycles for database writes. For instructions on deploying replicas with Amazon RDS, see Working with Read Replicas.
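The slow query log settings discussed in this section are applied through a DB parameter group. As a hedged sketch, the code below builds the parameter list and passes it to the `modify_db_parameter_group` operation of the boto3 RDS client. The helper names and the parameter group name are hypothetical, including `log_output` is an assumption made so that entries land in the slow log table, and the sketch assumes boto3 is installed with credentials configured and that you use a custom (not default) parameter group.

```python
def slow_query_log_parameters(long_query_time_seconds=1):
    """Build the parameter list for enabling the MySQL slow query log."""
    if long_query_time_seconds <= 0:
        raise ValueError("long_query_time must be positive")
    settings = {
        "slow_query_log": "1",
        "long_query_time": str(long_query_time_seconds),
        "log_output": "TABLE",  # assumption: write entries to mysql.slow_log
    }
    return [
        {"ParameterName": name, "ParameterValue": value,
         "ApplyMethod": "immediate"}
        for name, value in settings.items()
    ]


def apply_to_parameter_group(group_name, parameters):
    """Send the parameters to Amazon RDS (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above runs without boto3

    rds = boto3.client("rds")
    return rds.modify_db_parameter_group(
        DBParameterGroupName=group_name,
        Parameters=parameters,
    )


# Hypothetical usage:
# apply_to_parameter_group("game-mysql-params", slow_query_log_parameters(1))
```

Starting with a higher `long_query_time` and lowering it gradually keeps the log manageable while you work through the slowest queries first.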

Amazon Aurora

Amazon Aurora is a MySQL-compatible relational database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases.

There are several key features that Amazon Aurora brings to a gaming workload:

HIGH PERFORMANCE
Amazon Aurora is designed to provide up to five times the throughput of standard MySQL running on the same hardware. This performance is on par with commercial databases but at a significantly lower cost. On the largest Amazon Aurora instances, it’s possible to provide up to 500,000 reads and 100,000 writes per second with 10 milliseconds of latency between read replicas.

DATA DURABILITY
In Amazon Aurora, each 10 GB chunk of your database volume is replicated six ways across three Availability Zones. This allows for the loss of two copies of data without affecting database write availability and three copies without affecting read availability. Backups to Amazon S3 are automatic and continuous, offering 99.999999999 percent durability with a retention period of up to 35 days. You can restore your database to any second (up to the last five minutes) during the retention period.

SCALABILITY
Amazon Aurora is capable of automatically scaling its storage subsystem out to 64 TB of storage. This storage is automatically provisioned for you, so you don’t have to provision storage ahead of time. This means you pay only for what you use, reducing the costs of scaling. Amazon Aurora can deploy up to 15 read replicas in any combination of Availability Zones, including cross-region where Amazon Aurora is available. This allows for seamless failover in case of an instance failure.

The following are recommendations for using Amazon Aurora in your gaming workload:

Use a t2.small DB instance in your development/test environments and an r3.large or larger instance in your production environment.

Deploy read replicas in at least one additional Availability Zone to provide failover and read operation offloading.

Schedule Amazon RDS backup snapshots and upgrades during low player count times. If possible, avoid running jobs or reports against the database during this window to prevent backlogging.

If your game grows beyond the bounds of a traditional relational database, like MySQL or Amazon Aurora, we recommend that you complete a performance evaluation, including tuning parameters and sharding. Also consider a NoSQL offering, such as Redis or Amazon DynamoDB, to offload some workloads from MySQL. In the following sections, we’ll cover a few popular NoSQL offerings.

See how The Pokémon Company International uses Aurora and experiences reduced downtime.

Redis

Best described as an atomic data structure server, Redis has unique features not found in other databases.

Redis provides foundational data types (including counters, lists, sets, and hashes), which are accessed using a high-speed, text-based protocol. For more details, see Redis data types documentation and An introduction to Redis data types and abstractions. These unique data types make Redis an ideal choice for leaderboards, game lists, player counts, stats, inventories, and similar data. Redis keeps its entire data set in memory, so access is extremely fast. For comparisons with Memcached, check out the Redis benchmarks in How fast is Redis?

There are a few caveats concerning Redis that you should be aware of. First, you need a large amount of physical memory because the entire dataset is memory resident (that is, there’s no virtual memory support). Replication support is also simplistic, and debugging tools for Redis are limited. Redis isn’t suitable as your only data store. However, when used in conjunction with a disk-backed database (such as MySQL or Amazon DynamoDB), Redis can provide a highly scalable solution for game data. Redis plus MySQL is a popular solution for gaming.

Redis uses minimal CPU but a lot of memory. That’s why it’s best suited for high-memory instances, such as the Amazon EC2 memory-optimized instance family (that is, R3). AWS offers a fully managed Redis service, Amazon ElastiCache for Redis. Amazon ElastiCache for Redis handles clustering, primary/standby replication, backups, and many other common Redis maintenance tasks.

For a deep dive on how to get the most out of ElastiCache, see the Performance at Scale with Amazon ElastiCache whitepaper.

Benefits of Redis

• In-memory data store
• Flexible data structures
• Simplicity and ease of use
• Replication and persistence
• High availability and scalability
• Extensibility

MongoDB

MongoDB is a document-oriented database. This means data is stored in a nested data structure similar to a structure you would use in a typical programming language.

For communication, MongoDB uses a binary variant of JSON called Binary JSON (BSON). Programming against it is a matter of storing and retrieving JSON structures. This has made MongoDB a popular choice for games and web applications because server APIs are usually JSON too.

MongoDB also offers a number of interesting hybrid features, including SQL-like syntax that enables you to query data by range and composite conditions. Like Redis, MongoDB supports atomic operations, such as increment/decrement and add/remove from list. For examples of these operations, see the MongoDB findAndModify documentation.

MongoDB is widely used as a primary data store for games and is frequently used in conjunction with Redis because the two complement each other well. Transient game data, sessions, leaderboards, and counters are kept in Redis. Progress is saved to MongoDB at logical points (for example, at the end of a level or when a new achievement is unlocked). Redis yields high-speed access for latency-sensitive game data, and MongoDB provides simplified persistence.

MongoDB supports native replication and sharding as well, although you have to configure and monitor these features yourself. For an in-depth look at deploying MongoDB on AWS, see our MongoDB on AWS whitepaper.

Amazon DocumentDB (with MongoDB compatibility) is a fully managed document database service that supports MongoDB workloads. It’s designed for high availability and performance at scale and is highly secure.

Amazon DynamoDB

Amazon DynamoDB is a fully managed NoSQL solution provided by AWS.

Amazon DynamoDB manages tasks like synchronous replication and I/O provisioning, automatic scaling, and managed caching. Amazon DynamoDB uses a provisioned throughput model in which you specify the number of reads and writes you want per second. The rest is handled for you under the hood. To set up Amazon DynamoDB, see Getting Started with DynamoDB in the Amazon DynamoDB Developer Guide.

Games frequently use Amazon DynamoDB features in the following ways:

Key-value store for user data, items, friends, and history

Range key store for leaderboards, scores, and date-ordered data

Atomic counters for game status, user counts, and matchmaking

Like MongoDB and MySQL, Amazon DynamoDB can be paired with a technology such as Redis to handle real-time sorting and atomic operations. Many game developers find Amazon DynamoDB to be sufficient on its own, but you still have the flexibility to add Redis or a caching layer to a DynamoDB-based architecture. Let’s review our reference diagram with Amazon DynamoDB to see how it simplifies the architecture (Figure 5).

See how CAPCOM uses DynamoDB.



Figure 5: A production-ready game backend running on AWS using Amazon DynamoDB. The diagram shows a single Region (Oregon, Singapore, etc.) with two Availability Zones: Elastic Load Balancing and the CloudFront CDN in front of stateful game servers and Auto Scaling groups of HTTP/JSON servers, backed by ElastiCache for Redis, Amazon DynamoDB for game data, Amazon S3 for binary game assets, Amazon SQS for job queues, Amazon SNS for push messages, and an Auto Scaling group of job workers.

Table structure and queries

Amazon DynamoDB, like MongoDB, is a loosely structured NoSQL data store that allows you to save different sets of attributes on a per-record basis. All you need to do is predefine the primary key strategy you want to use:

PARTITION KEY
The partition key is a single attribute that Amazon DynamoDB uses as input to an internal hash function. This could be a player name, game ID, UUID, or similar unique key. Amazon DynamoDB builds an unordered hash index on this key.

PARTITION KEY AND SORT KEY
Referred to as a composite primary key, this type of key is composed of two attributes: the partition key and the sort key. Amazon DynamoDB uses the partition key value as input to an internal hash function, and all items with the same partition key are stored together and sorted by sort key value. For example, you can store game history as a duplet of [user_id, last_login]. Amazon DynamoDB builds an unordered hash index on the partition key attribute and a sorted range index on the sort key attribute. Only the combination of both keys is unique in this scenario.

For best querying performance, maintain each Amazon DynamoDB table at a manageable size. For example, if you have multiple game modes, it’s better to have a separate leaderboard table for each game mode instead of one giant table. This gives you the flexibility to scale your leaderboards separately in the event that one game mode is more popular than the others.
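As a rough illustration of the composite primary key and the one-table-per-game-mode advice, the sketch below builds the parameters for a DynamoDB CreateTable request. The table naming scheme, the helper name, and the attribute types are assumptions made for this example; only the [user_id, last_login] pair comes from the text. The dictionary follows the shape accepted by the CreateTable API (KeySchema with HASH and RANGE entries), but nothing here calls AWS.

```python
from typing import Any, Dict, Optional


def game_history_table_definition(
    game_mode: str,
    partition_key: str = "user_id",
    sort_key: Optional[str] = "last_login",
) -> Dict[str, Any]:
    """Build CreateTable parameters for one table per game mode.

    The table name pattern and attribute types are illustrative assumptions.
    """
    # HASH marks the partition key; DynamoDB hashes it to pick a partition.
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attribute_definitions = [
        {"AttributeName": partition_key, "AttributeType": "S"}
    ]
    if sort_key is not None:
        # RANGE marks the sort key; items sharing a partition key are
        # stored together, ordered by this attribute.
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append(
            {"AttributeName": sort_key, "AttributeType": "N"}
        )
    return {
        "TableName": f"game_history_{game_mode}",
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        # Matches the default of five read and five write units cited in this guide.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": 5,
            "WriteCapacityUnits": 5,
        },
    }


if __name__ == "__main__":
    for mode in ("ranked", "casual"):
        print(game_history_table_definition(mode))
```

With boto3 installed and credentials configured, a dictionary like this could be passed as keyword arguments to the DynamoDB client’s create_table call. Keeping one definition per game mode lets each table’s throughput be tuned independently.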

Provisioned throughput

Amazon DynamoDB shards your data behind the scenes to give you the throughput you requested using the concept of read and write units. One read capacity unit represents one strongly consistent read per second (or two eventually consistent reads per second) for an item up to 4 KB in size. One write capacity unit represents one write per second for an item up to 1 KB in size. The defaults are five read and five write units, equating to 20 KB of strongly consistent reads per second and 5 KB of writes per second. You can increase your read and/or write capacity at any time and by any amount up to your account limits. You can also decrease the read and/or write capacity by any amount, but you’re limited to four decreases in one day. You can scale using the AWS Management Console or AWS CLI by selecting the table and modifying it appropriately. You can also take advantage of the Amazon DynamoDB Auto Scaling service to dynamically adjust provisioned throughput capacity on your behalf in response to actual traffic patterns. Amazon DynamoDB Auto Scaling works in conjunction with Amazon CloudWatch alarms that monitor the capacity units (Figure 6). This service scales according to your defined rules.

There’s a delay before the new provisioned throughput is available while data is repartitioned in the background. This doesn’t cause downtime, but it does mean that Amazon DynamoDB Auto Scaling is best suited for changes over time, such as the number of players increasing from 1,000 to 10,000. It’s not designed to handle hourly user spikes. For this, as with other databases, you need some form of caching to add resiliency.

To get the best performance from Amazon DynamoDB, make sure your reads and writes are spread as evenly as possible across your keys. Using a hexadecimal string, such as a hash key or checksum, is one easy strategy to inject randomness.

For more details on optimizing Amazon DynamoDB performance, see Best Practices for Designing and Architecting with DynamoDB in the Amazon DynamoDB Developer Guide.

Amazon DynamoDB on-demand

When you’re not sure how much read and write capacity you’re going to need for your DynamoDB table, you can use Amazon DynamoDB on-demand. You pay for what you end up using without explicitly setting the capacity you’ll use. So, when you’re first launching your new game or new content, you can absorb much of the unpredictable nature of player activity without the risk of limited capacity or slow Auto Scaling responses (or even wasted, over-provisioned capacity). Once you know your players’ read and write patterns, you can always switch back to provisioned capacity during normal operations to save costs and switch again to the on-demand option for event releases.

Amazon DynamoDB Accelerator

Amazon DynamoDB Accelerator (DAX) allows you to provision a fully managed, in-memory cache that speeds up the responsiveness of your Amazon DynamoDB tables from millisecond-scale latency to microseconds. This acceleration comes without the need for any major changes in your game code, which simplifies deployment. All you need to do is reinitialize your Amazon DynamoDB client with a new endpoint that points to DAX, and the rest of the code can remain untouched. DAX handles cache invalidation and data population without your intervention. This cache can help speed responsiveness when running events that might cause a spike in players, such as a seasonal downloadable content (DLC) offering or a new patch release.

Figure 6: A high-level overview of how Amazon DynamoDB Auto Scaling manages throughput capacity for a table. The diagram shows the DynamoDB table, Amazon CloudWatch, Amazon SNS, and Application Auto Scaling, which issues the table update.
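The read and write capacity unit definitions from the provisioned throughput discussion can be turned into a small estimate. The following is a minimal sketch under the stated rules (4 KB per read unit, 1 KB per write unit, eventually consistent reads costing half); the function names and the sample workload are assumptions for illustration, not part of any AWS SDK.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read capacity unit covers an item up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # one write capacity unit covers an item up to 1 KB


def read_capacity_units(
    item_size_bytes: int,
    reads_per_second: int,
    strongly_consistent: bool = True,
) -> int:
    """Estimate the read capacity units needed for a steady read rate."""
    # Every read is rounded up to the next 4 KB boundary (at least one unit).
    units_per_read = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        # Two eventually consistent reads fit in one unit.
        total = math.ceil(total / 2)
    return total


def write_capacity_units(item_size_bytes: int, writes_per_second: int) -> int:
    """Estimate the write capacity units needed for a steady write rate."""
    # Every write is rounded up to the next 1 KB boundary (at least one unit).
    units_per_write = max(1, math.ceil(item_size_bytes / WRITE_UNIT_BYTES))
    return units_per_write * writes_per_second


if __name__ == "__main__":
    # Hypothetical workload: 6 KB player records, 10 reads and 10 writes per second.
    print(read_capacity_units(6 * 1024, 10))
    print(read_capacity_units(6 * 1024, 10, strongly_consistent=False))
    print(write_capacity_units(6 * 1024, 10))
```

An estimate like this is only a starting point for a provisioned table. Actual consumption depends on access patterns and on how evenly requests are spread across partition keys.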

Other NoSQL options

There are a number of other NoSQL alternatives you can use for gaming, including Riak, Couchbase, and Cassandra.

RIAK
Riak KV is a flexible key-value data model for web scale profile and session management, real-time big data, data cataloging, content management, 360-degree customer data management, digital messaging, and more.

COUCHBASE
Couchbase Cloud is a fully managed, automated database that simplifies database management for deploying, managing, and operating Couchbase Server across multi-cloud environments.

CASSANDRA
Apache Cassandra is an open-source, distributed, NoSQL database that presents a partitioned wide-column storage model with eventually consistent semantics.

As with choosing a server programming language, there’s no perfect database; you need to weigh the pros and cons of each one.

See how Directive Games uses AWS services to reduce costs and guarantee security.

Caching

For gaming, adding a caching layer in front of your database for frequently used data can alleviate a significant number of scalability problems.

Even a short-lived cache (with a lifetime of just a few seconds for data such as leaderboards, friend lists, and recent activity) can offload your database significantly. Plus, adding cache servers is more cost-effective than adding additional database servers.

Memcached is a high-speed, memory-based key-value store. It’s the gold standard for caching. In recent years, Redis has also become extremely popular because it offers advanced data types and similar performance to Memcached. Both Memcached and Redis perform well on AWS. You can install Memcached or Redis on EC2 instances, or you can use Amazon ElastiCache for Redis, the AWS managed caching service. Like Amazon RDS and Amazon DynamoDB, Amazon ElastiCache completely automates the installation, configuration, and management of Memcached and Redis on AWS. For more details on setting up Amazon ElastiCache, see What Is Amazon ElastiCache for Redis? in the Amazon ElastiCache User Guide.

To simplify management, ElastiCache servers are grouped in a cluster. Most ElastiCache operations (like configuration, security, and parameter changes) are performed at the cache cluster level. Despite the use of the cluster terminology, ElastiCache nodes don’t talk to each other or share cache data. And it deploys the same versions of Memcached and Redis that you would download yourself, so existing client libraries written in Ruby, Java, PHP, Python, and more are compatible with ElastiCache.

The typical approach to caching is known as lazy population or cache aside. This means the cache is checked, and if the value isn’t in cache (a cache miss), the record is retrieved, stored in cache, and returned. Lazy population is the most prevalent caching strategy because it only populates the cache when a client requests the data. This way, it avoids extraneous writes to the cache for records that are infrequently (or never) accessed or that change before being read. This pattern is so ubiquitous that most major web development frameworks, such as Ruby on Rails, Django, and Grails, include plugins that wrap this strategy. The downside is that when data changes, the next client that requests it incurs a cache miss, resulting in a slower response time. This is because the new record needs to be queried from the database and populated into cache.

This leads us to the second-most prevalent caching strategy. For data you know will be accessed frequently, populate the cache when records are saved to avoid unnecessary cache misses. This results in faster, more uniform client response times. Simply populate the cache when you update the record rather than when the next client queries it. The tradeoff here is that if your data is changing rapidly, it can result in an unnecessarily high number of cache writes. And writes to the database can appear slower to users because the cache also needs to be updated.

To choose between these two strategies, you need to know how often your data is changing versus how often it’s being queried.

The final popular caching alternative is a timed refresh. This is beneficial for data feeds that span multiple different records, such as leaderboards or friend lists. In this strategy, you would have a background job that queries the database and refreshes the cache every few minutes. This decreases the write load on your cache and enables additional caching to take place upstream (for example, at the CDN layer) because pages remain stable longer.
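The three caching strategies can be compared side by side in a small, self-contained sketch. Here an in-process dictionary with expiry stands in for Memcached or Redis and a plain dictionary stands in for the database; the class and method names (TTLCache, PlayerStore, and so on) are hypothetical and exist only for this illustration.

```python
import time
from typing import Any, Callable, Dict, Iterable, Optional, Tuple


class TTLCache:
    """In-process stand-in for Memcached or Redis with per-entry expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def delete(self, key: str) -> None:
        self._entries.pop(key, None)


class PlayerStore:
    """Dictionary-backed 'database' fronted by a cache."""

    def __init__(self, cache: TTLCache):
        self.cache = cache
        self.database: Dict[str, Any] = {}
        self.database_reads = 0  # counts how often the cache was bypassed

    def read(self, key: str) -> Optional[Any]:
        # Lazy population (cache aside): check the cache, fall back to the
        # database on a miss, then store the result for later readers.
        cached = self.cache.get(key)
        if cached is not None:
            return cached
        self.database_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache.set(key, value)
        return value

    def write_lazy(self, key: str, value: Any) -> None:
        # Save and invalidate; the next reader pays for the cache miss.
        self.database[key] = value
        self.cache.delete(key)

    def write_and_populate(self, key: str, value: Any) -> None:
        # Populate on save; readers never miss, but every write hits the cache.
        self.database[key] = value
        self.cache.set(key, value)

    def timed_refresh(self, keys: Iterable[str]) -> None:
        # Body of a background job that would run every few minutes.
        for key in keys:
            if key in self.database:
                self.cache.set(key, self.database[key])


if __name__ == "__main__":
    store = PlayerStore(TTLCache(ttl_seconds=5.0))
    store.write_and_populate("player:1", {"score": 10})
    print(store.read("player:1"), store.database_reads)
    store.write_lazy("player:1", {"score": 20})
    print(store.read("player:1"), store.database_reads)
```

The database_reads counter makes the tradeoff visible: populating on write keeps reads off the database, while the lazy path costs one database read after every change.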

Amazon ElastiCache scaling

Amazon ElastiCache simplifies the process of scaling your cache instances up and down. It provides access to a number of Memcached metrics in Amazon CloudWatch at no additional charge. Based on these metrics, you can set Amazon CloudWatch alarms to alert you to cache performance issues. You can configure these alarms to send emails when the cache memory is almost full or when cache nodes are taking a long time to respond. We recommend monitoring the following metrics:

CPUUTILIZATION
This is the amount of CPU used by Memcached or Redis. Elevated CPU may be indicative of an issue.

EVICTIONS
This is the number of keys that must be forced out of memory due to lack of space. This number should be zero. If it’s not near zero, you need a larger ElastiCache instance.

GETHITS/CACHEHITS AND GETMISSES/CACHEMISSES
This is a measure of how frequently your cache has the keys you need. The higher the percentage of hits, the more you’re offloading your database.

CURRCONNECTIONS
This is the number of clients currently connected. It excludes connections from read replicas.

In general, monitoring hits, misses, and evictions is sufficient for most applications. If the ratio of hits to misses is too low, you should revisit your application code to ensure your cache code is working as expected. As mentioned, evictions should typically be zero 100 percent of the time. If this isn’t the case, either scale up your ElastiCache nodes to provide more memory capacity or revisit your caching strategy to ensure you’re only caching what you need to.

You can also configure your cache node cluster to span multiple Availability Zones, providing high availability for your game’s caching layer. In the event of an Availability Zone being unavailable, this prevents your database from being overwhelmed by a sudden spike in requests. When creating a cache cluster or adding nodes to an existing cluster, you can choose the Availability Zones for the new nodes. You can either specify the requested number of nodes in each Availability Zone or select the option to spread nodes across zones.

With Amazon ElastiCache for Redis, you can create a read replica in another Availability Zone. If a primary node fails, AWS provisions a new one. And when a primary node can’t be provisioned, you can decide which read replica to promote to be the new primary.

Amazon ElastiCache for Redis version 3 and higher supports sharded clusters. You can create clusters with up to 15 shards, expanding the overall in-memory data store to more than 3.5 TiB. Each shard can have up to five read replicas, allowing you to handle 20 million reads and 4.5 million writes per second.

The sharded model, in conjunction with the read replicas, improves overall performance and availability. Data is spread across multiple nodes, and the read replicas support rapid, automatic failover in the event that a primary node has an issue.

To take advantage of the sharded model, you need to use a cluster-aware Redis client. The client will treat the cluster as a hash table with 16,384 slots spread equally across the shards, and it will map the incoming keys to the proper shard. Amazon ElastiCache for Redis treats the entire cluster as a unit for backup and restore purposes, so you don’t have to think about or manage backups for the individual shards.
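To make the 16,384-slot mapping concrete, here is a minimal sketch of how a cluster-aware client could assign a key to a slot and a shard. Redis Cluster computes CRC16 (the XMODEM variant) of the key modulo 16,384; the even split of slots across shards and the function names below are simplifying assumptions, and hash tags (the {...} syntax that pins related keys to one slot) are deliberately left out. A real client library handles all of this for you.

```python
TOTAL_SLOTS = 16384


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16,384 hash slots (hash tags not handled)."""
    return crc16_xmodem(key.encode("utf-8")) % TOTAL_SLOTS


def shard_for_key(key: str, shard_count: int) -> int:
    """Pick a shard index, assuming slots are split into equal contiguous ranges."""
    slots_per_shard = TOTAL_SLOTS / shard_count
    return min(int(key_slot(key) / slots_per_shard), shard_count - 1)


if __name__ == "__main__":
    for key in ("leaderboard:ranked", "player:42:inventory", "session:abc"):
        print(key, key_slot(key), shard_for_key(key, shard_count=3))
```

Because the slot depends only on the key, every client computes the same placement without coordinating with the others, which is what lets reads and writes spread across shards.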
4.2 Binary game content with Amazon S3

Your database is responsible for storing user data, including accounts, stats, items, purchases, and more.

But for game-related binary data, Amazon S3 is a better fit. Amazon S3 provides a simple HTTP-based API to upload (PUT) and download (GET) files. With Amazon S3, you pay only for the amount of data you store and transfer. Simply create a bucket to store your data in and make HTTP requests to and from that bucket. For a walkthrough of the process, see Creating a bucket in the Amazon S3 Getting Started Guide.

Amazon S3 is ideally suited for a variety of gaming use cases, including the following:

CONTENT DOWNLOADS
Game assets, maps, patches, and betas

USER-GENERATED FILES
Photos, avatars, user-created levels, and device backups

ANALYTICS
Storing metrics, device logs, and usage patterns

CLOUD SAVES
Game save data and syncing between devices (AWS AppSync would also be a good choice)

While you could theoretically store this type of data in a database, using Amazon S3 has a number of advantages, including the following:

• Storing binary data in a database is memory and disk intensive, consuming valuable query resources.

• Clients can directly download the content from Amazon S3 using a simple GET operation.

• Amazon S3 is designed for 99.999999999 percent durability and 99.99 percent availability of objects over a given year.

• Amazon S3 natively supports features such as ETag, authentication, and signed URLs.

• Amazon S3 plugs into the Amazon CloudFront CDN, so you can quickly distribute content to large numbers of clients.

It’s often useful to make the binary data searchable in a NoSQL DB or service like Amazon Elasticsearch Service to provide fast, scalable access to files and binary data. With these factors in mind, let’s look at the aspects of Amazon S3 that are most relevant for gaming.

Content delivery and Amazon CloudFront

From an engagement perspective, DLC is a huge aspect of modern games, and it’s becoming a primary revenue stream.

Players expect an ongoing stream of new characters, levels, and challenges for months, if not years, after a game’s release. The ability to deliver this content quickly and cost-effectively has a big impact on the profitability of a DLC strategy.

The game client itself is typically distributed through a given platform’s app store. Pushing a new version of a game just to make a new level available can be onerous and time consuming. Promotional or time-limited content, such as Halloween-themed assets or a long weekend tournament, is usually easier to manage yourself in a workflow that mirrors the rest of your server infrastructure.

If you’re distributing content to a large number of clients (for example, a game patch, expansion, or beta), we recommend using Amazon CloudFront in front of Amazon S3. CloudFront has points of presence (POP) located throughout the world, which improves download performance. And you can optimize costs by choosing which Regions CloudFront serves. For more information, access the Amazon CloudFront FAQs and refer to the question: How does CloudFront lower my costs?

If you anticipate significant Amazon CloudFront usage, contact our sales team. Amazon offers reduced pricing that’s even lower than our on-demand pricing for high-usage customers.

Easy versioning with ETag

Amazon S3 supports HTTP ETag and the If-None-Match HTTP header, both of which are well known to web developers but frequently overlooked by game developers. These headers enable you to send a request for a piece of Amazon S3 content and include the MD5 checksum of the version you already have. If you already have the latest version, Amazon S3 responds with an HTTP 304 Not Modified status code. If you need a newer version, it responds with an HTTP 200 status code along with the file data. For an overview of this call flow, read about typical usage of HTTP ETag.

Using ETag in this manner makes any future use of CloudFront more powerful because CloudFront also supports the Amazon S3 ETag. For more information, see Request and Response Behavior for Amazon S3 Origins in the Amazon CloudFront Developer Guide.

Amazon CloudFront offers a Geo Targeting feature that allows you to restrict access to your content. It detects the country your customers are located in and forwards the country code to your origin servers. Your origin server can then determine the type of personalized content that will be returned to the customer based on their geographic location. This content can be anything from a localized dialog file for a role-playing game to localized asset packs for your game.

See how Gameloft securely delivers engaging, low-latency content to its players.
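The ETag check described in this section can be sketched with the Python standard library. The example assumes the asset was stored with a single, non-multipart upload without KMS encryption, in which case the Amazon S3 ETag is the hex MD5 of the object wrapped in double quotes; for multipart uploads the ETag is not a plain MD5, so a client would need to remember the ETag from its last download instead. The function names and the URL passed in are placeholders.

```python
import hashlib
import urllib.error
import urllib.request


def local_etag(data: bytes) -> str:
    """Quoted hex MD5, matching the ETag of a single-part S3 upload (assumption)."""
    return '"' + hashlib.md5(data).hexdigest() + '"'


def build_conditional_get(url: str, etag: str) -> urllib.request.Request:
    """Build a GET that asks the server to skip the body if the ETag matches."""
    return urllib.request.Request(
        url, headers={"If-None-Match": etag}, method="GET"
    )


def fetch_if_changed(url: str, cached: bytes) -> bytes:
    """Return fresh content on HTTP 200, or the cached bytes on HTTP 304."""
    request = build_conditional_get(url, local_etag(cached))
    try:
        with urllib.request.urlopen(request) as response:
            return response.read()  # 200: a newer version was sent
    except urllib.error.HTTPError as error:
        if error.code == 304:
            return cached  # 304 Not Modified: the local copy is current
        raise


if __name__ == "__main__":
    # No network call is made here; this only shows the header being sent.
    request = build_conditional_get(
        "https://example.com/assets/level1.pak", local_etag(b"cached bytes")
    )
    print(request.header_items())
```

A game client would call something like fetch_if_changed at startup for each downloadable asset, so unchanged files cost only a small request instead of a full download.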

Content upload with Amazon S3

Other gaming use cases for Amazon S3 revolve around uploading data from your game, be it user-generated content, analytics, or game saves.

There are two strategies for uploading to Amazon S3. You can either upload directly to Amazon S3 from the game client or upload by first posting to your REST API servers and then having your REST servers upload to Amazon S3. While both methods work, we recommend uploading directly to Amazon S3 because it offloads work from your REST API tier (Figure 7).

Uploading directly to Amazon S3 is straightforward and can even be accomplished directly from a web browser. For more information, see Browser-based uploads using POST (AWS signature version 2) in the Amazon S3 Developer Guide. You can also create secure URLs for players to upload content (say from an out-of-game tool) using presigned URLs.

To protect against corruption, consider calculating an MD5 checksum of the file and including it in the Content-MD5 header. This will enable Amazon S3 to automatically verify the file wasn’t corrupted during upload. For more information, see PutObject in the Amazon S3 API Reference.

User-generated content (UGC) is a great use case for uploading data to Amazon S3. Typical UGC has two parts: binary content (for example, a graphic asset) and its metadata (for example, name, date, author, and tags). The usual pattern is to store the binary asset in Amazon S3 and store the metadata in a database. You can then use the database as your primary index of available UGC that others can download. We’ve provided an example call flow of how you can upload UGC to Amazon S3 (Figure 8).

Figure 7: An upload using Amazon S3 POST. The diagram shows a web request from your customer to your web server and the file transfer going to Amazon S3.
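For the Content-MD5 check mentioned above, the header value is the Base64 encoding of the raw 16-byte MD5 digest, not the hex string. The sketch below computes it and assembles headers for a PUT; the helper names and the default content type are assumptions for illustration, and request signing is omitted entirely.

```python
import base64
import hashlib
from typing import Dict


def content_md5(data: bytes) -> str:
    """Base64-encoded MD5 digest, the format expected by the Content-MD5 header."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def put_object_headers(
    data: bytes, content_type: str = "application/octet-stream"
) -> Dict[str, str]:
    """Headers for an upload; authentication headers are not included here."""
    return {
        "Content-MD5": content_md5(data),
        "Content-Length": str(len(data)),
        "Content-Type": content_type,
    }


if __name__ == "__main__":
    payload = b"example level data"
    print(put_object_headers(payload))
```

If the bytes received don’t match the digest in the header, the upload is rejected rather than stored, so the client can retry instead of silently keeping a corrupted asset.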

In this example, you PUT the binary game asset (for example, the avatar or level) to Amazon S3, which creates a new object in Amazon S3. After you receive a success response from Amazon S3, you make a POST request to your REST API layer with the metadata for that asset. The REST API needs to have a service that accepts the Amazon S3 key name plus any metadata you want to keep. Then, it stores the key name and the metadata in the database. The game’s other REST services can then query the database to find new content, popular downloads, and so on.

This simple call flow handles the case where the asset data is stored verbatim in Amazon S3, which is usually true of user-generated levels or characters. This same pattern works for game saves as well: you store the game save data in Amazon S3 and index it in your database by user_id, date, and any other important metadata. If you need to do additional processing of an Amazon S3 upload (for example, generating preview thumbnails), make sure to read about asynchronous jobs in the next section. There, you’ll learn about adding Amazon SQS to queue jobs to handle these types of tasks.

Figure 8: A simple workflow for transfer of game content. The steps shown are:

1. PUT multi-part file data to the Amazon S3 bucket
2. HTTP 200 OK from Amazon S3
3. POST metadata to the REST API tier (HTTP/JSON, through Elastic Load Balancing to the REST API instances)
4. INSERT the metadata and Amazon S3 key into the master DB
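The metadata half of this call flow (the service behind the POST, which stores the Amazon S3 key and metadata) can be sketched with an in-memory SQLite database standing in for the game’s real database. The table layout, column names, and function names are hypothetical; the upload to Amazon S3 itself is assumed to have already succeeded.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Iterable, List


def create_index(connection: sqlite3.Connection) -> None:
    """Create the metadata table that indexes uploaded assets."""
    connection.execute(
        """
        CREATE TABLE IF NOT EXISTS ugc_assets (
            s3_key     TEXT PRIMARY KEY,
            user_id    TEXT NOT NULL,
            name       TEXT NOT NULL,
            tags       TEXT NOT NULL,
            created_at TEXT NOT NULL
        )
        """
    )


def register_asset(
    connection: sqlite3.Connection,
    s3_key: str,
    user_id: str,
    name: str,
    tags: Iterable[str],
) -> None:
    """Record the key and metadata after the upload returned success."""
    connection.execute(
        "INSERT INTO ugc_assets (s3_key, user_id, name, tags, created_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (s3_key, user_id, name, ",".join(tags),
         datetime.now(timezone.utc).isoformat()),
    )
    connection.commit()


def assets_for_user(connection: sqlite3.Connection, user_id: str) -> List[str]:
    """Return the S3 keys a client would download for this user's content."""
    rows = connection.execute(
        "SELECT s3_key FROM ugc_assets WHERE user_id = ? "
        "ORDER BY created_at, s3_key",
        (user_id,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    create_index(db)
    register_asset(db, "levels/u42/castle.bin", "u42", "Castle", ["level", "hard"])
    print(assets_for_user(db, "u42"))
```

Only the key and metadata live in the database; the binary payload stays in Amazon S3, which keeps rows small and queries cheap.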

Analytics and A/B testing After you identify the data, follow these steps to track it: For both analytics and A/B
testing, the data flow tends
Collecting data about your game is one of the 1. Collect metrics in a local data file on the player’s 3. For each file you upload, put a record somewhere
most important, easiest things you can do. device (for example, mobile, console, or PC). To indicating there’s a new file to process. Amazon to be unidirectional.
Perhaps the trickiest part is deciding what to make things easier, we recommend using a CSV S3 event notifications provide an excellent way
collect. Because Amazon S3 storage is cheap, consider keeping track of any reasonable player metrics you can think of (for example, total hours played, favorite characters or items, and current and highest level) if you’re not sure what to measure or have a client that’s not easily updated.

However, if you’re able to formulate questions you want answered beforehand (or if client updates are easy), you can focus on gathering the data that helps you answer those specific questions.

format and a unique file name. For example, a given user might have their data tracked in 241-game_name-user_idYYYYMMDDHHMMSS.csv or something similar.

2. Periodically persist the data by having the client upload the metrics file directly to Amazon S3. Alternatively, you can integrate with Amazon Kinesis and adopt a loosely coupled architecture, as discussed in the next chapter. When you go to upload a given data file to Amazon S3, open a new local file with a new file name. This simplifies the upload loop.

to support this. To enable notifications, first add a notification configuration identifying the events you want Amazon S3 to publish, such as a file upload and the destinations where you want Amazon S3 to send the event notifications. We recommend Amazon SQS, as you can have a background worker listen to Amazon SQS for new files and process them as they arrive. For more details, see the Amazon SQS section.

4. As part of a background job, process the data using a framework like Amazon Elastic MapReduce (Amazon EMR) or another framework that you choose to run on Amazon EC2. This background process can look at new data files that have been uploaded since the last run and perform aggregation or other operations on the data. (Note: If you’re using Amazon EMR, you may be able to skip step 3 because Amazon EMR has built-in support for streaming new files.)

5. Optionally, feed the data into Amazon Redshift for additional data warehousing and analytics flexibility. Amazon Redshift is an ANSI SQL-compliant, columnar data warehouse that you pay for by the hour. This enables you to perform queries across large volumes of data, such as sums and min/max, using familiar SQL-compliant tools.

Repeat these steps in a loop, uploading and processing data asynchronously (Figure 9).

Figure 9: A simple pipeline for analytics and A/B testing. Labels in the figure: Upload (1. Write local metrics file; 2. PUT file to Amazon S3; 3. HTTP 200 OK from Amazon S3; 4. PUT new Amazon S3 key is ready; SQS or DynamoDB) and Processing (5a. EMR cluster, or 5b. Non-EMR workflow with EC2 workers).

That is, metrics flow in from users and are processed, and then a human makes decisions that impact future content releases or game features. With A/B testing, when you present players with different items, screens, and so forth, you can make a record of the choice they were given along with their subsequent actions (such as purchase or cancel). Then, you can periodically upload this data to Amazon S3 and use Amazon EMR to create reports. In the simplest use case, you can generate cleaned up data from Amazon EMR in CSV format in another Amazon S3 bucket and load it into a spreadsheet program.

The proper treatment of analytics and Amazon EMR is beyond the scope of this guide. For more information, see Data Lakes and Analytics on AWS and the Best Practices for Amazon EMR guide. To contact us, please fill out the form at the AWS Game Tech website.
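The client-side part of this pipeline (write a local metrics file, then periodically upload it and start a fresh one) can be sketched as follows. This is a minimal illustration, not the guide's implementation: the `MetricsLog` class, the `upload` callable, and the hyphen placed before the timestamp in the file name are all assumptions made for the example. In a real client, `upload` might wrap an S3 upload call from the AWS SDK.

```python
import csv
import os
import tempfile
from datetime import datetime, timezone


def metrics_filename(game_id, game_name, user_id, now):
    """Build a unique name such as 241-game_name-user_id-YYYYMMDDHHMMSS.csv."""
    return f"{game_id}-{game_name}-{user_id}-{now:%Y%m%d%H%M%S}.csv"


class MetricsLog:
    """Append metrics to a local CSV file and rotate it on every upload."""

    def __init__(self, directory, game_id, game_name, user_id, upload):
        self.directory = directory
        self.ident = (game_id, game_name, user_id)
        self.upload = upload  # hypothetical callable: upload(local_path, key)
        self.path = None

    def _open_new(self, now):
        name = metrics_filename(*self.ident, now)
        self.path = os.path.join(self.directory, name)

    def record(self, now, event, value):
        """Step 1: write one metric row to the current local file."""
        if self.path is None:
            self._open_new(now)
        with open(self.path, "a", newline="") as f:
            csv.writer(f).writerow([now.isoformat(), event, value])

    def flush(self, now):
        """Step 2: switch to a new file name first, then upload the old file."""
        finished = self.path
        if finished is None:
            return None
        self._open_new(now)
        key = os.path.basename(finished)
        self.upload(finished, key)
        return key
```

Rotating to a new file name before uploading means new metrics never land in a file that is mid-upload, which is what keeps the upload loop simple.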

Amazon Athena

Gleaning insights quickly and cheaply is one of the best ways developers can improve on their games. Traditionally, this has been relatively difficult because data should typically be extracted from game application servers, stored somewhere, transformed, and loaded into a database in order to be queried later. This process can take a significant amount of time and compute resources, greatly increasing the cost of running such tasks.

Amazon Athena assists with your analytical pipeline by providing the means of querying data stored in Amazon S3 using standard SQL. Because Athena is serverless, there’s no infrastructure to provision or manage, and generally, there’s no requirement to transform data before applying a schema to start querying.

There are, however, a few things to keep in mind to optimize performance while using Athena for your queries, including:

AD-HOC QUERIES
Because Athena is priced at a base of $5 per TB of data scanned, this means you incur no charges when there are no queries being run. Athena is ideally suited for running queries on an ad-hoc basis when information needs to be gleaned from data quickly without running an extract, transform, and load (ETL) process first.

COMPRESSION
Just like partitioning, proper compression of data can help reduce network load and costs by reducing data size. It’s also best to ensure the compression algorithm you choose allows for files to be split, so the Athena execution engine can increase parallelism for additional performance.

PROPER PARTITIONING
Partitioning data divides tables into parts that keep related entries together. Partitions act as virtual columns. You define them at table creation, and they can help reduce the amount of data scanned per query, thereby improving performance and reducing the cost of any particular query. You can restrict the amount of data scanned by a query by specifying filters based on the partition. For example, in the query SELECT count(*) FROM lineitem WHERE l_gamedate = '2019-10-31', a non-partitioned table would have to scan the entire table, looking through potentially millions of records and gigabytes of data. This slows down the query and adds unnecessary costs. A properly partitioned table can help speed queries and significantly reduce costs by cutting the amount of data queried by Athena. For a detailed example, see Top 10 Performance Tuning Tips for Amazon Athena on the AWS Big Data Blog.

UNDERSTANDING PRESTO
Athena uses Presto, an open-source distributed SQL query engine for running interactive analytic queries against data sources of all sizes (ranging from gigabytes to petabytes). An under-the-covers understanding of Presto can help you optimize the various queries you run on Athena. For example, the ORDER BY clause returns the results of a query in sort order. To perform the sort, Presto must send all rows of data to a single worker and then sort them. This can cause memory pressure on Presto, which can result in the query taking a long time to execute. Or, even worse, the query could fail. If you’re using the ORDER BY clause to look at the top or bottom N values, use a LIMIT clause to significantly reduce the cost of the sort by pushing the sorting and limiting to individual workers (rather than having a single worker do the sorting).
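Two of the points above, partition filters and ORDER BY with LIMIT, can be illustrated with a small conceptual model. The sketch below is not Athena or Presto code. It assumes Hive-style `l_gamedate=<value>` key prefixes as the partition layout, and the function names are hypothetical. It only shows why a partition filter shrinks the set of objects to read and why a per-worker top-N followed by one merge gives the same answer as a full sort.

```python
import heapq


def partition_key(prefix, game_date, filename):
    """Hive-style layout: one S3 key prefix per l_gamedate value."""
    return f"{prefix}/l_gamedate={game_date}/{filename}"


def keys_to_scan(all_keys, game_date=None):
    """Without a partition filter every object is read; with one, only matches."""
    if game_date is None:
        return list(all_keys)
    marker = f"/l_gamedate={game_date}/"
    return [k for k in all_keys if marker in k]


def top_n(worker_chunks, n):
    """Model of ORDER BY ... DESC LIMIT n.

    Each worker keeps only its own n largest values, so the final merge
    sees at most n values per worker instead of every row.
    """
    partial = [heapq.nlargest(n, chunk) for chunk in worker_chunks]
    return heapq.nlargest(n, (v for part in partial for v in part))
```

For example, with objects stored under two dates, filtering on one date halves the keys that `keys_to_scan` returns, and `top_n` returns the same values as sorting all rows and taking the first `n`.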
5.0 Scaling game services as asynchronous jobs

Decoupling components of your game’s architecture enables your server components to operate as independently as possible.

A common approach is to put queues between services,
so sudden bursts of activity on one part of your system
don’t cascade to other parts. Some aspects of gaming are
difficult to decouple because data needs to be as up-to-date
as possible to provide a good matchmaking and gameplay
experience. However, most data, such as cosmetic or character
data, doesn’t have to be up to the millisecond.

Leaderboards and avatars

Many gaming tasks can be decoupled and handled in the background.

For example, a player needs updated stats in real time, so they won’t lose progress if they exit and re-enter the game. However, re-ranking the global top 100 leaderboard doesn’t need to occur every time a player posts a new high score. Instead, the ranking process could be decoupled from score posting and performed in the background every few minutes. Because game ranks are highly volatile in any active online game, this would have minimal impact on the game experience.

As another example, consider allowing players to upload a custom avatar for their character. In this scenario, your front-end servers place a message into a queue (like Amazon SQS) about the new avatar upload. You write a background job that runs periodically, pulls avatars off the queue, processes them, and marks them as available in MySQL, Amazon Aurora, Amazon DynamoDB, or whichever database you’re using. The background job runs on a different set of EC2 instances that can be set up to automatically scale just like your front-end servers.

To help you get started quickly, AWS Elastic Beanstalk provides worker environments that simplify this process by managing the Amazon SQS queue and running a daemon process on each instance that reads from the queue for you.

This is a highly effective way to decouple your front-end servers from backend processing, and it enables you to scale the two independently. For example, if the image resizing is taking too long, you can add additional job instances without the need to scale your REST servers.

Choosing a serverless approach can also help alleviate backend management through event-driven triggers. AWS Lambda functions are used here as an entry point to all of your AWS resources, providing a layer of abstraction between your game and AWS features. For more information about serverless use cases for games, see our reference architectures for mobile real-time analytics and push notifications.

(Note: You can also implement this pattern with an alternative such as RabbitMQ or Apache ActiveMQ deployed to Amazon EC2.)
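The leaderboard example above can be sketched in a few lines. This is a minimal single-process illustration, assuming an in-memory `queue.Queue` as a stand-in for a real queue such as Amazon SQS. The `Leaderboard` class and its method names are hypothetical. The point is only that posting a score does no ranking work, and the ranking is rebuilt once per background run.

```python
import queue


class Leaderboard:
    """Score posting and ranking are decoupled by a queue."""

    def __init__(self, top_n=100):
        self.pending = queue.Queue()  # stand-in for a real queue service
        self.best = {}                # player -> best score seen so far
        self.top = []                 # last published ranking
        self.top_n = top_n

    def post_score(self, player, score):
        """Front-end path: enqueue and return immediately; no ranking here."""
        self.pending.put((player, score))

    def rerank(self):
        """Background path: drain queued scores, then rebuild the ranking once."""
        while True:
            try:
                player, score = self.pending.get_nowait()
            except queue.Empty:
                break
            if score > self.best.get(player, float("-inf")):
                self.best[player] = score
        ranked = sorted(self.best.items(), key=lambda kv: kv[1], reverse=True)
        self.top = ranked[: self.top_n]
        return self.top
```

In this sketch a scheduler would call `rerank()` every few minutes, while `post_score()` is called on every request. A burst of score posts only lengthens the queue and does not slow the request path.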

Amazon SQS

Amazon SQS is a fully managed queue solution with a long-polling HTTP API, making it easy to interface with regardless of the server languages you’re using.

For more information, see Getting started with Amazon SQS in the Amazon Simple Queue Service Developer Guide. The following tips can help you take advantage of what Amazon SQS has to offer:

MAKE WRITES FASTER
To make writes as fast as possible, you can create your SQS queues in the same AWS Region as your API servers. Your asynchronous job workers can live in any Region because they’re not time-dependent. This enables you to run API servers in Regions near your players and job instances in more economical Regions.

SCALE HORIZONTALLY
Amazon SQS is designed to scale horizontally. An Amazon SQS client can process about 50 requests per second. The more Amazon SQS client processes you add, the more messages you can process concurrently. For tips on adding additional worker processes and EC2 instances, see Increasing throughput using horizontal scaling and action batching in the Amazon Simple Queue Service Developer Guide.

REDUCE COSTS
You can save money using Amazon EC2 Spot Instances for your job workers. Amazon SQS is designed to redeliver messages that aren’t explicitly deleted, which protects against EC2 instances disappearing mid-job. You should only delete messages after you have completed processing them, so another EC2 instance can retry the job if a given instance fails while running.

INCREASE VISIBILITY
The default redelivery time when a message is not deleted (the visibility timeout) is 30 seconds. If you have long-running jobs, you might need to increase this time to prevent multiple queue readers from receiving the same message.

TARGET UNSUCCESSFUL MESSAGES
Amazon SQS also supports dead-letter queues. A dead-letter queue is a queue that other (source) queues can target for messages that can't be processed (consumed) successfully. You can set aside and isolate these messages in the dead-letter queue to determine why processing didn't succeed.

Amazon SQS has the following caveats:

• Messages aren’t guaranteed to arrive in order. You might receive messages in random order (for example, 2, 3, 5, 1, 7, 6, 4, 8). If you need strict ordering of messages, review the First-In, First-Out (FIFO) queues section that follows.

• Messages typically arrive quickly, but a message might occasionally be delayed by a few minutes.

• Messages can be duplicated, and it's the responsibility of the client to de-duplicate messages.

This means you should ensure your asynchronous jobs are coded to be idempotent and resilient to delays. Resizing and replacing an avatar is a good example of idempotence because doing this twice would yield the same result.
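The interaction between redelivery, delete-after-processing, and idempotent jobs can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the Amazon SQS API: `TinyQueue`, `resize_avatar`, and `work_once` are hypothetical names, time is passed in as a plain number of seconds, and the "resize" is reduced to writing a file name into a dictionary.

```python
import itertools


class TinyQueue:
    """In-memory stand-in that mimics redelivery after a visibility timeout."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._ids = itertools.count(1)
        self._messages = {}  # id -> [body, time at which it is visible again]

    def send(self, body):
        self._messages[next(self._ids)] = [body, 0]

    def receive(self, now):
        """Return one visible (id, body) and hide it for the timeout, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        self._messages.pop(msg_id, None)


def resize_avatar(avatars, player_id):
    """Idempotent job: running it twice leaves the same final state."""
    avatars[player_id] = f"{player_id}_128x128.png"


def work_once(q, avatars, now, crash=False):
    """Process at most one message; delete it only after the work is done."""
    got = q.receive(now)
    if got is None:
        return "empty"
    msg_id, player_id = got
    resize_avatar(avatars, player_id)
    if crash:
        return "crashed"  # worker lost before delete: message will reappear
    q.delete(msg_id)
    return "done"
```

If a worker crashes after resizing but before deleting, the message stays hidden until the timeout expires and is then picked up by another worker. Because the job is idempotent, the second run leaves the same result.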

Finally, if your job workload scales up and down over time (for example, perhaps more avatars are uploaded when more players are online), consider using Auto Scaling to launch Spot Instances.

Amazon SQS offers multiple metrics that you can use for Auto Scaling, the best being ApproximateNumberOfMessagesVisible. The number of visible messages is basically your queue backlog.

For example, depending on the number of jobs you can process each minute, you could scale up when the number of visible messages hits 100 and then scale back down when that number falls below 10. For more information about Amazon SQS, see Monitoring Amazon SQS queues using CloudWatch in the Amazon Simple Queue Service Developer Guide.

FIFO queues

The recommended method for using Amazon SQS is to engineer and architect your application to be resilient to out-of-order and duplicate messages. However, you might have certain tasks where duplicates can’t be tolerated or where the ordering of messages is absolutely critical to proper functioning. For example, you might allow micro-transactions for a player who wants to buy a particular item once, and this action must be strictly regulated. To support this type of requirement, FIFO queues are available in select AWS Regions. FIFO queues can process messages in order and exactly once. Due to the emphasis on message order and delivery, there are additional limitations when working with FIFO queues. For more details about FIFO queues, see Amazon SQS FIFO queues in the Amazon Simple Queue Service Developer Guide.

Other queue options

In addition to Amazon SQS and Amazon SNS, there are dozens of other message queue approaches (including RabbitMQ, ActiveMQ, and Redis) that can run effectively on Amazon EC2. With all of these approaches, you’re responsible for launching and configuring a set of EC2 instances, which is outside the scope of this guide. Keep in mind that running a reliable queue is much like running a highly available database. You should consider high-throughput disk (such as Amazon EBS PIOPS), snapshots, redundancy, replication, failover, and more. Ensuring the uptime and durability of a custom queue solution can be time consuming, and such a solution can fail at the worst times (for instance, during your highest load peaks).
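The Auto Scaling thresholds mentioned for Amazon SQS in this section (scale up at 100 visible messages, scale back down below 10) amount to a simple decision rule. The sketch below is illustrative only: the function name, the step size of one worker, and the minimum and maximum worker counts are assumptions, and `backlog` stands for a reading of the ApproximateNumberOfMessagesVisible metric obtained elsewhere.

```python
def desired_workers(current, backlog, scale_up_at=100, scale_down_at=10,
                    min_workers=1, max_workers=20):
    """Return the worker count to aim for, given the queue backlog.

    The gap between the two thresholds keeps the fleet from flapping
    when the backlog hovers around a single value.
    """
    if backlog >= scale_up_at:
        target = current + 1
    elif backlog < scale_down_at:
        target = current - 1
    else:
        target = current
    return max(min_workers, min(max_workers, target))
```

For instance, with two workers running, a backlog of 150 yields three workers, a backlog of 50 leaves the count unchanged, and a backlog of 5 drops it to one. The right thresholds depend on how many jobs each worker can process per minute.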
6.0 Getting started

We covered a lot of ground in this guide.

The following are major takeaways of scalable game development patterns and some simple steps you can take to begin your game’s journey on AWS:

Start simple with two EC2 instances behind an Elastic Load Balancing load balancer. Choose either Amazon RDS or Amazon DynamoDB as your database. And consider using AWS Elastic Beanstalk to manage this backend stack.

Store binary content, such as game data, assets, and patches, on Amazon S3. Use Amazon S3 to offload network-intensive downloads from your game servers. If you’re distributing these assets globally, consider Amazon CloudFront.

Always deploy your EC2 instances and databases to multiple Availability Zones for the best availability. It’s as easy as splitting your instances across two Availability Zones to start.

As your server load grows, add caching via Amazon ElastiCache. Create at least one ElastiCache node in each Availability Zone where you have application servers.

As load continues to grow, offload time-intensive operations to background tasks using Amazon SQS or another queue like RabbitMQ, so your EC2 app instances and database can handle a higher number of concurrent players.

If database performance becomes an issue, add read replicas to take read traffic off the primary database. Evaluate whether a NoSQL store like Amazon DynamoDB or Redis can be added to handle certain database tasks.

At extreme loads, determine whether advanced strategies (such as event-driven servers or sharded databases) are necessary. However, wait to implement these until it’s absolutely necessary to avoid adding complexity to development, deployment, and debugging.

AWS has a team of business and technical pros who are dedicated to supporting our gaming customers. If you’re ready to talk to us about building your game on AWS, complete our contact form. A member of the AWS Game Tech team will reach out to discuss your requirements and AWS Support options.