Building High Performance, Scalable Web Applications
If you have been working in web development for any sizable period of time, chances are you have
been asked to make sure that the application you are building is scalable. Building a web application
that performs well at any scale is becoming a more common expectation of business owners around
the world today.
The word scalable is often (wrongly) used to mean that the application should work well (respectable
response times) at a high load. In truth, that is a high performance application. You could throw a lot
of resources at any (even a poorly designed) application and make it perform well. A scalable
application is one that can accommodate an increase (or decrease) in load without any downtime.
An application might perform perfectly at its current load, but if it cannot keep up with an
increase in load, it cannot be called scalable. The two concepts are closely related, but not
interchangeable.
Thinking about performance from the start is a great way to make sure you don’t find yourself in a
tough spot halfway through. There are a few questions worth asking at this stage, and a few design
guidelines you can follow to make sure your application performs well.
Caching
Databases are probably the most common point of failure for most applications. Of course, every
application is different; if you are doing a lot of image or file manipulation, that might become a
bottleneck even before the database shows any sign of trouble.
Implementing a cache can reduce the load on your database drastically, if your application is making
the same queries over and over. There are different types of caches, for example you could use a
proxy cache and serve common HTTP requests even without touching your application server. You
could also use an object cache to store processed data received from database queries. The key thing
to focus on here is that you should only cache things that have reuse value. If you store something in
the cache that will only be read once afterwards, it is not much of an optimization.
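As a sketch of the object-cache idea, here is the cache-aside pattern using Redis. The connection details and the fetch_product_from_db helper are assumptions for illustration, not part of any particular framework:

```python
import json
import redis

# Assumes a Redis server on localhost; the database helper below is hypothetical.
cache = redis.Redis(host="localhost", port=6379)

def get_product(product_id, ttl=300):
    """Cache-aside: try the cache first, fall back to the database on a miss."""
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)                 # cache hit, no database query

    product = fetch_product_from_db(product_id)   # hypothetical DB call
    cache.setex(key, ttl, json.dumps(product))    # store with an expiry for reuse
    return product
```

The expiry (TTL) keeps the cache from serving stale data forever; repeated reads within that window never touch the database.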
While working on a custom CMS for a website that got more than a million visitors a day, we
implemented a simple proxy cache and it brought the CPU load from 70% down to 10%.
So, think about a caching plan while you are designing the application and you will definitely see a lot
of improvements.
Asynchronous processing
Every application has some heavy-lifting sections and some light ones. For example, a logout request
might not do much beyond invalidating the session token. On the other hand, consider an application
that allows users to connect all their social profiles, import their data and process it to show some
reports. Doing this involves multiple network requests, depending on the number of profiles and the
amount of data, and because these are external services you have no control over their response
times. Performing this operation in real time would put a lot of unnecessary load on your application.
A better approach is to have a script running in the background that periodically checks for new
requests, does all the heavy lifting and notifies your application once the operation is complete.
Message queues are commonly used in such situations, but even a simple script running in the
background is better than doing such operations in real time.
Whenever in doubt about a heavy operation, ask yourself: would the end user have any problem if
this operation completed a minute later?
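One common way to push this kind of work into the background is a task queue. Below is a minimal sketch using Celery with a Redis broker; the task name, broker URL and the commented-out call in the request handler are illustrative assumptions:

```python
from celery import Celery

# Assumes a Redis broker is available; the import logic itself is left out.
app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def import_social_profiles(user_id):
    # Heavy lifting: call the external services, process the data,
    # then notify the application (e.g. mark the report as ready).
    ...

# In the web request handler, enqueue the job and return immediately
# instead of blocking the web process while external services respond:
# import_social_profiles.delay(current_user.id)
```

The web request finishes in milliseconds; a separate worker process picks the job up and can be retried or scaled independently.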
Decoupling
All complex applications are made up of many different components. For example, there might be a
video processing component that only runs when a user uploads a new video and generates files in
different formats and sizes, or a reporting component that fetches a lot of data, performs operations
on it and generates reports. Building your application in a decoupled manner allows you to allocate
proper resources to the components that need them, when they need them.
Similarly, there are usually a number of different services as well: a web server serving your
application files, a database, (ideally) a cache, perhaps a message queue. All these services have
different requirements from the system. If you run all of them on a single machine, it is very difficult
to identify which services need more resources. For example, if the amount of data increases, the
database and the cache will require more RAM while the web server can keep operating with the
same resources. Running the services on separate machines (or VPSes) makes it a lot easier to
manage and allocate the required resources.
Stateless services
Building stateless services is important because once you are ready to scale your application, you
need services that can be started or stopped without any impact on the overall system. Scaling out a
stateful service is considerably more work. A common example: say you have user session
information stored on your application server. If the load increases and you decide to add more
servers to handle it, you will also have to make sure that the session information is synced across all
the servers, or that a given user is always routed to the same server. Both are additional overheads
that can be avoided if you build stateless services. Some services, like databases, are inherently
stateful, and scaling them is always more complicated than scaling their stateless counterparts.
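A minimal sketch of what "stateless" can look like in practice: session data lives in a shared store rather than in the memory of any one application server. The Redis hostname and key layout here are assumptions made for the example:

```python
import json
import secrets
import redis

# Shared session store: any application server can read or write it,
# so requests do not have to stick to the server that created the session.
store = redis.Redis(host="session-store.internal", port=6379)  # hostname is an assumption

def create_session(user_id, ttl=3600):
    token = secrets.token_urlsafe(32)
    store.setex(f"session:{token}", ttl, json.dumps({"user_id": user_id}))
    return token  # sent back to the client as a cookie

def load_session(token):
    data = store.get(f"session:{token}")
    return json.loads(data) if data else None
```

With this in place, any application server can be added or removed at will; none of them holds state a user depends on.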
Plan for failure
Know that things will go wrong. Accepting that every part of your system can, and probably will, fail
is important because it forces you to think about the steps that can be taken to mitigate the damage
as much as possible. Ask yourself what happens when each component fails, and what it would take
to recover.
Understand that fault tolerance comes with a price; you need to decide how much of it you can afford
based on the value it adds.
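As one example of such a mitigation, a flaky call to an external service can be wrapped in a timeout and retried with exponential backoff. This is only a sketch; the number of attempts and the delays are illustrative:

```python
import time
import requests

def fetch_with_retry(url, attempts=3, timeout=5):
    """Retry a flaky external call with a timeout and exponential backoff."""
    for attempt in range(attempts):
        try:
            response = requests.get(url, timeout=timeout)
            response.raise_for_status()
            return response.json()
        except requests.RequestException:
            if attempt == attempts - 1:
                raise                    # give up and let the caller decide what to do
            time.sleep(2 ** attempt)     # back off: 1s, 2s, ...
```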
This is in no way a complete list. Experts far more knowledgeable than me have written books that
cover just this topic, but this should get you started.
Once you have built a system that performs well, you can start thinking about making it scalable.
Scalability has a price: it takes time and effort. If you are just building a prototype, there is (probably)
no point in making it scale to a million users. Most applications the size of Facebook or Twitter have
been rewritten multiple times.
If you are building a system for a client, let them know that it would add to the cost of the project.
There is going to be a big difference in the cost of a system that can serve 100 realtime users versus
one that can serve a million. Keep this transparent from the start because if the application gets a lot
of traction and the system is not able to handle the load, you are going to be in a tough spot.
A scalable system is one that keeps performing well even as the number of users increases. That
usually involves adding or removing servers based on the traffic, and the first step is knowing when
to scale.
Build watchtowers
Monitoring all the systems and components manually is going to be very difficult, if not impossible.
You need to build dashboards that give you a drillable overview of the system’s current and historical
state. You will also need to set up alarms that are triggered when the load reaches a particular level.
Doing this is far easier if you are using a cloud provider like AWS than if you are implementing it on
your own with open source software. On AWS you can use CloudWatch and implement auto scaling;
on your own you can use something like Nagios for the dashboards and scripts to create new servers
and add them to the load balancer.
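For example, on AWS a high-CPU alarm can be created with a few lines of boto3. The instance ID and SNS topic ARN below are placeholders, and the thresholds are illustrative; assume credentials are already configured:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when average CPU stays above 70% for two consecutive 5-minute periods.
cloudwatch.put_metric_alarm(
    AlarmName="app-server-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=70.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],       # placeholder ARN
)
```

The same alarm can just as well trigger an auto-scaling policy instead of (or in addition to) a notification.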
If you are using AWS, make sure you have set up billing alerts (or a spending limit where available)
on your account, or you might wake up to a $10,000 bill one morning.
If you follow the ideas mentioned above and build a stateless system, it is very straightforward to put
a load balancer in front and direct traffic to different servers. You can keep adding and removing
application servers based on the load, and if you built a decoupled system, you can scale just the
components that are under heavy load.
Cost of scalability
You need to decide whether you need a scalable system right now. There are baseline costs
associated with building one. If your entire application can run on a $5 VPS at its current load,
splitting it across different servers is going to drive the cost up. If you only need one application
server right now, putting a load balancer in front of it is going to double the cost. Ideally, you should
run load tests to find the limits of your system, preferably in a staging environment, and plan
accordingly.
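A very rough load-test sketch, assuming a staging URL, might look like the snippet below. A dedicated tool such as Locust or ab will give you far better numbers, but even something this simple will expose obvious limits:

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "https://staging.example.com/"   # point this at a staging environment, not production

def hit(_):
    start = time.time()
    response = requests.get(URL, timeout=10)
    return response.status_code, time.time() - start

# Fire 200 requests with 20 concurrent workers, then report the slowest response
# and the number of server errors.
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(hit, range(200)))

errors = sum(1 for status, _ in results if status >= 500)
print(f"slowest: {max(t for _, t in results):.2f}s, errors: {errors}")
```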
To sum it up, if your system is built with a focus on performance, making it scalable is going to be a
lot easier. The architecture of the application is what makes it easy or difficult to scale. There are
costs associated with a scalable system, and you should discuss them with the business owner or
the client.
All this is barely scratching the surface. But it can be an entry point for someone who is starting to
focus on scalability and performance.