
Businesses can’t afford to go offline, not even for a minute. That’s where High Availability and Scalability come in. In this post, we’ll explore what they mean and how we apply them across our infrastructure.
What is High Availability?
High Availability describes how consistently your servers or services remain accessible, usually expressed as the percentage of time they are up.
Although aiming for 100% might seem obvious, in reality, achieving it is harder than expected. Trying to get as close as possible is often the best approach.
Getting 100% means your servers or services remain accessible no matter what occurs.
Even AWS doesn't guarantee 100% availability for all of their services, but they do try to get as close as possible.
High Availability metrics often account for faults and events where something goes wrong, such as server failures or Availability Zone (AZ) outages. Availability is also affected by planned limitations, such as services that only run during business hours.
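To make those percentages more concrete, here is a minimal sketch that turns an availability target into a downtime budget. The function name `downtime_budget` and the example targets are our own illustrations, not figures from AWS or any SLA.

```python
from datetime import timedelta


def downtime_budget(availability_percent: float, period: timedelta) -> timedelta:
    """Return the maximum downtime allowed in `period` for an availability target.

    Hypothetical helper for illustration; the targets used below are examples only.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = 1 - availability_percent / 100
    return timedelta(seconds=period.total_seconds() * unavailable_fraction)


if __name__ == "__main__":
    year = timedelta(days=365)
    for target in (99.0, 99.9, 99.99):
        print(f"{target}% -> {downtime_budget(target, year)} of downtime per year")
```

Each extra "nine" cuts the allowed downtime by a factor of ten, which is part of why getting closer to 100% becomes progressively harder.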
In AWS, the approach is to design every service with High Availability in mind. This means that everything has a strong foundation built on High Availability principles, regardless of the specific services you use. The services AWS runs and manages are designed this way to minimize downtime.
How do we make use of High Availability?
Not all of our servers are hosted in AWS; we make use of both Vultr and AWS.
We mainly use Vultr for our Management Servers, with servers located across the world, focusing on:
- Europe
- North America
- Australia
- Africa (specifically South Africa)
All of our servers in Vultr and AWS are behind load balancers (LB), with nodes placed in different Availability Zones (AZs) for high availability and uptime. So if one of the nodes has a problem, the others take over.
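As a rough illustration of why redundant nodes help, the sketch below estimates the availability of a group of nodes where the service stays up as long as at least one node is up. It assumes nodes fail independently, which is optimistic in practice (shared dependencies can take several nodes down together), and the function name `combined_availability` is hypothetical.

```python
def combined_availability(node_availability: float, node_count: int) -> float:
    """Estimate the availability of `node_count` redundant nodes.

    Assumption (not from any provider's documentation): nodes fail
    independently, and the service is up if at least one node is up.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0.0 and 1.0")
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # The group is down only when every single node is down at the same time.
    all_down_probability = (1 - node_availability) ** node_count
    return 1 - all_down_probability


if __name__ == "__main__":
    for count in (1, 2, 3):
        print(count, "node(s):", combined_availability(0.99, count))
```

Treat the result as a best-case estimate: spreading nodes across different AZs is what makes the independence assumption more reasonable.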
In AWS, we host our WAN-Failover servers, Lambda functions, web hosting, and more. Again, this is all done with High Availability in mind, leveraging the tools that AWS provides.
A bonus of using load balancers, especially in AWS, is that the load balancer service itself can often auto-scale as your needs grow. So, even if you start with just two nodes in two AZs, the load balancer scales to handle influxes of requests as needed.
Some services are fully managed by AWS, meaning we don't have direct access to the underlying infrastructure. AWS handles the HA aspects as part of the service, giving us and our customers peace of mind in case something goes wrong.
Our WAN-Failover servers are hosted across two different Availability Zones (AZs). AWS manages and maintains its own AZs, each made up of one or more data centers. Each AWS region typically has at least three AZs, all interconnected with AWS's own high-speed fiber infrastructure.
Our WAN-Failover servers are placed behind a load balancer (LB). This distributes the load as evenly as possible and ensures that if one server fails for any reason, the remaining servers continue handling requests.
While High Availability focuses on keeping services accessible during failures, Scalability ensures they perform well under varying, especially high, workloads.
What is Scalability?
Scalability typically has two main types:
- Vertical scalability
- Horizontal scalability
In terms of servers, vertical scalability is when you upgrade your server specs.
So moving from 2 vCPU and 2 GB of RAM to 4 vCPU and 4 GB of RAM helps with handling workloads that need the extra resources.
Although this is helpful, there's a limit to how much you can scale a single server, usually due to hardware constraints. Vertical scaling can also become costly. Furthermore, relying on a single, powerful server creates a single point of failure. If something were to go wrong on that server (like a failed update or a hardware fault), the entire service could be affected.
This is where Horizontal Scalability comes into play.
Horizontal scaling, in terms of servers, generally means adding more servers to handle the workload.
So, if you have two servers handling a workload, horizontal scaling means adding more servers (e.g., a third, fourth, etc.) rather than upgrading the specs of the existing two.
Horizontal scalability is very important and powerful because it distributes the workload across a larger 'workforce' of servers. If something were to happen to one server, the others are able to pick up the extra work, resulting in less potential downtime for clients or users.
Load balancers are a great example of a tool that enables this. You can add numerous servers behind a load balancer, and it will distribute incoming requests as equally as possible among the healthy ones. If a server becomes unhealthy, the load balancer automatically stops sending requests to it. By placing these servers in different AZs within the same region, you combine horizontal scalability with high availability. If one server or even an entire AZ fails, services remain operational for clients. That is just one example of how important high availability and scalability truly are!
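The behaviour described above can be sketched in a few lines. This is a simplified round-robin model, not how AWS or Vultr load balancers are actually implemented; the `Server` and `RoundRobinBalancer` names, the server names, and the zone labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    zone: str  # illustrative AZ label, e.g. "az-1"
    healthy: bool = True


class RoundRobinBalancer:
    """Toy load balancer: rotates through servers and skips unhealthy ones."""

    def __init__(self, servers: list[Server]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next_index = 0

    def pick(self) -> Server:
        """Return the next healthy server in rotation."""
        # Try each server at most once per call so we never loop forever.
        for _ in range(len(self._servers)):
            candidate = self._servers[self._next_index]
            self._next_index = (self._next_index + 1) % len(self._servers)
            if candidate.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    servers = [Server("web-1", "az-1"), Server("web-2", "az-2")]
    balancer = RoundRobinBalancer(servers)
    print([balancer.pick().name for _ in range(4)])

    # Simulate a failure in one AZ: traffic continues on the remaining server.
    servers[0].healthy = False
    print([balancer.pick().name for _ in range(4)])
```

In a real load balancer the `healthy` flag would be driven by periodic health checks rather than set by hand.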
Other services, like AWS Auto Scaling Groups, facilitate horizontal scalability by automatically adding or removing servers based on workload demands.
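To give a feel for how automatic scaling decisions can work, here is a simplified proportional rule: scale the server count by the ratio of the observed metric to the target, then clamp it to configured limits. This is an assumption-laden sketch, not AWS's actual algorithm, and `desired_capacity` and its parameters are hypothetical names.

```python
import math


def desired_capacity(
    current_servers: int,
    current_metric: float,
    target_metric: float,
    min_servers: int,
    max_servers: int,
) -> int:
    """Suggest a server count that would bring the metric back to its target.

    Simplified illustration: assumes the metric (e.g. average CPU %) scales
    inversely with the number of servers sharing the load.
    """
    if current_servers < 1:
        raise ValueError("current_servers must be at least 1")
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    if min_servers > max_servers:
        raise ValueError("min_servers cannot exceed max_servers")
    proportional = math.ceil(current_servers * current_metric / target_metric)
    # Never go below the minimum (for availability) or above the maximum (for cost).
    return max(min_servers, min(max_servers, proportional))


if __name__ == "__main__":
    # Two servers at 90% average CPU with a 50% target: scale out.
    print(desired_capacity(2, 90.0, 50.0, min_servers=2, max_servers=10))
    # Four servers at 20% average CPU with a 50% target: scale in.
    print(desired_capacity(4, 20.0, 50.0, min_servers=2, max_servers=10))
```

Keeping the minimum at two or more servers, ideally in different AZs, means scaling in never removes the redundancy that high availability depends on.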
We use these techniques within our company, along with services like Lambda, Amplify, and S3, to minimize client downtime during both unexpected disasters and planned maintenance periods.
I hope this sheds some light on why High Availability and Scalability are important, and why choosing a provider that treats these concepts as foundational is crucial.
It provides peace of mind not only for you, as the user, but also for your clients. Accidents and incidents are inevitable; what matters is the preparation to handle these challenges. Because sooner or later, something WILL happen. How you respond to that situation can be a 'make or break' moment for your clients!