Capacity planning, horizontal scaling, vertical scaling, scale-up, scale-out, etc., are some of the buzzwords that you’ll frequently hear if you are in any way associated with a data center. In addition, vertical vs horizontal scaling is a popular debate that has been dividing people for some time.
While both methods come with pros and cons, it is important to identify your business requirements and rightly align them with the best scalability option to deliver highly available DevOps solutions to customers. This blog talks about the importance of scalability and how horizontal and vertical scaling models affect your business operations.
Before going into the vertical vs horizontal scaling debate, it is important to understand what scalability is. The scalability of an application is the measure of the number of client requests it can handle simultaneously. The limit of scalability is reached when a hardware resource runs out; beyond that point, the application can no longer handle additional requests. To handle them efficiently, administrators scale the infrastructure by adding more resources such as RAM, CPU, storage, and network devices. Horizontal and vertical scaling are the two methods administrators use for capacity planning.
Scalability is a crucial requirement of a cloud environment. You need to dynamically increase or decrease IT capacity or size to meet changing business IT requirements and manage unexpected traffic spikes. It will reduce latency and improve performance while preventing downtimes.
Horizontal scaling is an approach of adding more devices to the infrastructure to increase the capacity and efficiently handle increasing traffic demands. As the name says, horizontal scaling is about expanding the capacity horizontally by adding extra servers. The load and processing power are shared among multiple servers within a system using a load balancer. It is also called scaling out.
Vertical scaling is a type of scalability wherein more computing and processing power is added to a single machine to increase its performance. Also called scaling up, vertical scaling allows you to increase the machine’s capacity while maintaining resources within the same logical unit. The processor, memory, storage, and network capacity are increased in this approach. A notable example is buying a more powerful (and more expensive) server to host a bare-metal hypervisor such as VMware ESXi.
When it comes to vertical vs horizontal scaling, the key difference lies in the way the hardware specifications of a device are enhanced to achieve scalability. In a vertical scaling model, the hardware configuration of the server is increased without altering the logical unit. In a horizontal scaling model, the number of instances is increased without increasing the hardware specifications. Simply put, horizontal scaling is about adding more machines, while vertical scaling is about adding more power.
Another difference is that the sequential piece of logic is broken into smaller pieces and executed in parallel across multiple devices in a horizontal scaling model. Administrators distribute tasks across different machines in the network via patterns such as MapReduce, Tuple Spaces, etc. MongoDB, Cassandra, etc., are used to manage data in this method.
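To make the "split the logic and run it in parallel" idea concrete, here is a minimal MapReduce-style word count. It is only a sketch: the function names are illustrative, and the chunks are processed one after another in a single process, whereas in a real scale-out setup each chunk would be handed to a different machine.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous, roughly equal pieces."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def map_word_count(lines):
    """Map step: count the words in one chunk (the work of one node)."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce step: merge the partial counts produced by every node."""
    return reduce(lambda a, b: a + b, partials, Counter())


def word_count(lines, n_workers=3):
    chunks = split_into_chunks(lines, n_workers)
    # Sequential here; each call could run on a separate machine.
    partials = [map_word_count(chunk) for chunk in chunks]
    return reduce_counts(partials)
```

The result does not depend on the number of workers, which is what lets you add machines without changing the answer.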
When it comes to vertical scaling, the logic remains unaltered. The same code is executed on a higher-capacity machine. Multi-threading is used for concurrent programming that runs on multi-cores of the device processor. Amazon RDS and MySQL are normally used for this process.
Both vertical and horizontal scaling solutions come with pros and cons. As vertical scaling doesn’t change the logic, it is easy to implement and manage. The data resides on a single node, runs on multiple cores, and is easy to operate. With a shared address space, you can share data and messages easily and cost-effectively by passing a reference. With a smaller footprint, power and cooling costs are reduced. Software costs stay low, and administering a single device is easy.
The downside of vertical scaling is that there is an upper limit for scalability. You can upgrade the machine to a certain configuration, and after that, your upgrade options run out. Scaling up also results in downtime, as you have to shut down the device and move the application to a more powerful machine. A single device is a single point of failure as well.
When it comes to horizontal scaling, the capacity of a single machine doesn’t matter. You can add as many devices as you need without downtime, which enhances resilience. You can distribute application instances across multiple systems and perform parallel execution with ease. However, the data is partitioned and runs on multiple devices. Because the nodes have no shared address space, sharing and processing data is harder: you have to pass copies of the data around instead of a reference to it.
Vertical vs Horizontal Scaling | Vertical Scaling | Horizontal Scaling |
Data | Data resides on a single node | Data is partitioned across multiple nodes |
Data Management | Easy to manage – share data reference | Complex task as there is no shared address space |
Downtime | Downtime while upgrading the machine | No downtime |
Upper limit | Limited by machine specifications | Not limited by machine specifications |
Cost | Lower licensing fee | Higher licensing fee |
Both vertical and horizontal scaling techniques can be applied to a single application, wherein some parts of the application scale up while other parts scale out.
When it comes to vertical vs horizontal scaling, it is not easy to decide which model to choose. Here are a few considerations:
When your business caters to a global audience, you need to deliver applications across geographical regions. To efficiently manage geo-latency, disasters and downtimes, choose horizontal scaling. You can also deal with local regulatory compliance issues.
When you scale up, you don’t have the flexibility to choose optimal configurations for specific loads dynamically. The configuration power will limit the performance of the system. When you scale out, you can select the configuration to increase the performance and optimize costs. However, it is essential to check if a single device can handle that load. In that case, adding more power instead of deploying multiple machines for the same purpose is beneficial.
When you scale up, you are locked with a single device, resulting in a single point of failure. When you scale out, it will offer built-in redundancy. However, you need to consider the costs of running a single device versus many.
In some instances, you don’t have to stick to a particular scalable model. For example, if you use distributed storage systems in the data center, you would be switching between the distributed systems and the single disk mechanism. In such cases, you can try both vertical and horizontal scaling models so that you can easily switch between them. However, to make such a switch, your application should be designed with decoupled services so that some layers can be scaled up while others are scaled out.
Choosing between vertical and horizontal scaling also depends on the application architecture. For instance, applications built on a stateless architecture suit horizontal scaling. When sessions are tracked on the server side, each session is tied to a specific server. With a stateless design, thousands of sessions of a single application can be spread across multiple servers, so horizontal scaling is optimal.
Secondly, an application built on service-oriented architecture (SOA) suits the distributed deployment of services across various systems. With a microservices architecture, the data, web, application, and caching tiers are decoupled, so you don’t have to scale every tier to match the demand on the services tier.
When it comes to vertical scaling, consider the finite number of times you can scale up. When the machine reaches its upgradable limit, you have to purchase another machine. So, you should be prepared to scale resources every time traffic spikes up. Moreover, consider the downtime for scaling up and make sure that it doesn’t affect the business performance.
Modern multi-core processors are significantly cost-effective. So, instead of going for an entirely new device, upgrading the processor might give you the required speed and performance at a lower cost. At the same time, a 256-core server price is equal to the price of 30-40 4-core server machines. It is easy and cost-effective to scale up tens or hundreds of machines. However, when you scale up thousands of devices, the costs are enormous.
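The arithmetic behind such a comparison can be sketched as follows. The unit price is purely hypothetical, and the big-server price simply takes the midpoint (35x) of the 30-40x range quoted above; the calculation covers hardware purchase price only and ignores licensing, power, networking, and administration.

```python
import math


def machines_needed(target_cores, cores_per_machine):
    """Number of machines required to reach at least target_cores."""
    return math.ceil(target_cores / cores_per_machine)


def scale_out_cost(target_cores, cores_per_machine, price_per_machine):
    """Hardware cost of reaching target_cores with identical small machines."""
    return machines_needed(target_cores, cores_per_machine) * price_per_machine


# Assumptions for illustration only, not real prices.
SMALL_MACHINE_PRICE = 1_000             # hypothetical price of one 4-core server
BIG_MACHINE_PRICE = 35 * SMALL_MACHINE_PRICE  # midpoint of the 30-40x range

small_fleet_size = machines_needed(256, 4)                     # 64 machines
small_fleet_cost = scale_out_cost(256, 4, SMALL_MACHINE_PRICE)  # 64,000
big_server_cost = BIG_MACHINE_PRICE                             # 35,000
```

Under these assumed numbers, matching 256 cores takes 64 small machines. Plug in your own quotes before drawing conclusions, since the result is very sensitive to the price ratio.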
Vertical scaling fares well in the initial stages as there are fewer machines and less administrative overhead. However, as the configuration increases, designing multi-core architecture is a complex task that involves additional expenses.
Horizontal scaling removes the configuration upgrade costs in the beginning. However, as you scale out, it increases the machine footprint and thereby increases administration overhead costs. It all depends on where you stand in the scalability journey when it comes to vertical vs horizontal scaling.
AWS offers AWS Auto Scaling, a feature that enables you to dynamically and automatically scale different components of the architecture. This tool monitors the capacity and performance of your application and cost-effectively scales it to suit your pre-defined policy. You can scale multiple components as a group or choose to scale specific components. This is how you implement horizontal scaling in AWS.
The intuitive dashboard allows you to create a scaling plan for each app resource, such as Amazon Elastic Compute Cloud (EC2) instances, Aurora replicas, DynamoDB indexes and tables, Spot Fleets, and Amazon ECS tasks. It allows you to optimize for cost, for performance, or for a balance of both. In addition to the dashboard, AWS offers a Command Line Interface (CLI) and Software Development Kits (SDKs) to manage scaling plans.
Before scaling resources, you can create a backup as an Amazon Machine Image (AMI) or an Amazon Elastic Block Store (EBS) snapshot. You can also combine Amazon EC2 Auto Scaling with AWS Auto Scaling to scale a variety of AWS resources.
How does horizontal scaling in AWS work?
1) Define and configure your unified scaling policy for each app resource.
2) Ensure that the app has a system to add/modify/delete resources as per the changing requirements.
3) Identify specific services that can be scaled.
4) Choose your optimization strategy: cost vs. performance.
5) Monitor scaling load to get clear insights into capacity management.
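The steps above boil down to a policy (limits plus a target) and a rule that turns a monitored metric into a desired instance count. The sketch below uses a simple proportional rule. It is an illustration of the idea only, not AWS's actual algorithm, and `ScalingPolicy` and `desired_capacity` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    min_instances: int
    max_instances: int
    target_cpu_percent: float  # average CPU the fleet should settle around


def desired_capacity(policy, current_instances, observed_cpu_percent):
    """Proportional rule: size the fleet so average CPU lands near the target."""
    if current_instances < 1:
        return policy.min_instances
    raw = math.ceil(
        current_instances * observed_cpu_percent / policy.target_cpu_percent
    )
    # Never go below the floor or above the ceiling set by the policy.
    return max(policy.min_instances, min(policy.max_instances, raw))
```

For example, with a 50% CPU target, four instances running at 75% CPU would be scaled out to six, while the `min_instances` and `max_instances` bounds keep a noisy metric from shrinking or growing the fleet without limit.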
AWS auto-scaling feature comes with multiple benefits
You can choose to create your optimization strategy as well.
Manually scaling up resources on AWS is pretty simple. To vertically scale a resource, simply change the size of the instance. For instance, if you are using a t2.medium instance, you can change it to a t2.large instance. Similarly, scaling down is about decreasing the size of the instance to t2.small, t2.micro, t2.nano, etc. The good thing is that you don’t have to define any scaling rules. However, you’ll face a brief downtime, because the instance has to be stopped before its type can be changed.
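A manual resize can also be scripted. The sketch below assumes an EBS-backed instance and takes an `ec2` object that behaves like `boto3.client("ec2")`; the helper name `resize_instance` and the instance ID in the usage comment are hypothetical.

```python
def resize_instance(ec2, instance_id, new_type):
    """Stop an instance, change its type, and start it again.

    `ec2` is expected to behave like boto3.client("ec2").
    The instance is unavailable between the stop and the start.
    """
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # The instance type can only be changed while the instance is stopped.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Usage (requires boto3 and AWS credentials; the instance ID is made up):
#   import boto3
#   resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "t2.large")
```

Passing the client in as a parameter keeps the function easy to test against a stub before pointing it at a real account.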
Automatic vertical scaling is not available out of the box on AWS, because resizing results in downtime when the instance has to be moved to a new machine or restarted on a higher configuration. However, AWS now offers a roundabout way to perform vertical scaling automatically. AWS Ops Automator is an AWS tool that helps you manage your AWS infrastructure, and AWS Ops Automator V2 introduces a vertical scaling feature. It adjusts the capacity of the resource automatically at the lowest cost.
How does it work?
Here are the steps to set up vertical scaling on AWS:
1) Define an event or time for triggering the scaling event.
2) Choose whether to re-provision or resize the instance.
3) Amazon CloudWatch monitors traffic loads and triggers the scaling task as per a predefined time or event.
4) AWS Lambda starts the scaling process. Based on your preference, it will resize the instance up or down or terminate the original instance and start new instances based on the scaling requirements.
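The resize decision in the last step can be pictured as moving one rung up or down a ladder of instance sizes. This is a sketch of that decision only, not the logic AWS Ops Automator uses; the CPU thresholds and the function name `next_instance_type` are assumptions.

```python
# Sizes of the t2 family, smallest to largest.
T2_SIZES = [
    "t2.nano", "t2.micro", "t2.small", "t2.medium",
    "t2.large", "t2.xlarge", "t2.2xlarge",
]


def next_instance_type(current, cpu_percent, scale_up_at=80.0, scale_down_at=20.0):
    """Return the instance type to resize to, or `current` if no change is needed.

    The thresholds are hypothetical defaults; tune them to your workload.
    """
    idx = T2_SIZES.index(current)
    if cpu_percent >= scale_up_at and idx < len(T2_SIZES) - 1:
        return T2_SIZES[idx + 1]   # one size up
    if cpu_percent <= scale_down_at and idx > 0:
        return T2_SIZES[idx - 1]   # one size down
    return current                 # within bounds, or already at the limit
```

Note how the top of the ladder makes the upper limit of vertical scaling explicit: once you are on the largest size, a high CPU reading no longer changes anything.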
Application scalability is the capacity of an application to handle multiple requests per minute. Consider a simple web application architecture wherein an application is hosted on a web server that is connected to a database server. The web server listens to client requests, contacts the database server, delivers the required information, or performs a task based on the client’s request. However, as the client base grows, thousands of clients will concurrently place requests to the webserver. The web server will run out of compute, storage, and I/O resources which means it’s time to ponder on vertical vs horizontal scaling.
When vertical scaling is applied, the CPU, RAM, and Storage resources are upgraded, keeping the architecture as it is. When the traffic grows further, you need to upgrade the configuration of the webserver again. However, you are limited with a finite number of upgrades. Secondly, when the webserver is single-handedly managing thousands of client requests, the performance can diminish over time.
In a horizontal scaling scenario, administrators increase the number of servers in the infrastructure to handle thousands of concurrent requests instead of increasing the server configuration. So, the traffic is distributed across multiple servers using a load balancer that receives client requests and routes each one to an available server using a routing method such as Round Robin. This setup improves fault tolerance and increases the performance of the web application. However, the load balancer itself can become a single point of failure, so two or three load balancers are typically deployed.
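Round Robin routing is simple enough to show in a few lines. The class below is a toy illustration (the name `RoundRobinBalancer` is made up); real load balancers also perform health checks and skip servers that are down.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(list(servers))

    def route(self, request):
        # The request content is ignored; only arrival order matters.
        return next(self._rotation)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
assignments = [balancer.route(f"request-{i}") for i in range(5)]
# assignments == ["server-a", "server-b", "server-c", "server-a", "server-b"]
```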
Another issue with distributed application environments is remembering sessions. For instance, when a client visits the website, the load balancer routes the client to server A, and that session is stored on server A. When the client revisits the website, the load balancer might send the client to server C. In that case, the earlier session is not available on server C, and the client has to log in on every visit. To resolve this issue, administrators add a Redis server that stores and manages sessions for all web servers.
Now, the Redis server might become a single point of failure. So, you’ll have to add redundancy to the Redis server. While horizontal scaling offers flexibility, speed, and performance, distributing application layers across multiple web servers and database servers is difficult. You need to monitor and manage the setup carefully.
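The shared session store described above can be sketched like this. An in-memory dictionary stands in for Redis so the example runs anywhere; `SessionStore` and `WebServer` are illustrative names, not part of any library.

```python
import uuid


class SessionStore:
    """In-memory stand-in for a shared session store such as Redis."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, payload):
        self._data[session_id] = payload

    def load(self, session_id):
        return self._data.get(session_id)


class WebServer:
    """A web server that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user):
        session_id = uuid.uuid4().hex
        self.store.save(session_id, {"user": user})
        return session_id

    def handle(self, session_id):
        session = self.store.load(session_id)
        if session is None:
            return f"{self.name}: please log in"
        return f"{self.name}: welcome back, {session['user']}"


store = SessionStore()
server_a, server_c = WebServer("A", store), WebServer("C", store)
sid = server_a.login("alice")    # first visit lands on server A
reply = server_c.handle(sid)     # later visit lands on server C
# reply == "C: welcome back, alice"
```

Because both servers read from the same store, it no longer matters which one the load balancer picks.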
As with web applications, databases can be scaled up or scaled out to meet changing storage requirements. Databases typically run on racks of servers. When databases grow, you need to increase the capacity of the data center. If you add more resources to the existing servers, it is called vertical scaling. Here, data resides on a single node, and you just need to share the data reference.
So, data management is easy. However, scalability is limited to a finite number of upgrades. When traffic grows, the database has to answer thousands of queries simultaneously, and a single node becomes both a bottleneck and a single point of failure. Relational databases such as MySQL, SQL Server, and Oracle usually suit vertical scaling.
Alternatively, when adding new resources doesn’t serve the purpose, you need to add new servers, which is horizontal scaling. As the name says, horizontal scaling increases the data center capacity horizontally while vertical scaling increases it vertically. In horizontal scaling, data is partitioned and processed on multiple machines.
As such, there is no shared address space, and you have to share copies of data. However, growing traffic is less of a concern because requests are distributed across multiple systems, which helps maintain speed and performance. Non-relational (NoSQL) databases such as Cassandra and MongoDB suit horizontal scaling.
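One common way to partition data across machines is to hash each key and use the hash to pick a node. The sketch below uses plain modulo hashing with in-memory dictionaries standing in for database nodes; `shard_for` and `ShardedStore` are hypothetical names.

```python
import hashlib


def shard_for(key, n_shards):
    """Map a key to a shard index in a stable, process-independent way."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards


class ShardedStore:
    """Key-value store whose data is partitioned across several shards."""

    def __init__(self, n_shards):
        # Each dict stands in for a separate database node.
        self.shards = [dict() for _ in range(n_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

With modulo hashing, changing the number of shards remaps most keys, which is one reason distributed databases often use consistent hashing or similar schemes instead.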
Vertical Scaling | Horizontal Scaling |
Systems expand vertically | Systems expand horizontally |
More resources are added to existing systems | More server racks are added to existing systems |
Easy to implement and takes less time | Difficult to implement and involves more configuration and time |
Cost-effective as only new resources are added | New server racks involve huge costs |
Data is stored on a single node | Data is partitioned |
Single point of failure | No single point of failure |
Performance is affected when queries increase | Speed and performance are less affected by growing query volume |
Suited for relational (SQL) databases | Suited for non-relational (NoSQL) databases |
The vertical vs horizontal scaling debate on AWS has been around for some years now, and it is here to stay. So, the question is not about finding out what’s best but about identifying the right scalability model for your IT requirements. It all depends on your application architecture, your scalability requirements, and where you currently stand in terms of scale. Horizontal scaling on AWS brings flexibility, elasticity, and virtually unlimited resource availability to the table.
However, it also brings complex cluster setup, cluster management, communication and consistency issues, maintenance, and overhead costs. As the project scales out, the complexity increases too. So, check whether scaling up the server serves the purpose first, keeping in mind that scalability is limited to the maximum upgradable capacity of the server.
Vertical scaling is better when your application receives decent traffic. However, when the application has to cater to hundreds of thousands of concurrent requests, horizontal scaling is better as you can perform seamless scaling while gaining speed, elasticity, and performance. You don’t face a resource deficit.
It depends on the type of upgrade you perform and the level of scalability. A medium-level processor might be cheaper than a new machine. However, a high-end processor equals 20-30 regular machines. So, calculate your costs based on the type of upgrade you intend to perform. Similarly, horizontal scaling is cheaper initially but will incur overhead costs and licensing costs as the machines grow.
In AWS, vertical scaling is about changing the instance up and down, and horizontal scaling is about adding more machines of similar capacity to the infrastructure.
Using AWS, you can autoscale different resources such as Amazon EC2 Auto Scaling groups, Amazon EC2 Spot Fleets, Amazon Elastic Container Service (ECS), Aurora Replicas, and DynamoDB.