Chapter 2 - Kubernetes Cluster Autoscaler
The Cluster Autoscaler (CAS) is a tool that automatically adjusts the size of a Kubernetes cluster based on resource needs. It continuously watches the API server for pods that cannot be scheduled and creates new nodes to host them. It also identifies underutilized nodes and removes them after migrating their pods to other nodes.
Some key features of the Cluster Autoscaler include Kubernetes resource-conscious scaling, leveraging expanders for multiple node groups, and respecting Kubernetes’ pod disruption budgets (PDBs) and scheduling constraints. It can be set up using autodiscovery or through manual methods. In this article, we’ll review these concepts and demonstrate both setup processes step by step.
The CAS adds or removes nodes based on the resource needs of workloads, scaling out when it detects unschedulable pods. The goal is to maintain just enough capacity to handle demand without manual intervention. Node autoscaling is one of the three pillars of Kubernetes autoscaling, alongside horizontal and vertical pod autoscaling.
The Cluster Autoscaler works by monitoring the pending pods in the cluster. If there are pods that cannot be scheduled due to insufficient resources, it scales up the cluster. Conversely, it scales down when nodes are underutilized to save costs.
This tool is particularly useful in dynamic environments with fluctuating workloads, helping maintain the balance between performance and cost efficiency. The Cluster Autoscaler simplifies operations and enhances scalability by automating resource management.
The CAS looks for pods that can’t be scheduled, effectively monitoring the resource usage of your Kubernetes cluster. When it detects that pods cannot be scheduled due to insufficient resources, like memory or CPU, it automatically adds new nodes. This ensures that applications have the resources they need to run smoothly.
Conversely, if the Cluster Autoscaler identifies underutilized nodes, it removes them. This helps optimize resource usage and reduce costs. The process involves migrating pods from underutilized nodes before shutting them down.
The CAS continuously watches the API server for unschedulable pods, checking every 10 seconds by default. A pod is considered unschedulable when the Kubernetes scheduler cannot find a node with sufficient resources to accommodate it, which is reflected in the pod's PodScheduled condition being set to False. When such pods are found, the Cluster Autoscaler attempts to find or create new nodes that can host them.
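To see which pods the CAS would act on, you can list pods stuck in the Pending phase and inspect their PodScheduled condition. The commands below are a minimal illustration; the pod name my-app-7d4b9c is a placeholder:

kubectl get pods --all-namespaces --field-selector=status.phase=Pending
kubectl get pod my-app-7d4b9c -o jsonpath='{.status.conditions[?(@.type=="PodScheduled")]}'

For an unschedulable pod, the returned condition shows "status":"False" with the reason "Unschedulable".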
The autoscaler operates under the assumption that all machines within a node group have identical capacities and labels. Scaling up adds a node similar to the existing ones in the node group; such a node initially hosts no user-created pods, only static pods from node manifests and DaemonSet pods. The CAS builds a template node for each node group and simulates whether the unschedulable pods would fit on such a new node, using a simplified scheduling process that may require multiple iterations.
Node creation speed depends on the cloud provider and the provisioning process, including TLS bootstrapping. The Cluster Autoscaler expects new nodes to register within 15 minutes; otherwise, it stops considering them and may attempt to scale up a different node group. This way, the CAS quickly adjusts to changing resource needs while minimizing delays in pod scheduling.
The CAS checks for unneeded nodes every 10 seconds when no scale-up is required; the check interval can be configured via the --scan-interval flag. By default, a node is considered unneeded when the sum of the CPU and memory requests of its pods is below 50% of the node's allocatable capacity, all of its pods can be moved to other nodes, and scale-down has not been disabled for that node.
If a node remains unneeded for over 10 minutes, it will be terminated. This interval is configurable, and the autoscaler only terminates one non-empty node at a time in order to minimize disruption. Empty nodes can be terminated in bulk, up to 10 at a time, which is also configurable.
When a non-empty node is terminated, its pods are drained, and the node is cordoned to prevent rescheduling. DaemonSet pods can be configured for eviction on both empty and non-empty nodes using specific flags. This careful process ensures efficient scaling down while maintaining cluster stability.
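These intervals and eviction behaviors are controlled by flags on the cluster-autoscaler container. Below is a minimal sketch of how they might appear in the deployment's command section; the values shown are simply the defaults described above, not recommendations:

command:
  - ./cluster-autoscaler
  - --cloud-provider=aws
  - --scan-interval=10s                           # how often unschedulable pods and unneeded nodes are checked
  - --scale-down-unneeded-time=10m                # how long a node must stay unneeded before removal
  - --scale-down-utilization-threshold=0.5        # request utilization below which a node is a removal candidate
  - --max-empty-bulk-delete=10                    # how many empty nodes can be removed at once
  - --max-node-provision-time=15m                 # how long to wait for a new node to register
  - --daemonset-eviction-for-empty-nodes=false    # whether DaemonSet pods are evicted from empty nodes
  - --daemonset-eviction-for-occupied-nodes=true  # whether DaemonSet pods are evicted from non-empty nodes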
Although the CAS is a node-centric autoscaler, it relies on pod scheduling mechanics. Its primary signal for scaling is whether every pod in the cluster can be scheduled; in addition, it aims to ensure that no underutilized nodes remain in the cluster.
When the Cluster Autoscaler detects unschedulable pods, it decides which node group to expand by using expanders, which determine the strategy for selecting the appropriate node group for scaling. You can specify the desired expander using the --expander flag, which provides different strategies for optimizing node selection.
The Cluster Autoscaler offers several expanders, each of which features unique advantages. The default expander is random, suitable when no specific node group needs priority. Other expanders include most-pods for maximizing pod scheduling, least-waste for efficient resource utilization, least-nodes for minimizing node count, price for cost efficiency, and priority for user-defined preferences.
Starting with version 1.23.0, multiple expanders can be used together. This allows you to create a hierarchy of expanders where the output of one feeds into the next. For example, combining priority and least-waste can produce optimal scaling decisions based on both user priorities and resource efficiency.
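For illustration, here is how chained expanders might be configured on the cluster-autoscaler command line, along with the ConfigMap that the priority expander reads its preferences from (the node group name patterns are placeholders; higher numbers mean higher priority):

- --expander=priority,least-waste

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    50:
      - .*spot.*
    10:
      - .*on-demand.*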
The Cluster Autoscaler respects Kubernetes pod disruption budgets (PDBs) when scaling down nodes, making sure that critical pods are not disrupted. A PDB limits how many pods of a replicated application can be down at once due to voluntary disruptions, expressed as either a minimum number of available pods or a maximum number of unavailable ones, protecting essential workloads from being interrupted.
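For instance, a PodDisruptionBudget like the hypothetical one below tells the autoscaler it may not drain a node if doing so would leave fewer than two my-app pods running:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2        # at least two pods must stay available during voluntary disruptions
  selector:
    matchLabels:
      app: my-app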
The autoscaler also considers pod priority and preemption settings to determine which pods can be safely rescheduled or disrupted. This ensures that high-priority pods remain operational while low-priority ones are considered for eviction.
There are two primary ways to set up the Cluster Autoscaler: autodiscovery and manual. We will go through each of them step by step.
To enable autodiscovery, you need to tag your autoscaling groups (ASGs) with specific key-value pairs that the Autoscaler recognizes. This setup simplifies scaling by automatically identifying which groups to scale based on the tags.
The autodiscovery setup is particularly useful for dynamic environments where autoscaling groups may change frequently. By relying on tags, the CAS can automatically adapt to new groups or configurations without requiring manual intervention.
To enable autodiscovery in Cluster Autoscaler, follow these steps.
The IAM role needs a policy that includes the permissions to list, describe, and manage nodes in an autoscaling group:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeScalingActivities",
        "ec2:DescribeImages",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeLaunchTemplateVersions",
        "ec2:GetInstanceTypesFromInstanceRequirements",
        "eks:DescribeNodegroup"
      ],
      "Resource": ["*"]
    },
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": ["*"]
    }
  ]
}
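One way to create this policy and attach it to the role used by the Cluster Autoscaler (either the node role or an IRSA role) is with the AWS CLI. The file and policy names below are placeholders, assuming the JSON above is saved as cas-policy.json:

aws iam create-policy \
  --policy-name ClusterAutoscalerPolicy \
  --policy-document file://cas-policy.json
aws iam attach-role-policy \
  --role-name <CAS_ROLE_NAME> \
  --policy-arn arn:aws:iam::<YOUR_AWS_ACCOUNT_ID>:policy/ClusterAutoscalerPolicy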
Using the autodiscovery option requires adding tags. Tag your ASGs as follows:
k8s.io/cluster-autoscaler/enabled: ""
k8s.io/cluster-autoscaler/<CLUSTER_NAME>: ""
These tags tell the autoscaler which groups are part of your Kubernetes cluster and should be considered for scaling operations.
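If your ASGs were created outside of EKS managed node groups and are not tagged yet, you can add the tags with the AWS CLI. Here, <ASG_NAME> and <CLUSTER_NAME> are placeholders; the autoscaler only checks the tag keys, so the values can stay empty:

aws autoscaling create-or-update-tags --tags \
  "ResourceId=<ASG_NAME>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=,PropagateAtLaunch=false" \
  "ResourceId=<ASG_NAME>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/<CLUSTER_NAME>,Value=,PropagateAtLaunch=false"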
Download the autodiscovery manifest file from the CAS GitHub repo and edit it as shown below:
Note: It's recommended that you use the same minor version of the Cluster Autoscaler as your Kubernetes version. In the deployment named cluster-autoscaler (line 145), change the image version to reflect the minor version of the Kubernetes cluster that you're using. In the example below, we are running Kubernetes version 1.30, so we'll use v1.30.0 of CAS.
containers:
- image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.30.0
In the deployment named cluster-autoscaler (line 165), change the --node-group-auto-discovery line by substituting your cluster name for <YOUR CLUSTER NAME>:
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>
Install CAS by using this command:
kubectl apply -f cluster-autoscaler-autodiscovery.yaml
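After applying the manifest, you can check that the autoscaler (deployed to the kube-system namespace by the official manifest) is running and follow its scaling decisions in the logs:

kubectl -n kube-system get deployment cluster-autoscaler
kubectl -n kube-system logs -f deployment/cluster-autoscaler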
The manual setup option for Cluster Autoscaler involves explicitly specifying the names of the autoscaling groups you want to manage. This method requires you to list the desired groups in the Autoscaler’s deployment configuration.
The manual setup is ideal for static environments where autoscaling groups are stable and unlikely to change frequently. This way, only specific groups are managed by the Autoscaler, providing precise control over scaling operations. It offers more control over which groups are included but requires more maintenance effort compared to the autodiscovery approach, especially in dynamic environments.
Here are the steps to follow.
For the manual setup, the IAM role looks a bit different from what we saw earlier:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeScalingActivities",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "eks:DescribeNodegroup"
      ],
      "Resource": ["arn:aws:autoscaling:${YOUR_CLUSTER_AWS_REGION}:${YOUR_AWS_ACCOUNT_ID}:autoScalingGroup:*:autoScalingGroupName/${YOUR_ASG_NAME}"]
    }
  ]
}
Download the Multi-ASG manifest file from the CAS GitHub repo and edit it as shown below:
Note: It's recommended that you use the same minor version of the Cluster Autoscaler as your Kubernetes version. In the deployment named cluster-autoscaler (line 145), change the image version to reflect the minor version of the Kubernetes cluster that you're using. In the example below, we are running Kubernetes version 1.30, so we'll use v1.30.0 of CAS.
containers:
- image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.30.0
Define your autoscaling groups in the Cluster Autoscaler configuration. To configure the manual setup, pass the --nodes flag with the format minNodes:maxNodes:ASGName on the Cluster Autoscaler command line. This flag defines the minimum and maximum node count for each specified autoscaling group.
Here is how it looks in the command line:
./cluster-autoscaler --cloud-provider=aws --nodes=1:10:<ASG_NAME_1> --nodes=2:20:<ASG_NAME_2>
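In the multi-ASG manifest, the same flags go into the cluster-autoscaler container's command section. A sketch with two placeholder node groups might look like this:

command:
  - ./cluster-autoscaler
  - --v=4
  - --stderrthreshold=info
  - --cloud-provider=aws
  - --nodes=1:10:<ASG_NAME_1>
  - --nodes=2:20:<ASG_NAME_2>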
Install CAS by using this command:
kubectl apply -f cluster-autoscaler-multi-asg.yaml
The Cluster Autoscaler may face challenges with scaling granularity, leading to potential issues such as overprovisioning or underprovisioning resources. When scaling up, the Cluster Autoscaler might add more nodes than necessary, resulting in unused capacity and increased costs. Conversely, during scale-down operations, it might not remove enough nodes, leaving the cluster with excess resources that are not optimally utilized.
Granular scaling decisions can be complex due to varying workloads and unpredictable demand patterns. The CAS uses predefined thresholds to determine when to add or remove nodes, but these thresholds may not always align perfectly with real-world application needs. As a result, finding the right balance between performance and cost efficiency can be challenging.
To address these granularity issues, it is crucial to configure the Cluster Autoscaler with accurate scaling parameters and limits. Regularly monitoring cluster performance and adjusting thresholds based on observed workload patterns can help improve scaling decisions. This proactive approach helps ensure that the Cluster Autoscaler operates efficiently while maintaining the desired level of resource utilization.
The Cluster Autoscaler may encounter limitations when managing diverse node groups with varying instance types. Different workloads require different types of nodes, such as high CPU, high memory, or GPU-enabled nodes. However, the CAS may struggle to effectively balance resource allocation across these heterogeneous groups.
One of the challenges is ensuring that the right node group is selected for scaling based on the specific needs of the unschedulable pods. This requires precise configuration and understanding of each node group’s capabilities and limitations. In some cases, the Cluster Autoscaler may not fully utilize the diverse resources available, leading to resource allocation inefficiencies.
To mitigate these node group limitations, organizations should carefully plan their node group configurations and scaling policies. Grouping similar workloads and defining clear scaling priorities can help the Autoscaler make more informed decisions. By aligning node group management with application requirements, it can optimize resource allocation and improve overall cluster performance.
Using the Cluster Autoscaler may introduce performance overhead that affects the efficiency of the cluster. It continuously monitors resource usage and makes scaling decisions, which can consume computational resources and network bandwidth. In clusters with a high number of workloads, this monitoring process might lead to increased latency and resource contention.
Another potential overhead comes from the time required to scale nodes up or down, particularly in large clusters. Adding or removing nodes is not instantaneous and can introduce delays, impacting the ability to respond quickly to sudden changes in demand.
To minimize performance overheads, it’s essential to optimize the configuration of the Cluster Autoscaler. Adjusting the frequency of resource checks and tuning scaling thresholds can help reduce unnecessary computations. Additionally, ensuring that scaling actions are well-aligned with workload demands can minimize delays and improve responsiveness.
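A few flags are the usual starting points for this kind of tuning; the values below are purely illustrative, not recommendations:

- --scan-interval=30s                  # check less frequently to reduce API and CPU overhead
- --scale-down-delay-after-add=10m     # wait after a scale-up before evaluating scale-down
- --scale-down-delay-after-failure=3m  # back off after a failed scale-down
- --max-nodes-total=100                # hard upper bound on cluster size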
While the Cluster Autoscaler offers a reliable way to manage resources in Kubernetes clusters, it might not fully address some advanced autoscaling features available in modern solutions. Newer autoscalers, like Karpenter, provide more sophisticated capabilities, such as faster scaling, improved resource utilization, and better integration with diverse workloads. While we have a separate comparison of Karpenter and Cluster Autoscaler, here is a summary of the major differences.
While the Cluster Autoscaler itself is easier to set up, it doesn't cover node provisioning, which needs to be configured separately. Specifically, managing node groups involves creating, tweaking, and tuning them outside of Kubernetes objects. This makes managing the full setup of CAS and node management with Terraform or other IaC tools more complex and harder to maintain than Karpenter.
A few more advanced features are also missing on the CAS side, such as the faster scaling, improved resource utilization, and broader workload integration mentioned above.
The Cluster Autoscaler is traditionally a go-to tool for automatically adjusting Kubernetes cluster size based on resource needs. It ensures applications have the necessary resources by monitoring pending pods and scaling up or down accordingly. The CAS’s key features include resource-conscious scaling, leveraging expanders for multiple node groups, and respecting Kubernetes’ PDBs and scheduling constraints.
In this article, we outlined how to install CAS using both the autodiscovery and manual setups. Even in autodiscovery mode, CAS requires all node groups to already exist, meaning they must be created and managed by external automation. The manual method makes this even harder because you must track those node groups and add them to the CAS configuration.
An alternative way to overcome these obstacles might be to consider Karpenter, the next-generation Kubernetes node autoscaler. In our next article, we outline a comprehensive comparison between CAS and Karpenter.