As the complexity of Kubernetes environments grows, costs can quickly spiral out of control without an effective optimization strategy in place. We’ve compiled expert recommendations and best practices for running cost-optimized Kubernetes workloads on AWS, Microsoft Azure, and Google Cloud (GCP).

Gaining Complete Kubernetes Cost Visibility

Gaining visibility into your container cost and usage data is the first step to controlling and optimizing Kubernetes costs. Visibility is critical at each level of your Kubernetes deployment:

Clusters

Nodes

Pods (Namespaces, Labels, and Deployments)

Containers

You will also want visibility into each business transaction. Having deep visibility will help you:

  1. Avoid cloud “bill shock” (a common and painful event in which stakeholders discover after the fact that they have overspent their cloud budget)
  2. Detect anomalies
  3. Identify ways to further optimize your Kubernetes costs

For example, when using Kubernetes for development purposes, visibility helps you identify dev clusters left running outside business hours so you can pause them. In a production environment, visibility helps you identify cost spikes originating from the deployment of a new release, see the overall costs of an application, and identify cost per customer or line of business.

Detecting Kubernetes Cost Anomalies

“Bill shock” is too common an occurrence for businesses that have invested in Kubernetes. Anomaly detection intelligence will continuously monitor your usage and cost data and automatically and immediately alert relevant stakeholders on your team so they can take corrective action.

Anomalies can occur due to a wide variety of factors and in many situations. Common anomaly causes include:

  • A new deployment consuming more resources than a previous one
  • A new pod being added to your cluster
  • Suboptimal scaling rules causing inefficient scale-up
  • Misconfigured (or unconfigured) pod resource request specifications, such as specifying GiB instead of MiB (see the sketch after this list)
  • Affinity rules causing unneeded nodes to be added
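
To make the misconfiguration cause concrete, here is a minimal, hypothetical sketch of a unit mix-up in a pod spec (the values are illustrative):

  resources:
    requests:
      memory: 512Gi   # typo: requests 512 GiB of RAM per pod
      # memory: 512Mi # intended: 512 MiB; the typo overallocates by 1,024x

A single wrong suffix like this can force the autoscaler to add memory-heavy nodes, which is exactly the kind of spike anomaly detection should catch.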

Save your team the pain of end-of-month invoice shock. Any organization running Kubernetes clusters should have mechanisms for K8s anomaly detection and anomaly alerting in place — full stop.

Optimizing Pod Resource Requests

Have organizational policies in place for setting pod CPU and memory requests and limits in your YAML definition files. Once your containers are running, you gain visibility into the utilization and costs of each portion of your cluster: namespaces, labels, nodes, and pods. This is the time to tune your resource request and limit values based on actual utilization metrics.

Kubernetes allows you to fine-tune resource requests with fine granularity, down to a single MiB of RAM and a thousandth of a CPU (1m), so there is no reason to overprovision and end up with low utilization of the allocated resources.
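
As a minimal sketch (the names and values are illustrative, not a recommendation), a tuned container spec might look like this:

  apiVersion: v1
  kind: Pod
  metadata:
    name: tuned-app            # hypothetical name
  spec:
    containers:
      - name: app
        image: registry.example.com/app:1.0   # hypothetical image
        resources:
          requests:
            cpu: 250m          # a quarter of a CPU, sized from observed utilization
            memory: 320Mi
          limits:
            cpu: 500m          # headroom for bursts without hogging the node
            memory: 512Mi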

Node Configuration 

Node cost is driven by various factors, many of which can be addressed at the configuration level. These include the CPU and memory resources powering each node, OS choice, processor type and vendor, disk space and type, network cards, and more. 

When configuring your nodes:

  • Use open-source OSes to avoid costly licenses like those required for Windows, RHEL, and SUSE
  • Favor cost-effective processors that offer the best price-performance:
    • On AWS, use Graviton-powered instances (Arm64 processor architecture)
    • In GCP, favor Tau instances powered by the latest AMD EPYC processors
  • Pick nodes that best fit your pods’ needs, including the right amounts of vCPU and memory and a ratio between the two that matches your pods’ requirements (see the sketch after this list).
    • For example, if your containers need a memory-to-vCPU ratio of 8 (8 GiB of RAM per vCPU), favor node types with that ratio, such as:
      • AWS R instances
      • Azure Edv5 VMs
      • GCP n2d-highmem-2 machine types
    • You can then match each pod to node options with the vCPU and memory ratio it needs.
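
Here is a minimal sketch of the idea on AWS (the instance type, names, and values are illustrative): a memory-heavy pod steered to a 1:8 node type via the standard instance-type label.

  apiVersion: v1
  kind: Pod
  metadata:
    name: memory-heavy-app     # hypothetical name
  spec:
    nodeSelector:
      node.kubernetes.io/instance-type: r6i.xlarge   # 4 vCPU, 32 GiB: a 1:8 ratio
    containers:
      - name: app
        image: registry.example.com/app:1.0          # hypothetical image
        resources:
          requests:
            cpu: "2"
            memory: 16Gi       # matches the node's 1:8 vCPU-to-memory ratio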

Processor Selection

For many years, all three leading cloud vendors offered only Intel-powered compute resources. Recently, however, all three have introduced meaningful processor choice, each option with its own cost impact. We have benefited from the entry of AMD-powered instances (AWS, Azure, and GCP) and Arm-architecture Graviton-powered instances (AWS).

These new processors introduce ways to gain better performance while reducing costs. In the AWS case, AMD-powered instances cost about 10% less than comparable Intel-powered instances, and Graviton instances cost about 20% less. To run on Graviton instances, build multi-architecture container images that can run on both x86 (Intel, AMD) and Arm (Graviton) instance types. You will then be able to take advantage of reduced instance prices while also gaining better performance.
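
As a sketch of the scheduling side (assuming your image is already built for both architectures; the names are illustrative), a node affinity on the standard kubernetes.io/arch label lets a pod land on either x86 or Graviton nodes:

  apiVersion: v1
  kind: Pod
  metadata:
    name: multi-arch-app       # hypothetical name
  spec:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/arch
                  operator: In
                  values: ["amd64", "arm64"]   # schedulable on x86 and Arm nodes
    containers:
      - name: app
        image: registry.example.com/app:1.0    # must be a multi-architecture image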

Purchasing Options

Take advantage of cloud provider purchasing options. All three leading cloud providers (AWS, GCP, Azure) offer multiple purchasing strategies, such as:

  • On-Demand: Basic, list pricing
  • Commitment-Based: Savings Plans (SPs), Reserved Instances (RIs), and Committed Use Discounts (CUDs), which deliver discounts in exchange for a usage commitment
  • Spot: Spare cloud service provider (CSP) capacity (when it is available) that offers up to a 90% discount over On-Demand pricing

Define your purchasing strategy per node, and prioritize Spot instances when possible to leverage the steep discount this purchasing option provides. If Spot isn’t a fit for a workload (for example, when a container runs a database), cover that node’s steady usage with commitment-based pricing instead. In any case, strive to minimize On-Demand resources that aren’t covered by commitments.
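
As one sketch of encoding this preference, assuming Karpenter on AWS (the resource names are illustrative), a NodePool can allow both Spot and On-Demand capacity; Karpenter favors Spot when it is available:

  apiVersion: karpenter.sh/v1
  kind: NodePool
  metadata:
    name: spot-first           # hypothetical name
  spec:
    template:
      spec:
        requirements:
          - key: karpenter.sh/capacity-type
            operator: In
            values: ["spot", "on-demand"]   # Karpenter favors Spot when available
        nodeClassRef:
          group: karpenter.k8s.aws
          kind: EC2NodeClass
          name: default        # hypothetical, pre-existing EC2NodeClass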

Autoscaling Rules

Set up scaling rules using a combination of the Horizontal Pod Autoscaler (HPA), the Vertical Pod Autoscaler (VPA), and node autoscaling tools such as the Kubernetes Cluster Autoscaler or, on AWS, Karpenter, to meet changes in demand for applications.

Scaling rules can be set per metric, and you should regularly fine-tune these rules to ensure they fit your application’s real-life scaling needs and patterns.
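
For example, a minimal Horizontal Pod Autoscaler (the names and targets are illustrative) that scales a Deployment on CPU utilization looks like this:

  apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    name: web-hpa              # hypothetical name
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: web                # hypothetical Deployment
    minReplicas: 2
    maxReplicas: 10            # cap scale-out to bound cost
    metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 70   # add pods when average CPU exceeds 70%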

Kubernetes Scheduler (Kube-Scheduler) Configuration

Use scheduler rules wisely to achieve high utilization of node resources and avoid node overprovisioning. As described earlier, these rules impact how pods are deployed. 

In some cases, such as when affinity rules are set (e.g., a rule enforcing one pod per node), the number of nodes may scale up quickly.

Overprovisioning can also occur when you forget to specify the requested resources (CPU or memory) and instead only specify the limits. In that case, Kubernetes defaults each container’s requests to its limits, so the scheduler seeks nodes with enough free capacity to fit the full limits. Once the pod is deployed, it reserves resources up to the limit, node capacity becomes fully allocated quickly, and additional, unneeded nodes are spun up.
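
A minimal sketch of the pitfall (the values are illustrative): with only limits set, the requests default to the limits, so the scheduler reserves the full amounts below on a node.

  containers:
    - name: app
      image: registry.example.com/app:1.0   # hypothetical image
      resources:
        limits:            # no requests block, so requests default to these limits
          cpu: "2"         # scheduler reserves 2 full vCPUs for this container
          memory: 4Gi      # and 4 GiB of memory, even if actual usage is far lower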

Managing Unattached Persistent Storage

Persistent storage volumes have a lifecycle independent of your pods and will persist even after the pods and containers they were attached to cease to exist. Set up a mechanism to identify unattached persistent volumes (for example, unattached EBS volumes on AWS) and delete them after a specified period has elapsed.
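
One complementary guardrail, sketched here assuming the AWS EBS CSI driver (the StorageClass name is illustrative): a reclaimPolicy of Delete removes the backing volume automatically when its claim is deleted, so volumes don’t linger unattached.

  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: gp3-auto-delete      # hypothetical name
  provisioner: ebs.csi.aws.com # AWS EBS CSI driver
  reclaimPolicy: Delete        # delete the EBS volume when the PVC is deleted
  parameters:
    type: gp3

Delete is a deliberate trade-off: it prevents orphaned volumes, but it is unsuitable for data you must retain beyond a pod’s life, where Retain plus a scheduled cleanup is safer.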

Optimizing Network Usage to Minimize Data Transfer Charges

Design your network topology to account for the communication needs of pods across availability zones (AZs) and to avoid added data transfer fees. Data transfer charges can occur when pods communicate across AZs with each other, with the control plane, with load balancers, and with other services.

Another approach to minimizing data transfer costs is to deploy namespaces per availability zone (one per AZ), yielding a set of single-AZ namespace deployments. With such an architecture, pod communication remains within each availability zone, avoiding cross-AZ data transfer costs, while the setup as a whole maintains application resiliency with cross-AZ high availability.
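
A minimal sketch of one per-AZ slice of such a deployment (the namespace, zone, and names are illustrative), pinned to its zone with the standard topology label:

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: app-us-east-1a              # hypothetical: one Deployment per AZ
    namespace: prod-us-east-1a        # hypothetical per-AZ namespace
  spec:
    replicas: 3
    selector:
      matchLabels:
        app: myapp
        zone: us-east-1a
    template:
      metadata:
        labels:
          app: myapp
          zone: us-east-1a
      spec:
        nodeSelector:
          topology.kubernetes.io/zone: us-east-1a   # keep pod traffic in one AZ
        containers:
          - name: app
            image: registry.example.com/app:1.0     # hypothetical image

Repeating this per AZ, with a load balancer spanning the zones, keeps each pod’s chatter local while preserving the high-availability layer.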

Minimizing Cluster Counts

When running Kubernetes clusters on public cloud infrastructure such as AWS, Azure, or GCP, you should be aware that you are charged per cluster.

In AWS, for example, Amazon Elastic Kubernetes Service (EKS) charges $0.10 per hour, roughly $73 per month, for each cluster you run. Consider minimizing the number of discrete clusters in your deployment to reduce this additional cost.

Next Steps 

Now that you have a better understanding of Kubernetes cost optimization strategies, it’s time to implement best practices for maximizing your Kubernetes ROI. 

Optimize: Leverage intelligent recommendations to continuously optimize Kubernetes costs and usage

After enabling appropriate visibility across all your stakeholders, you and your FinOps team can finally take on the task of optimizing and reducing Kubernetes spending. With comprehensive K8s visibility, you can fine-tune Kubernetes resource allocation — allocating the exact amount of resources required per cluster, namespace/label, node, pod, and container. 

Operate: Formalize accountability and allocation for Kubernetes costs 

As a FinOps strategy leader, you must gain consensus and instill proper financial control structures for Kubernetes within your organization. FinOps strategies without accountability and alignment are doomed to failure. Financial governance controls further reduce the risk of overspending and improve predictability. This operating phase is where the rubber meets the road as far as what results you will gain from your Kubernetes FinOps efforts.

Learn more about these strategies to maximize K8s ROI here.

Anodot for Kubernetes Cost Optimization 

Anodot provides granular insights about your Kubernetes deployment that no other cloud optimization platform offers. Easily track your spending and usage across your clusters with detailed reports and dashboards. Anodot’s powerful algorithms and multi-dimensional filters enable you to deep dive into your performance and identify under-utilization at the node level. 

With Anodot’s continuous monitoring and deep visibility, engineers gain the power to eliminate unpredictable spending. Anodot automatically learns each service usage pattern and alerts relevant teams to irregular cloud spend and usage anomalies, providing the full context of what is happening for the fastest time to resolution.

Anodot seamlessly combines all of your cloud spend into a single platform so you can optimize your cloud cost and resource utilization across AWS, GCP, and Azure. Transform your FinOps, take control of cloud spend and reduce waste with Anodot’s cloud cost management solution. Getting started is easy! Book a demo to learn more. 

Written by Anodot

Anodot leads in Autonomous Business Monitoring, offering real-time incident detection and innovative cloud cost management solutions with a primary focus on partnerships and MSP collaboration. Our machine learning platform not only identifies business incidents promptly but also optimizes cloud resources, reducing waste. By reducing alert noise by up to 95 percent and slashing time to detection by as much as 80 percent, Anodot has helped customers recover millions in time and revenue.

Start Reducing Cloud Costs Today!

Connect with one of our cloud cost management specialists to learn how Anodot can help your organization control costs, optimize resources and reduce cloud waste.