Kubernetes rightsizing is the process you use to ensure that your Kubernetes cluster has the right amount of resources to run your workloads efficiently. K8s rightsizing includes CPU, memory, and storage, and it’s important to get right. It can be expensive to run Kubernetes, so you need to make sure you are not over-provisioning resources (and wasting money). On the flip side, if your K8s cluster doesn’t have enough resources, your workloads will suffer, leading to longer response times and more errors. It could even result in outages and downtime for your apps and services, which won’t make your end users very happy.
Gartner predicts that, by 2027, more than 70% of enterprises will use industry cloud platforms (ICPs) to accelerate their business initiatives (we typically refer to them as Internal Developer Platforms). In 2023, it was less than 15%, so adoption of cloud native technologies, including containers and Kubernetes, will increase very rapidly over the next few years. The time to understand Kubernetes rightsizing is now, before you’re already over-provisioning and overspending.
To do Kubernetes rightsizing right, you need to understand a few key things:
To truly get a good understanding of your workloads, you need the right data. Ideally, collect metrics over an appropriate amount of time in order to make informed decisions. Here’s some of the data you’ll need:
You can collect this information using a variety of tools; here are a few that can help you:
Cloud spend is a significant cost for many organizations, so it makes sense to ensure that resources are allocated as efficiently as possible. Still, it can be hard to get all the data needed to make informed decisions, and harder still to get this level of visibility into your Kubernetes environment if you have multiple teams, multiple clusters, and multiple clouds at play. That’s why we created Goldilocks. It provides a dashboard that uses the Kubernetes Vertical Pod Autoscaler (VPA) in recommendation mode to suggest resource requests for each of your apps: it creates a VPA for each workload in a namespace and then queries those VPAs for their recommendations.
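To make "recommendation mode" concrete, here is a minimal Python sketch of the kind of VPA object involved. It is an illustration only, not Goldilocks' own code: the function name, the `-vpa` naming suffix, and the `checkout` workload in the `shop` namespace are all hypothetical. The one setting that matters is `updateMode: "Off"`, which tells the VPA to compute recommendations without changing running pods.

```python
import json


def recommendation_only_vpa(workload_name, namespace, kind="Deployment"):
    """Build a VerticalPodAutoscaler manifest that only produces recommendations.

    The "<name>-vpa" object name is a hypothetical convention for this sketch.
    """
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": f"{workload_name}-vpa", "namespace": namespace},
        "spec": {
            # The workload whose pods the VPA should observe.
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": kind,
                "name": workload_name,
            },
            # "Off": recommendations are calculated, but pods are never
            # evicted or resized automatically.
            "updatePolicy": {"updateMode": "Off"},
        },
    }


if __name__ == "__main__":
    # Hypothetical workload and namespace, printed as JSON for inspection.
    print(json.dumps(recommendation_only_vpa("checkout", "shop"), indent=2))
```

Once an object like this exists and the VPA components are running in the cluster, the recommendations appear in the object's status, where a dashboard can read them.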
Resource requests and limits allow you to specify how much CPU and memory each container is guaranteed (the request, which the scheduler uses to place pods) and the maximum it is allowed to consume (the limit). Setting both helps you prevent over-provisioning and ensures that your workloads have the resources they need to perform effectively. You should also use monitoring tools on an ongoing basis to track resource usage over time so you can identify workloads that are over- or under-provisioned. Combine this with cluster autoscaling to automatically scale your Kubernetes cluster up or down based on demand, so you only use resources when you need them.
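As a rough illustration of turning observed usage into requests and limits, the sketch below derives a container `resources` block from a list of usage samples. The policy is an assumption chosen for the example, not a Kubernetes or Goldilocks rule: the request is the 90th percentile of the samples, and the limit is the observed peak plus 20% headroom. All function and parameter names are hypothetical.

```python
def ceil_div(a, b):
    """Integer division that rounds up (avoids floating-point surprises)."""
    return -(-a // b)


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of integers."""
    ordered = sorted(samples)
    rank = max(1, ceil_div(pct * len(ordered), 100))
    return ordered[rank - 1]


def suggest_resources(cpu_millicores, memory_mib, request_pct=90, headroom_pct=20):
    """Suggest a container `resources` block from usage samples.

    Assumed policy (illustrative only): request = percentile of usage,
    limit = observed peak plus a percentage of headroom.
    """
    def limit(samples):
        return ceil_div(max(samples) * (100 + headroom_pct), 100)

    return {
        "requests": {
            "cpu": f"{percentile(cpu_millicores, request_pct)}m",
            "memory": f"{percentile(memory_mib, request_pct)}Mi",
        },
        "limits": {
            "cpu": f"{limit(cpu_millicores)}m",
            "memory": f"{limit(memory_mib)}Mi",
        },
    }


if __name__ == "__main__":
    # Hypothetical samples: CPU in millicores, memory in MiB.
    cpu = [100, 110, 120, 130, 140, 150, 160, 170, 180, 300]
    mem = [200, 210, 220, 230, 240, 250, 260, 270, 280, 320]
    print(suggest_resources(cpu, mem))
```

The percentile and headroom values are the knobs to tune per workload. A latency-sensitive service may want a higher percentile for its request, while a batch job can often tolerate a lower one.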
One thing to note: Goldilocks is generally a good starting point for setting your resource requests and limits, but every environment is different, and you will still have to fine-tune your applications for your individual use cases. Here’s how to get started:
Because rightsizing is so important to your Kubernetes infrastructure, it’s considered a best practice to configure resource requests and limits for all containers. Polaris is an open source policy engine that helps you validate and remediate Kubernetes deployments to ensure that configuration best practices are being followed. Without that kind of guardrail, it’s too easy for someone to deploy an app that doesn’t meet best practices for reliability, cost efficiency, and security.
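The kind of check a policy engine applies here can be sketched in a few lines of Python. This is not Polaris' implementation or its configuration format. It is a hypothetical, simplified validator that walks a Deployment-style manifest (as a dictionary) and reports every container that lacks a CPU or memory request or limit.

```python
# The four settings this sketch treats as required for every container.
REQUIRED = [
    ("requests", "cpu"),
    ("requests", "memory"),
    ("limits", "cpu"),
    ("limits", "memory"),
]


def missing_resource_settings(workload):
    """Return a message for each required resource setting a container lacks."""
    pod_spec = workload.get("spec", {}).get("template", {}).get("spec", {})
    findings = []
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        resources = container.get("resources") or {}
        for section, resource in REQUIRED:
            if resource not in (resources.get(section) or {}):
                findings.append(f"{name}: missing resources.{section}.{resource}")
    return findings


if __name__ == "__main__":
    # Hypothetical manifest: "web" is fully configured, "worker" is not.
    deployment = {
        "kind": "Deployment",
        "spec": {"template": {"spec": {"containers": [
            {"name": "web", "resources": {
                "requests": {"cpu": "250m", "memory": "256Mi"},
                "limits": {"cpu": "500m", "memory": "512Mi"}}},
            {"name": "worker", "resources": {"requests": {"cpu": "100m"}}},
        ]}}},
    }
    for finding in missing_resource_settings(deployment):
        print(finding)
```

Running a check like this in CI or at admission time turns the best practice into something enforced rather than merely documented.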
Automated policy management solutions can analyze your workload and cluster data to identify workloads that are over- or under-provisioned. Automating the process of identifying rightsizing opportunities can help you to save time and resources. When workloads are over-provisioned, Kubernetes may scale up more than needed, while under-provisioned workloads will run out of memory or become CPU constrained (resulting in performance degradation, increased errors, reduced throughput, or outages). You can also monitor your resource usage over time and adjust your rightsizing policies as needed. This can help you to ensure that your Kubernetes workloads continue to be rightsized, even as workloads and clusters change.
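A simplified version of that analysis compares observed usage with what each workload requests. In the Python sketch below, the utilization thresholds (below 50% counts as over-provisioned, above 90% as under-provisioned) are assumptions picked for illustration, and the workload names and numbers are made up. A real tool would look at more signals than a single ratio.

```python
def classify(requested, used, low=0.5, high=0.9):
    """Classify a workload by the ratio of observed usage to its request.

    `requested` and `used` must be in the same unit (for example millicores).
    The `low` and `high` thresholds are illustrative assumptions.
    """
    if requested <= 0:
        raise ValueError("requested must be positive")
    utilization = used / requested
    if utilization < low:
        return "over-provisioned"
    if utilization > high:
        return "under-provisioned"
    return "rightsized"


if __name__ == "__main__":
    # Hypothetical workloads: (CPU requested, high-percentile CPU used), in millicores.
    workloads = {
        "checkout": (1000, 200),
        "search": (500, 490),
        "catalog": (400, 280),
    }
    for name, (requested, used) in workloads.items():
        print(f"{name}: {classify(requested, used)}")
```

Re-running a report like this on a schedule is one way to keep rightsizing current as traffic patterns and workloads change.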
Kubernetes compute costs can be significant, especially for organizations deploying multiple clusters to production. Rightsizing can help to reduce these costs by ensuring that each workload has the resources it needs, but not more. Rightsizing can also improve performance by ensuring that each workload has the resources it needs to run efficiently, keeping your apps and services humming along.