Exploring Goldilocks: ‘Just Right’ Resource Management

Managing resource requests and limits in Kubernetes can be challenging, especially for teams that are new to container orchestration or scaling complex workloads. But without proper configuration, your cluster can become unstable, experience resource contention (we call that the noisy neighbor effect), or drive up cloud costs unnecessarily. This is why we created Goldilocks, an open-source tool that helps you get your resource requests and limits just right.

Recently, I joined Whitney Lee, a CNCF Ambassador who enjoys understanding and using tools in the cloud native landscape, on her lightboard streaming show ⚡️ Enlightning on YouTube for a really fun discussion about how Goldilocks helps teams set resource requests and limits “just right” for their workloads. It was such a great conversation because Whitney really dug into how Goldilocks was created, what it does, how it works, and the benefits it brings to Kubernetes environments, so I thought I’d write up some of the highlights. You should absolutely go watch the video too!

The Wild West of Resource Requests and Limits

Before we introduced Goldilocks, sometimes it seemed like setting resource requests and limits in Kubernetes was a little bit like shooting in the dark. Many teams would either:

Not set any resource requests or limits at all, leading to unpredictable performance as containers competed for resources on the node, causing application instability and inefficiencies.
Set arbitrary limits based on examples they found online, which were rarely (well, probably never) tailored to their actual workloads, resulting in under-utilized or over-provisioned resources.

These approaches led to a range of problems, from wasted resources and increased cloud costs to unpredictable performance during peak loads. They also gave teams no repeatable way to understand what their workloads actually need, so you have to start from scratch every time. For companies running critical workloads, inappropriately set requests and limits could result in a significant impact on both user experience and operational costs.

"You could have a container in which the application has a memory leak or maybe one container or one pod is getting a lot more traffic than other pods. So it tries to consume an increasing amount of resources on the node. And then other pods don't have access to what they need. And then you might have your pods getting evicted or your containers being killed."

Visualizing Resource Recommendations

We developed Goldilocks as a response to these challenges that we saw with our Managed Kubernetes-as-a-Service clients. The tool is a visualizer for Kubernetes resource requests and limits that helps teams see what the optimal configuration should look like. Goldilocks is strictly giving you a starting point for where to set your requests and limits for CPU and memory. It provides recommendations based on real data, enabling teams to set more accurate and effective resource requests and limits.

How Does Goldilocks Work?

Goldilocks is deployed as a Helm chart and runs as a controller rather than bringing its own custom resource definitions (CRDs). It works by leveraging another open-source Kubernetes tool, the Vertical Pod Autoscaler (VPA). VPA isn’t built into Kubernetes; you have to install it. VPA installs its own set of CRDs, and it essentially has three parts:

The Recommender: takes metrics from an endpoint. That is metrics-server by default, or you can set up Prometheus and get metrics from there. It uses those metrics to recommend what your pod’s requests and limits should be set to. That recommendation is then picked up by the Updater.
The Updater: responsible for evicting pods that don’t fall within the range that the Recommender said they should. Essentially, if the Recommender determines that a pod should have a certain amount of CPU, and the pod’s specification doesn’t match that, then the pod will be evicted to update its resources. In order to change a pod, you have to delete it and redeploy it, and the Updater is responsible for that. The controller for that pod, the StatefulSet or the Deployment, will find that there are fewer replicas than it’s supposed to have and it will try to create a new pod, which is where the Admission Controller comes into play. There’s a flag called update mode, which you can set to Auto, allowing it to automatically evict pods. There’s also Initial, where it only sets resources when the pod is first created. And there’s Off, which does not affect the pods at all.
The Admission Controller: if update mode is set to Auto, when the new pod is created, the Admission Controller will mutate that pod and set the requests and limits according to what the Recommender said. VPA in Auto update mode can also conflict with the Horizontal Pod Autoscaler (HPA), which will oftentimes scale on the same metrics as VPA. Enabling VPA to make its own changes on the fly makes a lot of people nervous, and it makes it difficult to manage your desired state with GitOps.
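To make the update mode flag concrete, here is a minimal Python sketch that builds a VerticalPodAutoscaler object as a plain dictionary, defaulting to Off so that the VPA only recommends and never evicts. The function name build_vpa, the workload name web, and the namespace demo are hypothetical, and the autoscaling.k8s.io/v1 API version is an assumption about the VPA release installed in your cluster.

```python
import json

# Update modes named in the text above; the VPA API may accept others.
UPDATE_MODES = {"Off", "Initial", "Auto"}


def build_vpa(name, namespace, target_kind="Deployment", update_mode="Off"):
    """Return a VerticalPodAutoscaler manifest as a plain dict.

    Hypothetical helper for illustration; it is not part of VPA or Goldilocks.
    """
    if update_mode not in UPDATE_MODES:
        raise ValueError(f"unsupported update mode: {update_mode!r}")
    return {
        "apiVersion": "autoscaling.k8s.io/v1",  # assumed API version
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The workload whose pods the Recommender should observe.
            "targetRef": {
                "apiVersion": "apps/v1",
                "kind": target_kind,
                "name": name,
            },
            # "Off" means recommendations only: no evictions, no mutation.
            "updatePolicy": {"updateMode": update_mode},
        },
    }


if __name__ == "__main__":
    # Print the manifest as JSON, which kubectl also accepts as input.
    print(json.dumps(build_vpa("web", "demo"), indent=2))
```

Keeping Off as the default mirrors the cautious approach described above: the recommendation is produced, but the decision to change the workload stays with you and your Git history.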

Goldilocks uses the information from the VPA to display a recommended starting point (a request for memory and CPU) and a recommended ceiling (a limit for memory and CPU). It also separates those recommendations into two quality of service (QoS) classes for Kubernetes: Burstable (which allows pods to flex their resource usage), and Guaranteed (which means a pod is less likely to be evicted under resource pressure). Goldilocks, by default, only uses the Recommender component, providing non-disruptive recommendations without altering the state of the cluster. This makes it a safe starting point for most teams who want to explore and understand their resource usage before implementing changes.
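As a rough illustration of how one VPA recommendation could be turned into the two QoS suggestions, the Python sketch below assumes that the Guaranteed suggestion uses the VPA target for both requests and limits, and that the Burstable suggestion uses the lower bound as the request and the upper bound as the limit. That mapping is an assumption and should be checked against the Goldilocks documentation. The function name qos_recommendations and the sample numbers are made up for the example.

```python
def qos_recommendations(container_rec):
    """Derive per-QoS-class suggestions from one VPA container recommendation.

    Assumed mapping (verify against the Goldilocks docs):
      Guaranteed: requests == limits == target
      Burstable:  requests == lowerBound, limits == upperBound
    """
    target = container_rec["target"]
    lower = container_rec["lowerBound"]
    upper = container_rec["upperBound"]
    return {
        "Guaranteed": {"requests": dict(target), "limits": dict(target)},
        "Burstable": {"requests": dict(lower), "limits": dict(upper)},
    }


# Made-up recommendation, shaped like one entry of a VPA object's
# status.recommendation.containerRecommendations list.
sample = {
    "containerName": "web",
    "lowerBound": {"cpu": "25m", "memory": "128Mi"},
    "target": {"cpu": "50m", "memory": "256Mi"},
    "upperBound": {"cpu": "200m", "memory": "512Mi"},
}

if __name__ == "__main__":
    for qos_class, resources in qos_recommendations(sample).items():
        print(qos_class, resources)
```

Setting requests equal to limits is what places a pod in the Guaranteed class in Kubernetes, while a request below the limit leaves room to burst.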

Benefits of Using Goldilocks

Improved Resource Allocation: By setting accurate resource requests and limits, your Kubernetes workloads can perform better under varying loads, reducing the risk of resource contention or under-utilization.
Cost Savings: With Goldilocks, you can identify and eliminate inefficiencies in resource allocation, such as over-provisioned memory or CPU, leading to more cost-effective cloud spend.
Enhanced Stability: Properly set resource requests and limits result in more stable workloads, reducing the likelihood of unexpected pod evictions or node-level resource exhaustion.
Data-Driven Decisions: Goldilocks relies on actual usage metrics to generate its recommendations, so you’re not making arbitrary decisions based on guesswork or outdated or irrelevant examples.

Getting Started with Goldilocks

It’s simple to get started with Goldilocks. You can deploy it as a Helm chart in your Kubernetes cluster. Once installed, Goldilocks will start analyzing your workloads and provide you with recommendations for setting your resource requests and limits in a simple dashboard.

To use Goldilocks, you’ll want to:

Install Goldilocks by deploying its Helm chart in your Kubernetes cluster.
Enable the Recommender so Goldilocks can begin gathering data.
Use the Goldilocks dashboard to visualize and review the suggested requests and limits for each workload.
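The steps above can be sketched as a short list of commands. The Python snippet below only builds and prints them as a dry run; it does not execute anything. The chart repository URL, the goldilocks.fairwinds.com/enabled namespace label, and the goldilocks-dashboard service name are assumptions based on the project's published install instructions, so verify them against the Goldilocks GitHub repo before running. The function name and the demo namespace are hypothetical.

```python
import shlex

# Assumed values; check the Goldilocks README for the current ones.
CHART_REPO_URL = "https://charts.fairwinds.com/stable"
ENABLE_LABEL = "goldilocks.fairwinds.com/enabled=true"
INSTALL_NAMESPACE = "goldilocks"


def goldilocks_setup_commands(workload_namespace):
    """Return the setup commands as argv lists, without running them.

    Assumes VPA and a metrics source are already installed in the cluster.
    """
    return [
        # 1. Add the chart repository and install the chart.
        ["helm", "repo", "add", "fairwinds-stable", CHART_REPO_URL],
        ["helm", "install", "goldilocks", "fairwinds-stable/goldilocks",
         "--namespace", INSTALL_NAMESPACE, "--create-namespace"],
        # 2. Opt a namespace in so recommendations are gathered for it.
        ["kubectl", "label", "namespace", workload_namespace, ENABLE_LABEL],
        # 3. Forward the dashboard to localhost to review recommendations.
        ["kubectl", "--namespace", INSTALL_NAMESPACE, "port-forward",
         "svc/goldilocks-dashboard", "8080:80"],
    ]


if __name__ == "__main__":
    for argv in goldilocks_setup_commands("demo"):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review them, or to paste them into a terminal one at a time, before anything touches the cluster.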

Goldilocks is a useful open source tool for Kubernetes resource management. It simplifies the process of setting resource requests and limits, making it easier for teams to achieve the “just right” balance. With its intuitive visualizer and data-driven recommendations, Goldilocks takes the guesswork out of resource allocation, leading to more efficient, stable, and cost-effective Kubernetes workloads.

Whether you’re a small team just starting out or an enterprise managing complex cloud environments, Goldilocks is a good addition to your Kubernetes toolkit. Try it out and see how it can help you optimize your resources and improve the performance of your applications.

Want to Learn More?

Check out the full Enlightning discussion with Whitney Lee for more fun and more in-depth insights into Goldilocks. You can also explore the Goldilocks GitHub repo to get started!