As we’ve discussed in this series, there are some things that you should simply never, ever do in Kubernetes. Corey Quinn, founder of Screaming in the Cloud and Chief Cloud Economist at The Duckbill Group, Kendall Miller, President of Fairwinds, and Stevie Caldwell, Technical Lead (CRE Team) at Fairwinds, had a great conversation about some of the basic mistakes they’ve seen, as well as some of the most common Kubernetes security, reliability, and efficiency problems they’ve encountered over the last several years helping customers deploy Kubernetes. During the webinar we received a number of questions from attendees, which we’re happy to share with you. Let’s get to your top four FAQs about what development and operations teams should never, ever do in Kubernetes if they want to get the most out of the leading container orchestrator.
Can we combine our worker and control plane nodes to save money?

The answer is a hard no. It’s better to scale down the size of the control plane nodes and call it good. This goes back to the “don’t run workloads on your control plane” principle; there’s a reason the best practice is to keep them separate. These setups are usually spurred on by cost concerns, but there are better and safer ways to run a cluster and control your costs than trying to combine all of your nodes into one. You’re already taking a middle ground by having your etcd cluster co-located with the control plane (some Kubernetes professionals say that you should run etcd externally from the control plane). If you decrease the size of your control plane nodes, you can right-size resources for the worker nodes so that you’re not using too much, and you should make use of autoscaling to scale down your cluster. There are all sorts of things you can do to be economical about running Kubernetes that aren’t unsafe, but don’t run combined worker and control plane nodes.
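For reference, the standard mechanism that keeps ordinary workloads off the control plane is a node taint. Here’s a minimal sketch, assuming Kubernetes 1.24 or later (older releases used the node-role.kubernetes.io/master taint instead); kubeadm applies this taint to control plane nodes by default:

```bash
# Taint the control plane node so regular Pods never schedule onto it.
# <control-plane-node> is a placeholder for your node's name.
kubectl taint nodes <control-plane-node> \
  node-role.kubernetes.io/control-plane:NoSchedule

# Verify: only system components that explicitly tolerate the taint
# (e.g. CNI DaemonSets) should remain scheduled there.
kubectl describe node <control-plane-node> | grep -A2 Taints
```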
Is it a bad idea to run your database in Kubernetes?

The data store should be behind an API, and what powers that data store is really an implementation question. If you’re going to run your database on Kubernetes, you need to have a plan for volumes, persistence, backups, restores, and so on. While the idea isn’t actively ridiculous, you do need to architect for it: consider what happens if the container your database is running in goes away and doesn’t get replaced for a little while.
If there are managed solutions offered within the environment you’re running in, it’s worth looking at those. For example, if you’re running your workload on AWS, using its data store offerings, such as Aurora or RDS, is a lot easier than rolling your own. Running your own adds a lot of complexity, so why do it if you don’t have to? One of the premises Kubernetes was built upon is the idea that all the data is ephemeral. That’s why things move around as freely as they do, and why it doesn’t matter if you lose a node or two. But once you get into persistent data, as you do with databases, you have to plan accordingly:
When you run a database in Kube, it becomes very complex, so you need to think about what you’re going to gain by doing so. It might increase flexibility for some things, but for others it doesn’t make sense. Consider your use case carefully, because running your database in Kubernetes will add a lot of complexity.
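If you do decide to run a database in-cluster, the usual starting point is a StatefulSet with volumeClaimTemplates rather than a bare Deployment, so each pod keeps its data across rescheduling. The sketch below is a minimal illustration, not a production setup; the postgres name, image, and storage size are placeholders, and a real deployment would still need backups, health probes, and replication:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres            # placeholder name
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:      # gives each replica its own PersistentVolumeClaim
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi    # placeholder size
```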
How do you track Kubernetes costs? For example, if you have one namespace for each client and want to calculate the monthly cost of each namespace, how do you do that?

We have a tool, Fairwinds Insights, that reports relative daily cost and breaks it down by service. It also tells you which namespace those workloads are in and helps you optimize your resource requests, identifying what you’re requesting versus what you could be requesting. Fairwinds Insights also provides data that shows how much you’re going to save over time.
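For a sense of where numbers like that come from: cost allocation of this kind typically starts from each pod’s resource requests. The sketch below is a rough illustration (not how Fairwinds Insights is implemented) using the official kubernetes Python client; it sums CPU and memory requests per namespace and multiplies by per-unit prices, and the rates shown are placeholders, not real cloud prices:

```python
# Rough cost-per-namespace sketch; unit prices are placeholders, not real rates.
from collections import defaultdict
from kubernetes import client, config

CPU_PRICE_PER_CORE_HOUR = 0.03   # hypothetical rate
MEM_PRICE_PER_GIB_HOUR = 0.004   # hypothetical rate
HOURS_PER_MONTH = 730

def parse_cpu(v: str) -> float:
    """Convert a CPU request like '250m' or '2' to cores."""
    return float(v[:-1]) / 1000 if v.endswith("m") else float(v)

def parse_mem_gib(v: str) -> float:
    """Convert a memory request like '512Mi' or '2Gi' to GiB (simplified)."""
    units = {"Ki": 1 / (1024 ** 2), "Mi": 1 / 1024, "Gi": 1.0, "Ti": 1024.0}
    for suffix, factor in units.items():
        if v.endswith(suffix):
            return float(v[: -len(suffix)]) * factor
    return float(v) / (1024 ** 3)  # bare bytes

config.load_kube_config()
costs = defaultdict(float)
for pod in client.CoreV1Api().list_pod_for_all_namespaces().items:
    for c in pod.spec.containers:
        requests = (c.resources.requests if c.resources and c.resources.requests else {})
        cpu = parse_cpu(requests.get("cpu", "0"))
        mem = parse_mem_gib(requests.get("memory", "0"))
        costs[pod.metadata.namespace] += (
            cpu * CPU_PRICE_PER_CORE_HOUR + mem * MEM_PRICE_PER_GIB_HOUR
        ) * HOURS_PER_MONTH

for ns, monthly in sorted(costs.items(), key=lambda kv: -kv[1]):
    print(f"{ns}: ~${monthly:,.2f}/month requested")
```

A real cost tool also has to account for actual usage versus requests, node pricing differences (spot, reserved, on-demand), and the shared and data transfer costs discussed below, which is why dedicated products exist.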
There are other tools out there that do something similar. CloudZero, for example, is a cloud cost intelligence platform. Kubecost focuses specifically on Kubernetes costs. CloudHealth, acquired by VMware, provides capabilities for optimizing cost and usage across cloud environments, and Cloudability from Apptio helps teams optimize cloud resources for speed, cost, and quality.
People who care about the cost of Kubernetes workloads typically have two questions that are hard to find good answers for. One is tracking shared resources: for things that have to exist regardless of workload, how do you allocate and attribute their cost? The other is data transfer, where there is no great tooling option available today. For example, if you’re on AWS and you have two services talking to each other, that data transfer is either free or, if they’re in two availability zones for durability purposes, it costs two cents (2¢) per gigabyte. At petabyte scale that adds up fast: a petabyte is roughly a million gigabytes, so moving a single petabyte across availability zones costs on the order of $20,000.
Is there a self-hosted version of Fairwinds Insights?

The short answer is yes: we now have a self-hosted version of Fairwinds Insights. Fairwinds Insights ingests data from Polaris, as well as nearly a dozen other Kubernetes audit tools (such as Trivy, Goldilocks, and kube-bench), and puts all the results in a single pane of glass. Since launching our SaaS offering, we found that some Polaris users had concerns about shipping data off to a third party, especially enterprises in data-sensitive industries such as healthcare and finance. To address those concerns, we’ve worked hard for the past few months to build a version of Insights that can run entirely within the customer’s environment. We recently posted an article walking through the new self-hosted Fairwinds Insights, which explains how it works and some of the pros and cons of using an on-prem version. If you’re interested in Fairwinds Insights, check out the Fairwinds Insights demo.