It's not uncommon for me to hear a prospective client say something like this: "With only 15 people, how can you deliver a good, fast, and reliable DevOps-as-a-Service solution for our organization?"
The answer is threefold:
1) We hire senior engineers with tons of experience.
2) We bring a significant amount of automation to the table.
3) We have substantial experience running Kubernetes at scale for many companies.
I want to break down the three parts of this answer to provide a better sense of how our small team is consistently able to deliver big results.
As a small organization, we made a decision early on to hire only the very best people we could find. I know a lot of organizations say they do the same, but our results bear us out. We have made the hiring and retention of world-class engineers our top priority -- and our top differentiator. It's extremely hard for engineers to make it through our hiring pipeline. As a result, when we do hire someone, that person is a fantastic engineer.
In hiring engineers with strong Ops backgrounds, we get engineers with significant depth and breadth in two essential areas. First, when one of our engineers re-architects an infrastructure from the ground up, they understand the implications of the decisions they’re making. And not just for the short term. They’re able to recognize when decision "foo" or "bar" may feel insignificant in the short term but will likely be a huge pain point in the long term if it isn’t navigated in a strategic and specific way.
We don't want you to have to re-architect your infrastructure due to insufficient planning. Instead, we want to build it right today. The experienced engineers we hire make that possible.
Second, significant experience brings a unique depth of troubleshooting skills. All infrastructure engineers will at some point hit a wall when digging into a problem and have to ask lots of questions to find its root cause. The more senior the engineer, the greater the depth of their troubleshooting skills. They’ve seen many different kinds of problems, and therefore have many different questions to ask when diagnosing a new one. They’ll ask, "Well, if it isn't this one thing, then maybe it's that other thing I saw a few years back." This depth makes it much less likely that they hit a dead end.
Our people have seen infrastructure break in myriad ways, and their collective experience helps us build great infrastructure and quickly identify and resolve complicated ongoing maintenance problems.
When we started Fairwinds, we did straight consulting. Our clients told us what they wanted done, and we helped them achieve those goals as rapidly and effectively as possible. However, we quickly realized everyone's baseline needs were roughly the same. Everyone wants zero-downtime deployments, a fully automated deployment pipeline, autoscaling, monitoring, alerting, logging, and more. (The list is long, but it is similar for almost everyone.)
Some companies have addressed this challenge by building a PaaS, but we didn't want to build Yet Another PaaS (YAP?). Let’s say you decide to hire the best site reliability engineer you can find. That engineer would likely take a look at your existing setup as well as the best tools available and then build your infrastructure from the ground up. We do the same thing. We're going to build you something with best-in-class open source and SaaS tooling (from Kubernetes to CircleCI). The difference is that we've done it so many times that we've automated the process. And with that automation we can deliver speed and reliability.
We never have to start from scratch. Better yet, we’re constantly tweaking and improving the process. This automation includes the way we stand up a Kubernetes cluster with our in-house open source tool Pentagon, as well as how we interact with Kubernetes clusters and our CI pipeline with rok8s-scripts.
It's one thing to stand up a cluster and get an application running in it. It's another thing altogether to stand behind those clusters for the long term. Few companies out there offer the managed services around Kubernetes that we offer at Fairwinds. What does that mean for our clients? It means that we know what kinds of scaling and networking issues tend to break clusters. It means we know how to navigate the complexities of security. It means we can harden a cluster to be production ready.
Because we know Kubernetes inside and out, we know what kinds of things can go wrong -- and how to address those problems before they arise. In maintaining Kubernetes for each client, we learn invaluable lessons that benefit every one of our clients, because we regularly integrate these lessons into the tooling we use to stand up new clusters. Our solid, knowledge-backed process is constantly evolving and always improving.
For example, let’s say you have an internal team of 20 people. These 20 people may be able to work diligently on Kubernetes and build the same level of expertise and repeatable, reliable processes that we can. However, by doing this work for many different kinds of clients, Fairwinds has developed an unparalleled level of expertise around what it takes to build Kubernetes-based infrastructure and run it at scale in production.
You can build, manage, and troubleshoot your own platform. However, the truth is that we can build it faster, customize it to your specific needs, and maintain it better than just about anyone else out there. And your platform won’t have the bugs most in-house systems have.
At the outset, some clients aren’t sure how our small team can accomplish so much. However, after we work with a client, they get it. The usual response once their platform is up and running? "Wow, Fairwinds has only 15 people, and you delivered such a fantastic DevOps solution to us so fast!"
That’s the value of having senior engineers, significant automation, and experience running Kubernetes at scale for many companies -- you get lean, smart, proven DevOps-as-a-Service solutions. We may be small, but we deliver big.