Kubernetes FAQ and health check

Last year, we received several requests to evaluate Kubernetes clusters. Because Fairbanks supports all major Kubernetes distributions, the clusters we evaluated were based on upstream Kubernetes, OKD, OpenShift, and Rancher. The positive outcomes of these evaluations led us to offer a free Kubernetes cluster health check. During a health check, clients can meet our Kubernetes experts and discuss possible improvements or fine-tuning for their Kubernetes cluster.

To write this blog, we interviewed one of our Kubernetes experts to learn more about how Kubernetes clusters are evaluated. Note that the way a health check is carried out differs depending on the distribution and the versions in use.

Are you interested in conducting a Kubernetes health check on your Kubernetes cluster? You can request your free health check through the following short form: https://fairbanks.nl/kubernetes-health-check/  

However, if you want to read more about Kubernetes and our health check, feel free to keep on reading.  

What distribution should I choose and why? 

Fairbanks supports all major container distributions, and over the last couple of years we have learned a lot about why different companies choose different distributions. Which distribution you choose depends on your organization, the knowledge level of your teams, and the way the technology will be used within the company. If you choose open source Kubernetes, you have a very broad choice of components, in contrast to solutions like OpenShift and Rancher, where the distributor (Red Hat or SUSE) generally makes these choices for you.

When we evaluate a container infrastructure, it is important to keep in mind the differences between an open source solution and a packaged distribution. For example, when health checking an upstream Kubernetes cluster, we look at which tooling is used and whether the right tooling choices have been made. With Kubernetes you have more freedom of choice, but that can backfire when the wrong design decisions are made. On the other hand, when we conduct a health check on a container infrastructure such as OpenShift, OKD or Rancher, we tend to look at the implementation and configuration more than the tooling.

What version should I use? 

When it comes to upgrading, some say that a patch fixes one thing but breaks another. Therefore, we usually recommend the second-last version of OpenShift, OKD and Rancher. That version has usually been deployed and used widely already, so bugs that occur have most likely been fixed, whereas unexpected bugs in the newest version might not have been fixed yet.
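As a rough illustration of the "second-last version" rule, the sketch below picks the second-highest minor release line from a list of version strings. The function name and the version numbers are made up for this example and do not refer to real releases of any distribution.

```python
def second_latest_minor(versions):
    """Return the second-highest 'major.minor' line found in a list of version strings."""
    # Reduce every version to its (major, minor) pair and drop duplicates.
    minors = sorted({tuple(int(part) for part in v.split(".")[:2]) for v in versions})
    if len(minors) < 2:
        raise ValueError("need at least two distinct minor releases")
    major, minor = minors[-2]
    return f"{major}.{minor}"


# Hypothetical list of available releases, for illustration only.
available = ["1.4.2", "1.5.0", "1.5.3", "1.6.1"]
print(second_latest_minor(available))
```

In practice the list of available releases would come from your distribution's release channel, and the result is a starting point for the component review described below, not a replacement for it.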

For example, for OpenShift the currently recommended version (April 2023) is 4.10. Furthermore, before upgrading to a new version, we advise always reviewing the various components of that version. This goes for Kubernetes as well as for other distributions.

How do we execute a Kubernetes health check on a Kubernetes cluster?  

Currently, Fairbanks is developing standardized tooling to run an automated health check on Kubernetes clusters. This tooling is comparable to the 42on Ceph-Collect tooling, which we use to get diagnostic and infrastructural data from a Ceph cluster. With this tooling we can get a basic idea of possible improvements, tooling and/or capacity management. For more information about the Ceph-Collect tooling, you can read the following blog: https://www.42on.com/how-we-use-ceph-collect-to-work-with-you/

As our Kubernetes health check tooling is not finished yet, the health check is currently a manual process in which we look at things like the overall infrastructure configuration and whether there are error notifications. Even though a Kubernetes cluster can continue running with errors, these are usually the first thing we look at during a health check, to make sure they will not cause further issues in the infrastructure. In addition, we look at the individual nodes that have been configured: whether the nodes are in a good state, and whether they are full and, if so, why. We also look at whether the workloads can be managed, which microservices are present, and what their relationship to or impact on the Kubernetes infrastructure is. Besides the aforementioned, we look at your capacity management. This includes checking how much CPU and memory is used.
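To give an idea of what the node part of such a check could look like, here is a minimal sketch that summarizes node conditions from the JSON that `kubectl get nodes -o json` returns. This is not our health check tooling. The function name and the sample data are hypothetical, and only the standard node condition types (`Ready`, `MemoryPressure`, `DiskPressure`, `PIDPressure`) are considered.

```python
import json

# Condition types that indicate trouble when their status is "True".
PRESSURE_CONDITIONS = {"MemoryPressure", "DiskPressure", "PIDPressure"}


def summarize_nodes(node_list):
    """Map each node name to the list of problems found in its status conditions."""
    report = {}
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        problems = []
        for cond in node.get("status", {}).get("conditions", []):
            ctype, status = cond.get("type"), cond.get("status")
            if ctype == "Ready" and status != "True":
                # "False" and "Unknown" both mean the node is not usable.
                problems.append("NotReady")
            elif ctype in PRESSURE_CONDITIONS and status == "True":
                problems.append(ctype)
        report[name] = problems
    return report


# Hypothetical sample in the shape of `kubectl get nodes -o json` output.
sample = json.loads("""
{"items": [
  {"metadata": {"name": "worker-1"},
   "status": {"conditions": [{"type": "Ready", "status": "True"},
                             {"type": "DiskPressure", "status": "False"}]}},
  {"metadata": {"name": "worker-2"},
   "status": {"conditions": [{"type": "Ready", "status": "Unknown"},
                             {"type": "DiskPressure", "status": "True"}]}}
]}
""")

for name, problems in summarize_nodes(sample).items():
    print(name, "OK" if not problems else ", ".join(problems))
```

A node with an empty problem list is healthy as far as these conditions go. Why a node is full still needs a closer look at its workloads and resource requests.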

For example, the following alert message includes one of the commands we run during the health check:

Deprecated API that will be removed in the next EUS version is being used. Removing the workload that is using the discovery.k8s.io.v1beta1/endpointslices API might be necessary for a successful upgrade to the next EUS cluster version. Refer to `kubectl get apirequestcounts endpointslices.v1beta1.discovery.k8s.io -o yaml` to identify the workload. 
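Once you have the output of that command, the next step is finding out who is calling the deprecated API. The sketch below adds up request counts per client over the last 24 hours. It assumes the resource has been fetched as JSON (`-o json` instead of `-o yaml`) and loaded into a dictionary, and that the status uses the field names `last24h`, `byNode`, `byUser`, `username`, `userAgent` and `requestCount`. Please verify these against the output of your own cluster. The function name and the sample data are hypothetical.

```python
from collections import Counter


def clients_using_api(api_request_count):
    """Sum request counts per (username, userAgent) across the last 24 hours."""
    totals = Counter()
    status = api_request_count.get("status", {})
    for hour in status.get("last24h", []):
        # Hours without any requests may be empty objects.
        for node in hour.get("byNode", []):
            for user in node.get("byUser", []):
                key = (user.get("username", ""), user.get("userAgent", ""))
                totals[key] += user.get("requestCount", 0)
    # Most active clients first.
    return totals.most_common()


# Hypothetical sample, reduced to the fields this sketch reads.
sample = {
    "status": {
        "last24h": [
            {"byNode": [{"byUser": [
                {"username": "system:serviceaccount:demo:app",
                 "userAgent": "demo-client/v1", "requestCount": 5}]}]},
            {"byNode": [{"byUser": [
                {"username": "system:serviceaccount:demo:app",
                 "userAgent": "demo-client/v1", "requestCount": 3},
                {"username": "admin",
                 "userAgent": "kubectl/v1.25", "requestCount": 1}]}]},
            {},
        ]
    }
}

for (username, user_agent), count in clients_using_api(sample):
    print(f"{count:>5}  {username}  {user_agent}")
```

Service account names usually point to the namespace of the workload, which tells you where to look before the upgrade.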

What are some common Kubernetes cluster mistakes?  

The most common mistake that we see, and one that impacts a Kubernetes cluster, is poor configuration. Sometimes architectural decisions are not ideal for the use case, which means that Kubernetes runs less efficiently than it should.

That said, we have also seen Kubernetes used for the wrong reasons. For instance, going from virtual machines to containers is not always the best option. You should only move to containers if it will benefit you or your company, not just because it is the next big thing out there.

Tips for your Kubernetes cluster

Occasionally, when we conduct a Kubernetes health check, the company in question requests a general consultation with tips about what architecture to use, how to use the infrastructure, or how to further improve a Kubernetes cluster.

When it comes to general tips for your Kubernetes cluster, we tend to say: ‘Just get started, build it, break it and fix it’. Another tip from our Kubernetes expert is to visit www.cncf.io. There you can learn a lot about which products are in development, which products are compatible with Kubernetes (open source and packaged distributions) and much more.
