February 23, 2023
Performance tuning of Kubernetes is becoming more critical as Kubernetes becomes more prevalent. Kubernetes is a highly extensible, open-source platform for orchestrating containerized workloads in a server environment.
It is also minimally opinionated by default and enables both declarative configuration and automation. This flexibility allows Kubernetes to cover many different use cases and has made it the dominant container orchestration engine.
As server costs can increase rapidly, and sometimes unexpectedly, in cloud systems, it is imperative to find ways to maximize your infrastructure utilization and reduce costs rather than blindly ‘scaling’ up in response to environmental demands. Kubernetes is a powerful orchestration tool, and with this power comes the responsibility to correctly configure the system to operate in your best interest. In this article, we give you ten Kubernetes performance tuning tips and best practices for setting up your environment to squeeze every bit of efficiency and performance out of your Kubernetes-managed system.
Optimized images are an easy win when it comes to performance tuning your Kubernetes cluster. Containerized apps that were built to run on virtual machines (VMs) include overhead that isn't necessary in a container environment. A container-optimized image greatly reduces your image size, which lets Kubernetes pull the image faster and run the resulting container more efficiently.
A container-optimized image should:
Now that you have a finely tuned image, it is time to help Kubernetes schedule it efficiently onto appropriate nodes. One of the best optimization tips is to be more specific with resource constraints: define requests and limits for CPU, memory, and other resources.
Say you decide to use a Go-based microservice as an email server. You might designate the resource profile below:
resources:
  requests:
    memory: 1Gi
    cpu: 250m
  limits:
    memory: 2.5Gi
    cpu: 750m
If the same application is implemented in another language, those limits would likely change. A Java-based app would probably perform best with twice the memory limits your Go app version used. Those little differences in what each app needs can build up to measurable performance deltas when scaled out over hundreds or thousands of nodes.
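To show where such a resource profile sits in practice, here is a minimal sketch of a complete Pod manifest carrying the Go email server values from above. The pod name, container name, and image are hypothetical placeholders, not taken from any real deployment.

```yaml
# Illustrative Pod manifest; names and image are assumptions for this sketch.
apiVersion: v1
kind: Pod
metadata:
  name: email-server
spec:
  containers:
  - name: email-server
    image: registry.example.com/email-server:1.0   # placeholder image
    resources:
      requests:          # what the scheduler reserves when picking a node
        memory: 1Gi
        cpu: 250m
      limits:            # ceiling enforced at runtime
        memory: 2.5Gi
        cpu: 750m
```

The scheduler places the pod based on the requests, while the limits cap what the container may consume once it is running: CPU above the limit is throttled, and memory above the limit can get the container killed.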
It is worth mentioning that what CPU and memory mean will depend on the options your service provider offers. For server environments, the CPU choice is typically between Intel Xeon and AMD processors. If you are concerned about high read/write throughput, nodes with DDR4 RAM, if available, might be a good choice. Most service providers will give you a (possibly overwhelming) set of options for choosing CPU-rich or memory-rich nodes to match your specific needs. When your Kubernetes resource profile matches your application's needs, the Kubernetes scheduler can place new pods on the most appropriate available node to maximize runtime performance.
As we’ve just mentioned, not all nodes have similar hardware capabilities. In most cases, the Kubernetes scheduler does a good job of selecting the right node by checking the node’s capacity for CPU and RAM and comparing it to the Pod’s resource requirements. Still, specifying pod placement can help with performance tuning, and Kubernetes Node Affinity and Pod Affinity allow you to define where your pods are deployed.
As an example, suppose you would like to pair CPU-intensive applications with your most performant infrastructure. You have two available node types: one labeled cpuType=32core that offers a high clock frequency and core count, the other labeled memoryType=ddr4 that offers the fastest, highest-capacity memory available. To ensure that the Kubernetes scheduler makes the desired pairing, use a nodeSelector with the appropriate label in the spec section:
…
nodeSelector:
  cpuType: 32core
Another option is to use the nodeAffinity of the affinity field in the spec section. Here you have two options:
requiredDuringSchedulingIgnoredDuringExecution: An absolute rule. The scheduler will deploy the pods only to the specified nodes.
preferredDuringSchedulingIgnoredDuringExecution: A preference rather than a rule. The scheduler will attempt deployment to the specified nodes if available or else it will schedule deployment on the next available node.
You can further refine label matching with operators such as In, NotIn, Exists, and DoesNotExist. However, even with the ability to craft intricate expressions, getting carried away can hurt scheduling performance, contrary to what we are trying to achieve here.
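The two nodeAffinity rule types described above can be sketched as follows, reusing the cpuType and memoryType node labels from the earlier example. The pod name, container name, image, and the weight of 50 are assumptions chosen for illustration.

```yaml
# Sketch of nodeAffinity: one hard rule and one soft preference.
apiVersion: v1
kind: Pod
metadata:
  name: cpu-intensive-app          # hypothetical name
spec:
  affinity:
    nodeAffinity:
      # Hard rule: only nodes labeled cpuType=32core are eligible.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: cpuType
            operator: In
            values:
            - 32core
      # Soft rule: among eligible nodes, favor those labeled memoryType=ddr4.
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50                 # 1-100; higher means a stronger preference
        preference:
          matchExpressions:
          - key: memoryType
            operator: In
            values:
            - ddr4
  containers:
  - name: app
    image: registry.example.com/cpu-intensive-app:1.0   # placeholder image
```

If no node satisfies the required rule, the pod stays pending. The preferred rule only influences ranking among the nodes that already qualify.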
As mentioned above, Kubernetes also lets you define Pod affinity rules in terms of the pods that are currently running. Quite simply, you can have specific pods run alongside other pods on the same node or in the same zone, which helps when they communicate with each other frequently.
The available fields under the podAffinity of the affinity field in the spec section are the same as the ones for nodeAffinity: requiredDuringSchedulingIgnoredDuringExecution and preferredDuringSchedulingIgnoredDuringExecution.
The main difference is that the matchExpressions are evaluated against pod labels rather than node labels: the pod is deployed to a node that already runs a pod with a matching label.
Kubernetes also provides a podAntiAffinity field with the opposite function: it will not schedule a pod onto a node that already runs specific pods.
As with nodeAffinity expressions, keep podAffinity rules simple and logical, with a handful of labels to match. You can easily hurt performance by making Kubernetes sort through too many labels.
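As a minimal sketch of both fields, the manifest below co-locates a pod with a cache it talks to frequently and spreads replicas of the same app across nodes. The app labels (cache, web-frontend), pod name, and image are hypothetical. The topologyKey values are standard well-known node labels.

```yaml
# Sketch of podAffinity and podAntiAffinity; labels and names are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: web-frontend
  labels:
    app: web-frontend
spec:
  affinity:
    podAffinity:
      # Hard rule: run in the same zone as a pod labeled app=cache.
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - cache
        topologyKey: topology.kubernetes.io/zone
    podAntiAffinity:
      # Soft rule: avoid nodes that already run another web-frontend pod.
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - web-frontend
          topologyKey: kubernetes.io/hostname
  containers:
  - name: web
    image: registry.example.com/web-frontend:1.0   # placeholder image
```

The topologyKey decides what "together" means: kubernetes.io/hostname scopes the rule to a single node, while topology.kubernetes.io/zone scopes it to a whole zone.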
In other cases, it is better to make sure that Kubernetes does not deploy certain containers to specific nodes. This is the role of “Taints”, which act as the opposite of the Affinity rules we just discussed. They give Kubernetes rules that prevent things from happening, such as not permitting a certain set of pods to be scheduled to certain zones or nodes. To apply a taint to a node, use the kubectl taint command. You specify the node name, then the taint's key and value, followed by a taint effect such as NoSchedule or NoExecute:
kubectl taint nodes backup1 backup1=backups-only:NoSchedule
You could later remove that taint:
kubectl taint nodes backup1 backup1=backups-only:NoSchedule-
Or you can provide an exception for certain pods by including a “toleration” in the PodSpec. This is useful when you have tainted a node so that nothing is scheduled on it, but now you want to schedule backup jobs there and nothing else. You could schedule backups on the tainted node by adding the following fields to the Pod spec:
spec:
  tolerations:
  - key: "backup1"
    operator: "Equal"
    value: "backups-only"
    effect: "NoSchedule"
Since this toleration matches the taint on the node, any pod with that spec can be scheduled onto the node backup1.
One final word of caution. While taints and tolerations do give operators very fine-grained control over performance, there is a cost in the effort required to initially configure them.
While defining where a pod should be deployed does determine what is deployed to a given node, there are times when the order in which pods are deployed matters. The Kubernetes PriorityClass gives you a way to define and enforce that order.
The first step is to create a PriorityClass, for example:
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "This priority class should be used for initial node function validation."
There is no limit on the number of priority classes you can create, but for the sake of clarity it is a best practice to create only the minimum needed (e.g. high, medium, low). A higher-priority pod is assigned a higher value. To use a class, add a priorityClassName under the Pod spec.
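A minimal sketch of a pod that references a priority class might look like this. It assumes a PriorityClass named high-priority already exists in the cluster (object names in Kubernetes must be lowercase, so a hyphenated name is used here). The pod name, container name, and image are hypothetical.

```yaml
# Sketch: a Pod that opts into an existing PriorityClass.
apiVersion: v1
kind: Pod
metadata:
  name: node-validation            # hypothetical name
spec:
  priorityClassName: high-priority # assumed to match an existing PriorityClass
  containers:
  - name: validator
    image: registry.example.com/node-validation:1.0   # placeholder image
```

When pending pods compete for capacity, the scheduler considers those with the higher priority value first.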
Kubernetes uses a feature gate framework that allows administrators to enable or disable environment features. Feature gates are a set of key=value pairs that describe Kubernetes features. You can turn these features on or off using the --feature-gates command line flag on each Kubernetes component. A number of these features can boost scaling performance, making them worthy of a deeper look and evaluation:
Setting resource limits (memory and CPU) also helps with Kubernetes performance tuning. You can verify that the memory overcommit flags are set to the following default node system settings: vm.overcommit_memory=1 and vm.panic_on_oom=0.
These settings instruct the kernel to always overcommit memory and, should the kernel run out of memory, not to panic but rather have the kernel OOM killer kill processes based on priority.
For CPU resources, kubelet config controls the pods-per-core through a specific setting that limits the maximum number of pods per core per node. So a two core node with a setting of ten pods per core could only run up to twenty pods (2 cores x 10 pods).
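As a sketch of the pods-per-core setting, the kubelet configuration file exposes it as podsPerCore. The values below are illustrative, not recommendations.

```yaml
# Sketch of a kubelet configuration file; values are illustrative.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
podsPerCore: 10    # on a 2-core node: 2 x 10 = up to 20 pods
maxPods: 110       # absolute per-node cap; the lower of the two limits applies
```

With these values, a two-core node tops out at 20 pods, while a 16-core node would be capped at 110 by maxPods rather than at 160.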
Etcd is the brain, or at least the memory, of Kubernetes, in the form of a distributed key-value database. When possible, deploying the etcd cluster and the kube-apiserver close together will keep latency to a minimum. This is also a place where deploying to nodes with solid state disks (SSDs) with low I/O latency and high throughput will further optimize your database performance.
We’ve all experienced and been frustrated by network latency. Placing Kubernetes nodes as close to the end-user as possible can greatly improve your customer’s experience. Most cloud providers have multiple geographic zones around the world that allow systems operators to match a Kubernetes cluster deployment to where their end users are located and thus keep latency as low as possible.
It is crucial to have a concrete plan for managing Kubernetes clusters in multiple zones before deployment. Bear in mind that each provider limits which zones can be used to achieve the desired fault tolerance. For instance, on Microsoft Azure only a particular set of zones is available to Azure Kubernetes Service (AKS). Google Kubernetes Engine (GKE) offers a choice of regional or multi-zonal clusters, each with its own advantages and disadvantages related to proximity, redundancy, and cost.
There is nothing wrong with deploying locally and then adjusting your deployment strategy based on feedback. Doing this effectively is dependent on having a monitoring system (next tip) in place to identify bottlenecks before customers even notice the impact on Service Level Objectives (SLOs).
OK, this is not strictly a Kubernetes tip, but you don’t want to hear from a customer that your Kubernetes cluster is performing poorly. This is a place where it is easy to be proactive and get private, quantitative feedback on system performance. The open-source Prometheus monitoring solution, often paired with the open-source Grafana dashboard, readily integrates with Kubernetes.
Metrics, as opposed to the logs that are generated by a specific event, allow you to keep tabs on the overall performance of your Kubernetes-managed system over time. You can further define alerts to match your SLOs so that when performance degrades, you can effectively take action to fix the problem.
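To illustrate an alert tied to an SLO, here is a minimal sketch of a Prometheus alerting rule file. The metric name http_request_duration_seconds_bucket, the 500 ms threshold, the 10-minute window, and the severity label are all assumptions for this example. Substitute the histogram your application actually exports and the target from your own SLO.

```yaml
# Sketch of a Prometheus rule file; metric name and thresholds are assumptions.
groups:
- name: slo-alerts
  rules:
  - alert: HighRequestLatency
    # 99th percentile request latency over the last 5 minutes, in seconds.
    expr: histogram_quantile(0.99, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 0.5
    for: 10m                       # condition must hold this long before firing
    labels:
      severity: page
    annotations:
      summary: "p99 request latency has exceeded 500 ms for 10 minutes"
```

The for clause keeps short spikes from paging anyone: the alert only fires once the condition has held for the full window.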