Throttling, QoS, and PriorityClass: finding out what is happening to your pods in Kubernetes

Vinícius Loureiro
6 min read · Feb 16, 2023



When dealing with Kubernetes, one of the most common problems is figuring out what’s wrong with your pods and why they aren’t running. There can be many reasons, but one of them can cost you a lot of time if you’re looking in the wrong place: starvation.
Starvation in Kubernetes is a state where your cluster’s resources aren’t enough to satisfy what you’re asking for, so the scheduler can’t place any new workloads. To make sure your running workloads don’t stop any time soon, you should fix it as soon as possible, and you can start by investigating these three things: throttling, Kubernetes Quality of Service (QoS), and PriorityClasses.

Throttling:

The basic idea behind throttling is to slow down the processing of whatever resource is being requested, such as network bandwidth or CPU. For example: if you’re downloading something from the internet and a higher-priority task shows up, one way to finish the download while still getting the higher-priority work done is to keep receiving your provider’s packets at the same maximum speed, but with larger gaps between them (something that would take an hour to complete will probably take a little longer). The same idea applies to processors (CPU).
In the container world, we can define compute resources as something that can be requested, allocated, and then consumed by a container. These resources are normally classified as compressible (they can be throttled, like CPU) and incompressible (they can’t be throttled, such as memory).
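This distinction shows up directly in a pod spec. Here’s a minimal sketch (the pod name and values are illustrative, not from a real workload):

```yaml
# Illustrative pod showing compressible vs. incompressible limits.
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo    # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:latest
    resources:
      requests:
        cpu: "250m"      # the scheduler reserves a quarter of a core
        memory: "128Mi"
      limits:
        cpu: "500m"      # compressible: usage above this is throttled
        memory: "256Mi"  # incompressible: usage above this gets the container OOM-killed
```

When this container exceeds its CPU limit, it is simply slowed down; when it exceeds its memory limit, the kernel OOM-kills it, because there is no way to throttle memory that is already in use.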

Now that we know how throttling works, let’s understand why your pods aren’t running and how throttling can cause this.
We’ve seen that CPU can be throttled (because we can slow a program’s processing down), but memory can’t, and this might be the reason your pods aren’t up. Memory can’t be throttled because there is no way to ask an application to stop using memory and hand it back, so if you have a pod running and consuming almost all of your nodes’ memory (supposing you have just a few nodes), your new pods won’t be scheduled, and that can be the root cause of your problems.

There are some ways to deal with this problem:

  1. Use resources like LimitRange or ResourceQuota to deal with pods going over their limits;
  2. Set sensible requests and limits on your containers if they don’t have any;
  3. Upgrade your nodes’ capacity;
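For the first item, here’s a minimal sketch of both resources (names and values are illustrative):

```yaml
# Illustrative per-container defaults and caps within a namespace.
apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits
spec:
  limits:
  - type: Container
    defaultRequest:      # applied as the request when a container sets none
      memory: "128Mi"
    default:             # applied as the limit when a container sets none
      memory: "256Mi"
    max:                 # no single container may ask for more than this
      memory: "1Gi"
---
# Illustrative hard cap on the namespace's aggregate requests and limits.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
spec:
  hard:
    requests.memory: "4Gi"
    limits.memory: "8Gi"
```

The LimitRange protects you from individual containers with missing or oversized limits, while the ResourceQuota caps what the whole namespace can claim.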

Quality of Service (QoS):

Whenever you define requests or limits on your containers, Kubernetes assigns the pod a QoS class, and which one you get varies according to how you set up those limits and requests. There are three QoS classes; let’s dive into them:

  • Guaranteed: as the name says, pods here have the highest priority among all other pods. To get this QoS class, every container in the pod must have identical values for its requests and limits.
  • Burstable: here, a pod has requests and limits defined with different values, the limits being higher than the requests. This pod has its minimal resources granted and can burst up to its limits if resources are available. One important thing to mention is that if a node runs out of an incompressible resource and there are no Best-Effort pods left to terminate, a Burstable pod will be the next target.
  • Best-Effort: if a pod has no requests or limits defined at all, it is treated as a low-priority pod, and as soon as the node runs out of an incompressible resource, it will be among the first to be terminated.
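A sketch of the three cases side by side (pod names are hypothetical; the class itself is assigned automatically and can be read back from the pod’s status):

```yaml
# Guaranteed: requests equal limits for every container.
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed
spec:
  containers:
  - name: app
    image: nginx:latest
    resources:
      requests: {cpu: "500m", memory: "256Mi"}
      limits:   {cpu: "500m", memory: "256Mi"}
---
# Burstable: requests set, limits higher.
apiVersion: v1
kind: Pod
metadata:
  name: qos-burstable
spec:
  containers:
  - name: app
    image: nginx:latest
    resources:
      requests: {cpu: "250m", memory: "128Mi"}
      limits:   {cpu: "500m", memory: "256Mi"}
---
# Best-Effort: no requests or limits at all.
apiVersion: v1
kind: Pod
metadata:
  name: qos-besteffort
spec:
  containers:
  - name: app
    image: nginx:latest

# To check the assigned class, something like:
#   kubectl get pod qos-burstable -o jsonpath='{.status.qosClass}'
```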

This could be why your pods aren’t running or are suddenly being stopped. What I recommend here is checking whether the requests and limits defined on your pod deserve a higher priority, or whether other pods can be adjusted so they stop preventing your new pod from being created. You can also use the tips given in the throttling section, if necessary.

PriorityClass:

As we’ve seen by now, the declaration of resources in a pod can impact its life, or even the life of other pods. But this can get tricky if you’re working in an environment with thousands of pods and need a better way to organize their priorities; that’s where the PriorityClass resource comes in.
This resource is a declarative way of defining your pods’ priorities and how they will be scheduled, and it is also cluster-scoped (namespace independent). Here’s how to define it:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 10000
globalDefault: false
description: "high priority pod class, must be used for X pod only"
---
apiVersion: v1
kind: Pod
metadata:
  name: high-priority-pod
spec:
  priorityClassName: high-priority
  containers:
  - name: nginx
    image: nginx:latest

Let’s see some fields of this resource definition:

  • Name: works just like any other Kubernetes resource name, except that here it must be a valid DNS subdomain name and can’t be prefixed with “system-” (that prefix is reserved).
  • Value: a 32-bit integer smaller than or equal to 1 billion. The higher the value, the higher the priority.
  • GlobalDefault: indicates whether the value defined in this PriorityClass should be used for pods that don’t specify a PriorityClass.
  • Description: an optional field indicating when users of the cluster should use this PriorityClass.
  • PriorityClassName (in the pod spec): an optional field; if you want a specific PriorityClass for the pod, set that PriorityClass’s name here.

Now, let’s understand how Kubernetes manages pods with this property. Initially, an admission controller uses the “priorityClassName” to fill in the priority value of new pods. Whenever you create new pods, the scheduler sorts its queue to accommodate the pods with higher priorities first. If pod X has a priority value of 100 and pod Y has a priority value of 1000, pod Y will be scheduled first if there is no other restriction.
But what if no node has enough capacity to receive our pod? In this case, lower-priority pods are removed (preempted) to free space for the higher-priority pod. This approach lets cluster administrators define which workloads must run first, leaving the responsibility of removing lower-priority pods to the scheduler/kubelet. And if a pending pod can’t be scheduled at all, pods with lower priority can still be scheduled in the meantime.
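If evicting running workloads is undesirable for some class of pods, the `preemptionPolicy` field lets a class jump the scheduling queue without preempting anyone. A sketch, with illustrative names and values:

```yaml
# Illustrative default class for pods that don't set a priorityClassName.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: default-priority
value: 100
globalDefault: true
description: "cluster-wide default priority"
---
# Illustrative class that is scheduled ahead of others but never evicts running pods.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-no-preempt
value: 5000
preemptionPolicy: Never   # waits for free capacity instead of preempting
globalDefault: false
description: "high priority without preemption"
```

Only one PriorityClass in the cluster may set globalDefault to true, and it only affects pods created after it.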

QoS x Pods Priority:

It’s important to know that, although QoS and pod priority might look like the same thing, or seem connected, they aren’t. QoS is mostly used by the kubelet to keep nodes healthy when their resources are about to run out: it considers the QoS class first, and only then the PriorityClass, before evicting a pod.
On the other side, the scheduler’s preemption process considers only the pods’ priority class before choosing a target: it will pick one pod, or a group of pods, with lower priority that can free up the space necessary to allocate your new pod.

Conclusion:

These are the things you might need to consider when investigating why your workloads aren’t running or why they’re being stopped. It’s also important to mention that these things may or may not be related to the problems you’re facing, which is why I also highly recommend checking other, “more visible” resources, like ResourceQuotas, LimitRanges, and your PodDisruptionBudgets.
One more thing to watch out for: be careful when using pod priorities, because since you’re dealing with numeric values, users can give a pod a higher value than it actually needs.

