Managing Resources in Kubernetes¶
In a multi-tenant environment like the Contain Platform, effective resource management is crucial for ensuring stability, performance, and fair resource distribution among all users. Kubernetes provides a powerful set of tools to control how much CPU and memory your applications consume. This guide covers the core concepts of requests, limits, quotas, and pod priority.
Requests and Limits¶
When you define a Pod, you can specify how much CPU and memory (RAM) each
container needs. These are known as requests and limits.
- `requests`: This is the amount of resources that Kubernetes guarantees for your container. When the scheduler places a pod on a node, it ensures that the node has enough available capacity to meet the sum of the requests of all pods scheduled on it.
- `limits`: This is the maximum amount of resources a container is allowed to use. If a container exceeds its memory limit, it will be terminated. If it exceeds its CPU limit, it will be throttled (its CPU usage will be capped).
On the Contain Platform, it is a mandatory policy that all containers must have CPU and memory requests and limits defined. This is enforced by an admission controller and is essential for maintaining the stability and health of the cluster.
Example¶
Here is a Pod specification with requests and limits defined:
```yaml
spec:
  containers:
  - name: my-app
    image: my-app:1.0
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m" # 250 millicores (0.25 of a core)
      limits:
        memory: "128Mi"
        cpu: "500m" # 500 millicores (0.5 of a core)
```
For more detailed information, see the official Kubernetes documentation on how to Assign Memory Resources to Containers and Pods and Assign CPU Resources to Containers and Pods.
Resource Quotas¶
While requests and limits apply to individual containers, a ResourceQuota
provides constraints on the aggregate resource consumption within a single
namespace. This prevents any one project or team from using more than its fair
share of cluster resources.
A ResourceQuota can limit the total sum of compute resources (requests and
limits) that can be used by all pods in a namespace, as well as the total number
of Kubernetes objects (like Pods, Services, or Secrets) that can be
created.
Your namespace on the Contain Platform comes with a predefined ResourceQuota.
You can view your quota's limits and current usage with kubectl get resourcequota.
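While the exact values of your quota are set by the platform, a ResourceQuota manifest typically looks like the following sketch. The name and all values here are illustrative, not your actual quota:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota      # illustrative name; the platform-managed quota will differ
spec:
  hard:
    requests.cpu: "4"      # total CPU requests across all pods in the namespace
    requests.memory: 8Gi   # total memory requests across all pods
    limits.cpu: "8"        # total CPU limits across all pods
    limits.memory: 16Gi    # total memory limits across all pods
    pods: "20"             # maximum number of Pods in the namespace
```

When a quota is exhausted, the API server rejects new pods in that namespace until usage drops or the quota is raised.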
For more details, see the official Kubernetes documentation on Resource Quotas.
Pod Priority and Preemption¶
In some situations, you may need to specify that certain workloads are more important than others. For example, a critical back-end service that serves multiple front-end applications may be more important than the front-ends themselves. If the cluster is running low on resources, you need a way to tell Kubernetes which pods it can evict (preempt) to make room for more critical ones.
Kubernetes uses an object called a PriorityClass for this purpose. A
PriorityClass has a numeric value, and pods with a higher value are considered
higher priority. When the scheduler cannot find a node to run a new,
higher-priority pod, it can preempt (evict) lower-priority pods from a node to
make room.
Kubernetes itself uses PriorityClasses to ensure that critical system
components remain running. The Contain Platform extends this mechanism to provide
a stable operational environment and to allow you to prioritize your own
applications.
Available Priority Classes¶
We have created the following PriorityClasses for you to use with your
applications. They are, in order of highest to lowest priority:
1. secure-cloud-stack-tenant-namespace-application-critical
2. secure-cloud-stack-tenant-namespace-application-less-critical
3. secure-cloud-stack-tenant-namespace-application-lesser-critical
4. secure-cloud-stack-tenant-namespace-application-non-critical
If you do not specify a priorityClassName for your pods, the default value of
secure-cloud-stack-tenant-namespace-application-non-critical will be
automatically assigned.
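You consume these classes rather than create them, but for reference, a PriorityClass object looks roughly like the sketch below. The `value` shown is illustrative (the platform defines the real numbers), and whether the default is applied via `globalDefault` or an admission controller is platform-specific:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: secure-cloud-stack-tenant-namespace-application-non-critical
value: 1000          # illustrative; pods with a higher value can preempt pods with a lower one
globalDefault: true  # one way a class can be assigned to pods that specify no priorityClassName
description: "Default priority for tenant application pods."
```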
Configuring an Application to use a Priority Class¶
You can assign a PriorityClass to your application's pods by setting the
priorityClassName field in the pod specification.
The following example shows a Deployment for a critical back-end service:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: a-customer-critical-deployment
  labels:
    app.kubernetes.io/name: back-end-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: back-end-deployment
  template:
    metadata:
      labels:
        app.kubernetes.io/name: back-end-deployment
    spec:
      # Short grace period for faster preemption if needed
      terminationGracePeriodSeconds: 10
      priorityClassName: "secure-cloud-stack-tenant-namespace-application-critical"
      containers:
      - image: nginxinc/nginx-unprivileged:1.20
        name: back-end-deployment
        resources:
          requests:
            memory: 990M
            cpu: 5m
          limits:
            memory: 990M
        ports:
        - containerPort: 8080
          name: http
```
Preemption and Graceful Termination¶
When a pod is preempted, it receives a termination signal. The default graceful
termination period for a pod is 30 seconds. If you want to ensure that
lower-priority pods are preempted faster to make room for critical workloads,
you can set a shorter terminationGracePeriodSeconds on the lower-priority pods.
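For example, a lower-priority pod that should vacate its node quickly when preempted might use a spec like this sketch (the container name, image, and resource values are illustrative):

```yaml
spec:
  priorityClassName: "secure-cloud-stack-tenant-namespace-application-non-critical"
  terminationGracePeriodSeconds: 5   # preempted pods are given 5 seconds to exit cleanly
  containers:
  - name: batch-worker               # illustrative name and image
    image: my-batch:1.0
    resources:                       # requests and limits are mandatory on the platform
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "128Mi"
        cpu: "200m"
```

Keep in mind that a shorter grace period also applies to ordinary shutdowns (rolling updates, node drains), so the application must be able to exit cleanly within that window.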
**Sidecars and Operators**
Be aware that if your application uses sidecar containers (e.g., from a
service mesh) or is managed by an operator that injects pods, these
additional components must also have the correct priorityClassName set.
Otherwise, they may be assigned the default, non-critical priority, which
could lead to unexpected behavior where a critical application's sidecar is
preempted.
For a complete understanding of this topic, please refer to the official Kubernetes documentation on Pod Priority and Preemption.