
Managing Resources in Kubernetes

In a multi-tenant environment like the Contain Platform, effective resource management is crucial for ensuring stability, performance, and fair resource distribution among all users. Kubernetes provides a powerful set of tools to control how much CPU and memory your applications consume. This guide covers the core concepts of requests, limits, quotas, and pod priority.

Requests and Limits

When you define a Pod, you can specify how much CPU and memory (RAM) each container needs. These are known as requests and limits.

  • requests: This is the amount of resources that Kubernetes guarantees for your container. When the scheduler places a pod on a node, it ensures that the node has enough available capacity to meet the sum of the requests of all pods scheduled on it.
  • limits: This is the maximum amount of resources a container is allowed to use. If a container exceeds its memory limit, it is terminated (OOM-killed) and restarted according to its restart policy. If it tries to exceed its CPU limit, it is throttled (its CPU usage is capped) rather than killed.

On the Contain Platform, all containers are required to have CPU and memory requests and limits defined. This policy is enforced by an admission controller and is essential for maintaining the stability and health of the cluster.

Example

Here is a Pod specification with requests and limits defined:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: my-app
    image: my-app:1.0
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m" # 250 millicores (0.25 of a core)
      limits:
        memory: "128Mi"
        cpu: "500m" # 500 millicores (0.5 of a core)

For more detailed information, see the official Kubernetes documentation on how to Assign Memory Resources to Containers and Pods and Assign CPU Resources to Containers and Pods.

Resource Quotas

While requests and limits apply to individual containers, a ResourceQuota provides constraints on the aggregate resource consumption within a single namespace. This prevents any one project or team from using more than its fair share of cluster resources.

A ResourceQuota can limit the total sum of compute resources (requests and limits) that can be used by all pods in a namespace, as well as the total number of Kubernetes objects (like Pods, Services, or Secrets) that can be created.
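To make this concrete, here is a minimal sketch of what such a ResourceQuota manifest can look like. The name and values are purely illustrative and do not reflect the actual quota applied to your namespace:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: example-quota
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    pods: "20"

With a quota like this in place, the combined requests and limits of all pods in the namespace cannot exceed the listed values, and at most 20 pods can exist at any one time.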

Your namespace on the Contain Platform comes with a predefined ResourceQuota. You can view your quota's limits and current usage with kubectl get resourcequota, or see a detailed used-versus-hard breakdown with kubectl describe resourcequota.

For more details, see the official Kubernetes documentation on Resource Quotas.

Pod Priority and Preemption

In some situations, you may need to specify that certain workloads are more important than others. For example, a critical back-end service that serves multiple front-end applications may be more important than the front-ends themselves. If the cluster is running low on resources, you need a way to tell Kubernetes which pods it can evict (preempt) to make room for more critical ones.

Kubernetes uses an object called a PriorityClass for this purpose. A PriorityClass has a numeric value, and pods with a higher value are considered higher priority. When the scheduler cannot find a node to run a new, higher-priority pod, it can preempt (evict) lower-priority pods from a node to make room.
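For reference, a PriorityClass is a simple cluster-scoped object. The following is an illustrative sketch of how one might be defined; on the Contain Platform the PriorityClasses are already created for you (see below), so you only reference them by name:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: example-high-priority
value: 1000000
globalDefault: false
description: "Illustrative example; pods referencing this class are scheduled ahead of pods with a lower value."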

Kubernetes itself uses PriorityClasses to ensure that critical system components remain running. The Contain Platform extends this mechanism to provide a stable operational environment and to allow you to prioritize your own applications.

Available Priority Classes

We have created the following PriorityClasses for you to use with your applications. They are, in order of highest to lowest priority:

  • secure-cloud-stack-tenant-namespace-application-critical
  • secure-cloud-stack-tenant-namespace-application-less-critical
  • secure-cloud-stack-tenant-namespace-application-lesser-critical
  • secure-cloud-stack-tenant-namespace-application-non-critical

If you do not specify a priorityClassName for your pods, the default value of secure-cloud-stack-tenant-namespace-application-non-critical will be automatically assigned.

Configuring an Application to use a Priority Class

You can assign a PriorityClass to your application's pods by setting the priorityClassName field in the pod specification.

The following example shows a Deployment for a critical back-end service:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: a-customer-critical-deployment
  labels:
    app.kubernetes.io/name: back-end-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: back-end-deployment
  template:
    metadata:
      labels:
        app.kubernetes.io/name: back-end-deployment
    spec:
      # Short grace period so the pod shuts down quickly if it is evicted or rescheduled
      terminationGracePeriodSeconds: 10
      priorityClassName: "secure-cloud-stack-tenant-namespace-application-critical"
      containers:
      - image: nginxinc/nginx-unprivileged:1.20
        name: back-end-deployment
        resources:
          requests:
            memory: 990M
            cpu: 5m
          limits:
            memory: 990M
            cpu: 5m
        ports:
        - containerPort: 8080
          name: http

Preemption and Graceful Termination

When a pod is preempted, it receives a SIGTERM and has its graceful termination period (30 seconds by default) to shut down before it is forcibly killed. The higher-priority pod can only be scheduled once the preempted pods are gone, so if you want critical workloads to be placed faster, set a shorter terminationGracePeriodSeconds on the lower-priority pods.
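As an illustrative sketch (the values are not prescriptive), the pod template of a lower-priority Deployment could combine a less critical PriorityClass with a short grace period:

spec:
  template:
    spec:
      # Lower-priority workload: vacate the node quickly when preempted
      priorityClassName: "secure-cloud-stack-tenant-namespace-application-less-critical"
      terminationGracePeriodSeconds: 10

Only do this for workloads that can actually shut down cleanly within the shorter window.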

Sidecars and Operators

Be aware that if your application uses sidecar containers (e.g., from a service mesh) or is managed by an operator that injects pods, these additional components must also have the correct priorityClassName set. Otherwise, they may be assigned the default, non-critical priority, which could lead to unexpected behavior where a critical application's sidecar is preempted.

For a complete understanding of this topic, please refer to the official Kubernetes documentation on Pod Priority and Preemption.