Running Stateful Applications¶
Many applications, such as databases, message queues, and key-value stores, are
stateful. Unlike stateless applications that can be easily replaced, stateful
applications require stable, persistent storage and unique, predictable network
identifiers. This guide explains how to run such applications on the platform
using Kubernetes StatefulSets and ensure their resilience.
Conceptual Overview
This is a practical, step-by-step guide. For a deeper understanding of the underlying Kubernetes storage concepts such as PersistentVolumes, PersistentVolumeClaims, and StorageClasses, please see the Persistent Storage in Kubernetes documentation.
Using StatefulSets for Stable Storage and Identity¶
The primary tool for providing persistent storage and a stable identity to each pod is the Kubernetes StatefulSet. Unlike a Deployment, a StatefulSet maintains a sticky, unique identity for each of its pods. This provides several key features for stateful applications:
- Stable, unique network identifiers (e.g., web-0, web-1).
- Stable, persistent storage linked to each pod's identity.
- Ordered, graceful deployment and scaling.
- Ordered, automated rolling updates.
This makes StatefulSets ideal for applications requiring dedicated, persistent storage per pod.
Disk Size Modifications
Note that the disk size requested in a StatefulSet's volumeClaimTemplates cannot be extended simply by changing the value in the manifest. Resizing is a manual process handled by the Netic Cloud Native team.
See also the official Kubernetes StatefulSet documentation.
Example: Deploying a Stateful Application¶
The following example creates a StatefulSet with two replicas. Each pod will
get its own PersistentVolumeClaim based on the volumeClaimTemplates section,
providing it with 1Gi of dedicated storage.
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 2 # by default is 1
  minReadySeconds: 10 # by default is 0
  template:
    metadata:
      annotations:
        backup.velero.io/backup-volumes: www
      labels:
        app: nginx # has to match .spec.selector.matchLabels
        netic.dk/network-ingress: "contour"
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
      terminationGracePeriodSeconds: 10
      containers:
        - name: nginx
          image: nginxinc/nginx-unprivileged:1.20
          ports:
            - containerPort: 8080
              name: http
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - all
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "vsphere-sc"
        resources:
          requests:
            storage: 1Gi
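The serviceName: "nginx" field refers to a headless Service that governs the pods' network identity. The StatefulSet does not create this Service for you, so it must exist in the same namespace. A minimal sketch is shown below; the port and labels are assumed to match the example above:

apiVersion: v1
kind: Service
metadata:
  name: nginx # must match .spec.serviceName in the StatefulSet
spec:
  clusterIP: None # headless Service: no load-balanced IP, only per-pod DNS records
  selector:
    app: nginx
  ports:
    - port: 8080
      name: http

With the headless Service in place, each pod gets a stable DNS name of the form <pod>.<service>.<namespace>.svc.cluster.local, for example web-0.nginx.my-namespace.svc.cluster.local.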
To see if the pods are running and have their unique, ordered identities:
➜ kubectl get pods -n my-namespace
NAME    READY   STATUS    RESTARTS   AGE
web-0   1/1     Running   0          29m
web-1   1/1     Running   0          28m
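Each pod also gets its own PersistentVolumeClaim, named after the claim template and the pod, e.g. www-web-0 and www-web-1. You can verify this with kubectl get pvc; the output should look roughly like this (volume names and ages will differ):
➜ kubectl get pvc -n my-namespace
NAME        STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS   AGE
www-web-0   Bound    pvc-<uuid>   1Gi        RWO            vsphere-sc     29m
www-web-1   Bound    pvc-<uuid>   1Gi        RWO            vsphere-sc     28m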
Backing Up Persistent Volumes¶
By default, the data in your PersistentVolumeClaim (PVC) is not backed up.
To enable backups for the persistent volumes used by your pod, you must add the backup.velero.io/backup-volumes annotation to your pod's template.
This annotation tells the Velero backup service to include the specified volumes in its scheduled backups. It takes the form backup.velero.io/backup-volumes: <pvc-name>, where:
- <pvc-name>: Replace this with the name of the PersistentVolumeClaim you want to back up. In our StatefulSet example above, this is www, which corresponds to the name in the volumeClaimTemplates section.
- Multiple PVCs: If your pod has multiple volumes, you can back them all up by providing a comma-separated list (e.g., backup.velero.io/backup-volumes: data-pvc,config-pvc).
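For reference, this is how the annotation sits in the pod template of the StatefulSet example above (only the relevant fields are shown):

  template:
    metadata:
      annotations:
        backup.velero.io/backup-volumes: www # volume name from volumeClaimTemplates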
Ensuring High Availability with PodDisruptionBudgets¶
While your application now has persistent state, you also need to ensure it remains available during voluntary disruptions, such as when a cluster node is drained for maintenance.
By implementing a PodDisruptionBudget (PDB), you can define a minimum number
of pods that must remain available at all times, safeguarding against
disruptions caused by these maintenance activities.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nginx-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: nginx
With this PDB in place, a node drain will not evict pods if doing so would violate the budget. For example, if the node running web-1 is drained, web-1 will only be terminated once web-0 is running and healthy on another node (or is unaffected by the drain).
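You can check how many voluntary disruptions the budget currently allows with kubectl get pdb. With both replicas from the example healthy, the output should look roughly like this (the namespace is assumed to be my-namespace):
➜ kubectl get pdb nginx-pdb -n my-namespace
NAME        MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
nginx-pdb   1               N/A               1                     5m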
Do Not Use PDBs with a Single Replica
You should not create a PodDisruptionBudget if only one pod is running for your application. A PDB with minAvailable: 1 on a single-replica application prevents the node from ever being drained successfully, because evicting the only pod would always violate the budget. This can halt cluster upgrades and maintenance.
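For illustration, when an eviction would violate a PDB, kubectl drain reports something like the following and keeps retrying; with a single-replica application and minAvailable: 1 the retries never succeed (node and pod names are placeholders):
➜ kubectl drain <node-name> --ignore-daemonsets
evicting pod my-namespace/<pod-name>
error when evicting pods/"<pod-name>" -n "my-namespace" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.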