Kubernetes Persistent Volumes
Author: Venkata Sudhakar
Containers are ephemeral: when a container restarts or its pod is replaced, all data written to the container filesystem is lost. Kubernetes Persistent Volumes (PV) solve this by providing storage that outlives a pod. A PersistentVolume is a piece of storage provisioned either manually by an administrator or dynamically through a StorageClass. A PersistentVolumeClaim (PVC) is a request for storage made on behalf of a pod - it specifies how much storage is needed and which access mode is required. Kubernetes binds the PVC to an available PV that satisfies the request, and the pod mounts the PVC like a regular volume.
In cloud environments you rarely create PVs manually. Instead you use a StorageClass, which provisions cloud storage automatically when a PVC is created - for example a GKE StorageClass backed by Google Persistent Disk or an EKS StorageClass backed by AWS EBS. This is called dynamic provisioning. When the PVC is created, the provisioner named in the StorageClass calls the cloud provider API, creates the disk, and Kubernetes binds the resulting PV to the PVC automatically.
StatefulSets use PVC templates to give each pod its own dedicated persistent volume, which follows the pod across rescheduling.
The example below uses a dynamically provisioned PVC mounted into a PostgreSQL pod so that the database survives pod restarts.
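A minimal sketch of what postgres-pvc.yaml could contain is shown here. The names postgres-pvc, postgres, production, the 10Gi size and the mount path come from the commands and output in this article. The StorageClass name standard-rwo, the image tag, the Secret postgres-secret and the PGDATA subdirectory are assumptions made for illustration and should be adapted to your cluster.

```yaml
# postgres-pvc.yaml - illustrative sketch under the assumptions stated above
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce                  # RWO: mounted read-write by one node at a time
  storageClassName: standard-rwo     # assumption: a StorageClass with this name exists
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  namespace: production
spec:
  replicas: 1
  strategy:
    type: Recreate                   # stop the old pod before starting a new one on the RWO volume
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:16         # assumption: any official postgres image tag
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret   # hypothetical Secret, created separately
                  key: password
            - name: PGDATA
              # Subdirectory keeps initdb away from lost+found on a freshly formatted disk
              value: /var/lib/postgresql/data/pgdata
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: postgres-pvc  # binds the pod to the claim defined above
```

The Recreate strategy is a design choice rather than a requirement: with a ReadWriteOnce volume, a rolling update can leave the new pod waiting for a disk that the old pod still holds on another node.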
Apply the manifest and verify the result:
kubectl apply -f postgres-pvc.yaml
persistentvolumeclaim/postgres-pvc created
deployment.apps/postgres created
# Check PVC status - Bound means a PV was provisioned and attached
kubectl get pvc -n production
NAME           STATUS   VOLUME                     CAPACITY   ACCESS MODES
postgres-pvc   Bound    pvc-abc123-def456-789xyz   10Gi       RWO
# The underlying GCP Persistent Disk was created automatically
# If the postgres pod restarts, data in /var/lib/postgresql/data is preserved
kubectl delete pod postgres-abc123 -n production # the Deployment creates a replacement pod, the PVC reattaches, data intact
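The per-pod PVC templates that StatefulSets use can be sketched as follows, assuming a three-replica Kafka StatefulSet. The StatefulSet name kafka, the template name kafka-data and the 50Gi size are inferred from the PVC names in the output further down. The image, the headless Service name kafka-headless and the data directory are assumptions, and broker configuration is omitted.

```yaml
# Illustrative sketch: one PVC per pod via volumeClaimTemplates
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka-headless        # hypothetical headless Service, defined separately
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: kafka
          image: apache/kafka:3.7.0  # assumption; broker settings not shown
          volumeMounts:
            - name: kafka-data
              mountPath: /var/lib/kafka/data   # assumed data directory
  volumeClaimTemplates:
    - metadata:
        name: kafka-data             # PVCs are named <template>-<pod>, e.g. kafka-data-kafka-0
      spec:
        accessModes:
          - ReadWriteOnce
        # No storageClassName set, so the cluster's default StorageClass is used
        resources:
          requests:
            storage: 50Gi
```

Because each claim is created from the template rather than shared, scaling the StatefulSet up adds a new PVC for each new pod, and deleting a pod does not delete its claim.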
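For a three-replica Kafka StatefulSet whose volume claim template is named kafka-data, listing the PVCs gives the following output: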
kubectl get pvc
NAME                 STATUS   CAPACITY
kafka-data-kafka-0   Bound    50Gi       <- pod kafka-0 owns this PVC
kafka-data-kafka-1   Bound    50Gi       <- pod kafka-1 owns this PVC
kafka-data-kafka-2   Bound    50Gi       <- pod kafka-2 owns this PVC
# Each Kafka broker has its own dedicated disk
# If kafka-1 is rescheduled to a different node, its PVC follows it
Access modes determine how many nodes can mount the volume simultaneously. ReadWriteOnce (RWO) means the volume can be mounted read-write by one node at a time, which suits databases. ReadWriteMany (RWX) means multiple nodes can mount the volume read-write simultaneously, which requires shared storage such as NFS or Cloud Storage FUSE. Most block storage (EBS, GCP Persistent Disk) typically supports only RWO for read-write access. Use RWX only for shared file storage such as uploaded documents or ML model artifacts.
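As a rough illustration of the RWX case, a claim for shared file storage might look like the sketch below. The claim name shared-uploads, the StorageClass name nfs-shared and the 100Gi size are hypothetical. The claim only binds if that StorageClass is backed by a provisioner that supports ReadWriteMany.

```yaml
# Illustrative sketch: a PVC that several pods on different nodes can mount read-write
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-uploads               # hypothetical name
spec:
  accessModes:
    - ReadWriteMany                  # RWX: many nodes may mount read-write at once
  storageClassName: nfs-shared       # hypothetical; must map to shared-filesystem storage
  resources:
    requests:
      storage: 100Gi
```

If the same claim is requested against a block-storage StorageClass, it will usually stay Pending or fail to provision, because the backing disk cannot be attached read-write to several nodes.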