Volume Basics

Learn how Pods share and persist data with volumes.

Containers are ephemeral. Volumes provide data sharing and persistence so that Pods can keep state, exchange files between containers, and mount configuration safely.

This quick start expands on the basics with practical examples and selection guidance that map to real workloads.

Why volumes matter

  • Containers restart frequently, but data often must survive.
  • Multiple containers in one Pod need a shared filesystem.
  • Configuration and secrets should be mounted, not baked into images.
  • Storage choices affect performance and reliability.

Common volume types

  • emptyDir: temporary storage that lives for the Pod lifetime.
  • configMap / secret: configuration injection as files.
  • persistentVolumeClaim (PVC): durable storage provisioned by a StorageClass.
  • hostPath: mounts a host directory (use with care, not portable).

emptyDir example

emptyDir is great for caches and scratch space:

volumes:
  - name: cache
    emptyDir: {}
volumeMounts:
  - name: cache
    mountPath: /cache

You can size it with emptyDir.sizeLimit and use medium: Memory for tmpfs if you want fast memory-backed storage.
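For example, a memory-backed scratch volume capped at 128Mi might look like this (a minimal sketch; the volume name and limit are illustrative):

volumes:
  - name: scratch
    emptyDir:
      medium: Memory
      sizeLimit: 128Mi

Keep in mind that memory-backed emptyDir usage counts against the container's memory limit.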

ConfigMap and Secret mounts

Mount configuration files into a container:

volumes:
  - name: app-config
    configMap:
      name: app-config
  - name: app-secret
    secret:
      secretName: app-secret
volumeMounts:
  - name: app-config
    mountPath: /etc/app/config
    readOnly: true
  - name: app-secret
    mountPath: /etc/app/secret
    readOnly: true

Keep secrets read-only and avoid putting them in environment variables unless necessary.

PVC basics

PVCs request storage from a StorageClass and bind to a PersistentVolume:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

Mount the PVC in your Pod:

volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-pvc
volumeMounts:
  - name: data
    mountPath: /var/lib/app

StorageClass selection

If your cluster has multiple StorageClasses, pick the right one. A fast SSD class is good for databases, while a cheaper class can serve logs or caches. Check what is available:

kubectl get storageclass

You can set a default StorageClass or request a specific class in your PVC:

storageClassName: fast-ssd
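
In context, the field goes in the PVC spec. Reusing the earlier claim as a sketch (fast-ssd is a placeholder; use a class name from kubectl get storageclass):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  storageClassName: fast-ssd
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi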

Access modes and usage

  • ReadWriteOnce: mounted read-write by a single node; the most common mode for databases.
  • ReadWriteMany: mounted read-write by many nodes; required when Pods on multiple nodes share a writable filesystem.
  • ReadOnlyMany: mounted read-only by many nodes; useful for distributing data without writes.

Match access modes to your workload and storage backend.

Example: shared volume for multiple containers

If your Pod has a sidecar that needs to read files produced by the main container, mount the same volume into both containers:

apiVersion: v1
kind: Pod
metadata:
  name: producer-consumer
spec:
  volumes:
  - name: shared
    emptyDir: {}
  containers:
  - name: producer
    image: busybox
    command: ["sh", "-c", "while true; do date >> /data/out.txt; sleep 5; done"]
    volumeMounts:
    - name: shared
      mountPath: /data
  - name: consumer
    image: busybox
    command: ["sh", "-c", "tail -f /data/out.txt"]
    volumeMounts:
    - name: shared
      mountPath: /data

This pattern is useful for log shipping, content generation, or preprocessing pipelines.

Stateful workloads

Databases and queues usually require PVCs and a stable identity. In Kubernetes, StatefulSets provide stable network names and a dedicated storage claim per replica. Running stateful apps in a Deployment risks scheduling problems, and data corruption if multiple replicas write to the same ReadWriteOnce volume.

Example: StatefulSet volume claim template

StatefulSets can create a PVC per replica automatically:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis
  replicas: 2
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:7
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 5Gi

Each replica gets its own PVC (for example, data-redis-0, data-redis-1).

Lifecycle and cleanup

PVCs are not deleted when a Pod is removed. This protects data, but it also means you can leak storage if you forget to clean up old claims. The StorageClass reclaim policy only controls what happens to the underlying PersistentVolume after its claim is deleted, so to free storage you still need to delete PVCs manually (or, for StatefulSets, set a persistentVolumeClaimRetentionPolicy) after you tear down a workload.

Be careful with reclaim policies. Retain keeps data around for recovery, while Delete removes the underlying volume automatically. Pick based on how critical the data is.
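
As a sketch, a StorageClass that retains volumes after their claims are deleted looks like this (the provisioner is a placeholder for your CSI driver):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: csi.example.com
reclaimPolicy: Retain
allowVolumeExpansion: true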

Backup and migration considerations

Volumes are not backups. If a node fails or data is corrupted, you need a separate backup strategy. For quick tests, a snapshot or a logical dump is enough. For production, define backup retention, verify restores, and store backups outside the cluster.

When migrating between storage classes, create a new PVC, copy data with a helper Pod, and then update your workload to mount the new claim. This reduces downtime and makes rollback simpler.

You can do the copy with a simple rsync Pod that mounts both the old and new PVCs, then switch the workload once data is in sync.
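
A minimal sketch of such a copy Pod, assuming an image that ships rsync and two existing claims named old-pvc and new-pvc (all names here are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: pvc-copy
spec:
  restartPolicy: Never
  containers:
  - name: rsync
    image: registry.example.com/rsync:latest
    command: ["rsync", "-av", "/old/", "/new/"]
    volumeMounts:
    - name: old
      mountPath: /old
    - name: new
      mountPath: /new
  volumes:
  - name: old
    persistentVolumeClaim:
      claimName: old-pvc
  - name: new
    persistentVolumeClaim:
      claimName: new-pvc

Stop writers before the final sync so the copy is consistent, then point the workload at new-pvc.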

Practical selection tips

  • Use emptyDir for caches and build artifacts.
  • Use PVCs for databases and any persistent state.
  • Avoid hostPath unless you control the node and accept portability limits.
  • Mount configs and secrets as files whenever possible.
  • Prefer stable volume names so application code and runbooks stay consistent.

Security considerations

Mounting a volume into a container means that data is shared at the filesystem level. Use read-only mounts for config and secrets, and limit write access to the minimum necessary path. If you use hostPath, remember it exposes node files directly to the container, which can be a major security risk.

If you run multi-tenant clusters, combine volume policies with PodSecurity and RBAC to avoid unintended access to shared storage.

Encryption at rest depends on your storage provider; enable it when handling sensitive data. For configs and secrets, use rotation and keep old values out of images.

Troubleshooting

  • Pod stuck in Pending: PVC not bound or StorageClass missing.
  • Mount errors: check node permissions and CSI driver logs.
  • Data loss: verify reclaim policy and Pod eviction behavior.
  • Permission denied: ensure the container user matches the volume filesystem permissions (see the fsGroup sketch after this list).
  • Volume full: increase PVC size if the StorageClass allows expansion.
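
For permission problems, setting fsGroup in the Pod spec is a common fix, because it makes the mounted volume group-writable by the container user (a sketch; the IDs are illustrative):

securityContext:
  runAsUser: 1000
  fsGroup: 2000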

Diagnostic commands:

kubectl describe pod <pod-name>
kubectl describe pvc <pvc-name>
kubectl get pv

For deeper storage issues, check the CSI controller logs in the kube-system namespace. Many mount failures surface there before they appear in Pod events.
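
For example (pod and container names vary by driver):

kubectl get pods -n kube-system | grep -i csi
kubectl logs -n kube-system <csi-controller-pod> --all-containers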

If your storage class supports expansion, you can grow a PVC by updating its spec.resources.requests.storage and watching the resize events.
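
For example, to grow the earlier claim to 10Gi and check progress (a sketch; requires allowVolumeExpansion on the class):

kubectl patch pvc data-pvc -p '{"spec":{"resources":{"requests":{"storage":"10Gi"}}}}'
kubectl describe pvc data-pvc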

Performance tuning tips

If IO latency is high, check the underlying storage class and node disk. For small lab clusters, local-path storage is often slower under load. In cloud environments, pick SSD-backed classes for databases and tune volume sizes based on expected write throughput.

You can also benchmark with a simple temporary Pod (for example, running fio) to understand baseline latency before putting production data on the volume.
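
A throwaway benchmark Pod might look like this (a sketch; the image is a placeholder for any image with fio installed, and the fio parameters are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: fio-bench
spec:
  restartPolicy: Never
  containers:
  - name: fio
    image: registry.example.com/fio:latest
    command: ["fio", "--name=randwrite", "--filename=/bench/testfile",
              "--rw=randwrite", "--bs=4k", "--size=256m",
              "--ioengine=libaio", "--direct=1", "--runtime=30", "--time_based"]
    volumeMounts:
    - name: bench
      mountPath: /bench
  volumes:
  - name: bench
    persistentVolumeClaim:
      claimName: data-pvc

Delete the Pod and the test file afterwards so benchmark data does not linger on the volume.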

Avoid placing latency-sensitive databases on the same node as CPU-heavy batch workloads. Resource contention is one of the most common causes of noisy volume performance.

Practical notes

  • Start with a quick inventory: kubectl get nodes, kubectl get pods -A, and kubectl get events -A.
  • Compare desired vs. observed state; kubectl describe usually explains drift or failed controllers.
  • Keep names, labels, and selectors consistent so Services and controllers can find Pods.

Quick checklist

  • The resource matches the intent you described in YAML.
  • Namespaces, RBAC, and images are correct for the target environment.
  • Health checks and logs are in place before promotion.

Field checklist

When you move from a quick lab to real traffic, confirm the basics every time. Check resource requests, readiness behavior, log coverage, alerting, and clear rollback steps. A checklist prevents skipping the boring steps that keep services stable. Keep it short, repeatable, and stored with the repo so it evolves with the service and stays close to the code.

Troubleshooting flow

Start from symptoms, not guesses. Review recent events for scheduling, image, or probe failures, then scan logs for application errors. If traffic is failing, confirm readiness, verify endpoints, and trace the request path hop by hop. When data looks wrong, validate the active version and configuration against the release plan. Always record what you changed so a rollback is fast and a postmortem is accurate.

Small exercises to build confidence

Practice common operations in a safe environment. Scale the workload up and down and observe how quickly it stabilizes. Restart a single Pod and watch how the service routes around it. Change one configuration value and verify that the change is visible in logs or metrics. These small drills teach how the system behaves under real operations without waiting for an outage.

Production guardrails

Introduce limits gradually. Resource quotas, PodDisruptionBudgets, and network policies should be tested in staging before production. Keep backups and restore procedures documented, even for stateless services, because dependencies often are not stateless. Align monitoring with user outcomes so you catch regressions before they become incidents.

Documentation and ownership

Write down who owns the service, what success looks like, and which dashboards to use. Include the on-call rotation, escalation path, and basic runbooks for common failures. A small amount of documentation removes a lot of guesswork during incidents and helps new team members ramp up quickly.

Quick validation

After any change, validate the system the same way a user would. Hit the main endpoint, check latency, and watch for error spikes. Confirm that new pods are ready, old ones are gone, and metrics are stable. If the change touched storage, verify disk usage and cleanup behavior. If it touched networking, confirm DNS names and endpoint lists are correct.
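
A minimal validation pass could look like this (the labels, names, and URL are placeholders):

kubectl get pods -l app=<app-name>
kubectl get endpoints <service-name>
curl -s -o /dev/null -w "%{http_code}\n" https://app.example.com/healthz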

Release notes

Write a short note with what changed, why it changed, and how to roll back. This is not bureaucracy; it prevents confusion during incidents. Even a few bullets help future you remember intent and context.

Capacity check

Compare current usage to requests and limits. If the service is close to limits, plan a small scaling adjustment before traffic grows. Capacity planning is easier when it is incremental rather than reactive.
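
Assuming metrics-server is installed, a quick comparison looks like this:

kubectl top pods
kubectl describe node <node-name> | grep -A 8 "Allocated resources"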

Final reminder

Keep changes small and observable. If a release is risky, reduce scope and validate in staging first. Prefer frequent small updates over rare large ones. When in doubt, pick the option that simplifies rollback and reduces time to detect issues. The goal is not perfect config, but predictable operations.