Kubernetes Best Practices for Production Deployments

Running applications in production on Kubernetes is challenging, but following proven best practices can make the difference between a reliable system and a maintenance nightmare. This guide covers the essential strategies that successful teams use to deploy and maintain production workloads.

What You'll Learn

By the end of this guide, you'll understand:

How to design resilient deployments that handle failures gracefully
Security practices that protect your applications from common threats
Monitoring strategies that give you visibility into your system's health
Scaling patterns that keep your applications performant under load
Backup and recovery procedures that protect your data

1. Resource Management: The Foundation of Stability

Why Resource Management Matters

Resource management is the cornerstone of stable Kubernetes deployments. Without proper resource allocation, your applications can experience:

Resource starvation when one pod consumes all available CPU/memory
OOM kills when containers exceed memory limits
Poor performance due to CPU throttling
Unpredictable scaling behavior

Best Practice: Always Set Resource Limits

resources.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: production-app
  template:
    metadata:
      labels:
        app: production-app
    spec:
      containers:
      - name: app
        image: production-app:v1.2.0
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3

Key Points:

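Requests: The CPU and memory the scheduler reserves for each container; a pod is only placed on a node with enough unreserved capacity
Limits: The maximum each container can use. CPU above the limit is throttled; memory above the limit gets the container OOM-killed
Health Checks: A failing liveness probe restarts the container; a failing readiness probe removes the pod from Service endpoints until it recovers
Realistic Values: Base requests and limits on observed usage patterns, not guesses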

Pro Tip: Use Resource Quotas

resource-quota.yaml

apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
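
Once a quota covers CPU and memory requests or limits, the API server rejects any pod in that namespace that omits those fields. One way to avoid surprise rejections is a LimitRange that fills in defaults. The sketch below is illustrative only: the object name is hypothetical, and the default values simply mirror the Deployment example earlier in this guide rather than measured usage.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: production-defaults   # hypothetical name
spec:
  limits:
  - type: Container
    # Applied to containers that do not declare their own requests
    defaultRequest:
      cpu: 250m
      memory: 512Mi
    # Applied to containers that do not declare their own limits
    default:
      cpu: 500m
      memory: 1Gi
```

Defaults are a safety net, not a substitute for per-application values: a container that sets its own requests and limits keeps them.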

2. Security: Protect Your Applications

The Security-First Approach

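Kubernetes does not lock workloads down by default: unless you configure otherwise, containers run as whatever user the image specifies (often root) and any pod can reach any other pod in the cluster. The principle of least privilege should guide every security decision.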

Best Practice: Run as Non-Root

run-as-non-root.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
spec:
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 2000
      containers:
      - name: app
        image: secure-app:latest  # in production, pin a version tag or digest instead of latest
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        volumeMounts:
        # Writable scratch space, needed because the root filesystem is read-only
        - name: tmp
          mountPath: /tmp
        - name: varlog
          mountPath: /var/log
        - name: app-config
          mountPath: /app/config
          readOnly: true
      volumes:
      - name: tmp
        emptyDir: {}
      - name: varlog
        emptyDir: {}
      - name: app-config
        configMap:
          name: app-config

Security Benefits:

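Non-root execution limits what an attacker can do after compromising the container; allowPrivilegeEscalation: false stops the process from gaining more privileges than it started with
Read-only root filesystem prevents malicious file modifications; the emptyDir volumes give the app a writable /tmp and /var/log
Dropped capabilities remove Linux privileges the app does not need
ConfigMap mounting keeps configuration separate from the image and read-only; use Secrets rather than ConfigMaps for sensitive values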
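
Per-workload settings like these only help if every workload actually uses them. A minimal sketch of enforcing a baseline for a whole namespace with the built-in Pod Security Admission labels is shown below. The namespace name is an assumption, and the choice of levels is only one reasonable starting point.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production   # assumed namespace name
  labels:
    # Reject pods that violate the baseline profile (privileged containers, host networking, and so on)
    pod-security.kubernetes.io/enforce: baseline
    # Warn and audit against the stricter profile without blocking anything yet
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

The restricted profile also expects a seccompProfile of RuntimeDefault or Localhost, which the Deployment above does not set, so warnings are a way to find such gaps before switching enforce to restricted.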

Pro Tip: Use Network Policies

network-policy.yaml

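apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: app-network-policy
spec:
  podSelector:
    matchLabels:
      app: production-app
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    # Matches namespaces you have labeled name=frontend; Kubernetes does not add this label itself
    - namespaceSelector:
        matchLabels:
          name: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: database
    ports:
    - protocol: TCP
      port: 5432
  # Allow DNS lookups; without this rule the pod cannot resolve Service names.
  # Assumes cluster DNS runs in kube-system.
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53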
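
A policy like the one above only restricts the pods it selects; every other pod in the namespace stays open to all traffic. A common companion, sketched here with a hypothetical name, is a default-deny policy that selects every pod, so that each application then needs an explicit allow policy of its own.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all   # hypothetical name
spec:
  # An empty podSelector selects every pod in the namespace
  podSelector: {}
  # Listing both types with no rules denies all ingress and egress
  policyTypes:
  - Ingress
  - Egress
```

Network policies are enforced by the cluster's network plugin, so they have no effect on a cluster whose plugin does not support them.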

3. Monitoring and Observability: Know Your System

The Three Pillars of Observability

Metrics: Quantitative data about your system's performance
Logs: Detailed records of events and errors
Traces: Request flow through your distributed system

Best Practice: Comprehensive Monitoring Setup

prometheus-config.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s

    rule_files:
      - "alert_rules.yml"

    scrape_configs:
    - job_name: 'kubernetes-pods'
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
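
The relabel rules above keep only pods that opt in through annotations, and read the metrics path and port from two further annotations. A minimal sketch of a pod that this configuration would scrape follows. The pod name, the /metrics path, and port 8080 are assumptions about the application, not values taken from the configuration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: production-app-example   # hypothetical name
  labels:
    app: production-app
  annotations:
    prometheus.io/scrape: "true"     # matched by the keep rule
    prometheus.io/path: "/metrics"   # assumed path; copied into __metrics_path__
    prometheus.io/port: "8080"       # assumed port; rewritten into __address__
spec:
  containers:
  - name: app
    image: production-app:v1.2.0
```

Annotation values must be strings, which is why "true" and "8080" are quoted. In a Deployment, the same annotations go under spec.template.metadata.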

Essential Metrics to Monitor

alert-rules.yaml

# Example alerting rules
groups:
- name: kubernetes.rules
  rules:
  - alert: HighCPUUsage
    # container_cpu_usage_seconds_total is a counter, so alert on its per-second rate (CPU cores in use)
    expr: rate(container_cpu_usage_seconds_total{container!=""}[5m]) > 0.8
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High CPU usage detected"
      description: "Container {{ $labels.container }} is using {{ $value }} CPU cores"

  - alert: HighMemoryUsage
    # The "> 0" filter skips containers with no memory limit, which would otherwise divide by zero
    expr: container_memory_usage_bytes{container!=""} / (container_spec_memory_limit_bytes{container!=""} > 0) > 0.85
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High memory usage detected"
      description: "Container {{ $labels.container }} is using {{ $value | humanizePercentage }} memory"
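
Resource usage is not the only signal worth alerting on. Since the liveness probes in this guide restart unhealthy containers, a restart counter is a natural companion. The rule below is a sketch under two assumptions: kube-state-metrics is installed (it exports kube_pod_container_status_restarts_total), and the threshold of 3 restarts in 15 minutes is a placeholder to tune. The group and alert names are hypothetical.

```yaml
groups:
- name: kubernetes.restarts   # hypothetical group name
  rules:
  - alert: ContainerRestartingFrequently   # hypothetical alert name
    # increase() estimates how much the restart counter grew over the window
    expr: increase(kube_pod_container_status_restarts_total[15m]) > 3
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Container restarting frequently"
      description: "Container {{ $labels.container }} in pod {{ $labels.pod }} restarted roughly {{ $value }} times in 15 minutes"
```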

4. Scaling Strategies: Handle Traffic Spikes

Horizontal vs Vertical Scaling

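Horizontal scaling (adding more pods) is generally preferred in Kubernetes: load is spread across replicas, so losing one pod or node removes only a fraction of capacity, and new replicas can be added without restarting the ones already serving traffic. Vertical scaling (giving each pod more CPU or memory) is capped by the size of a single node.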

Best Practice: Implement HPA with Multiple Metrics

hpa-config.yaml

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: production-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: production-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        name: production-app-ingress
      target:
        type: Value
        value: 1000
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15

Key Features:

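Multiple metrics: CPU, memory, and a custom requests-per-second metric. The HPA computes a replica count for each metric and uses the largest. The Object metric only works if a custom metrics adapter exposes it through the custom metrics API
Stabilization windows: The HPA acts on the most conservative recommendation seen during the window, which prevents rapid scaling oscillations
Conservative scale-down: After a 5-minute window, at most 10% of replicas are removed per minute
Aggressive scale-up: After a 1-minute window, the replica count can double every 15 seconds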
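
For each metric, the Kubernetes documentation describes the HPA's core calculation as a simple ratio of the current value to the target. The worked numbers below are an illustration only: they assume 5 running replicas averaging 90% CPU utilization against the 70% target from the example.

```latex
\[
\text{desiredReplicas} = \left\lceil \text{currentReplicas} \times \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]
\[
\left\lceil 5 \times \frac{90}{70} \right\rceil = \lceil 6.43 \rceil = 7
\]
```

Utilization is measured against each container's CPU request, not its limit, which is one more reason to set requests from observed usage.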

Pro Tip: Use VPA for Vertical Scaling

vpa-config.yaml

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: production-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: production-app
  updatePolicy:
    # "Off" only publishes recommendations. Avoid "Auto" on a workload whose HPA scales on CPU or memory.
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      minAllowed:
        cpu: 100m
        memory: 50Mi
      maxAllowed:
        cpu: 1
        memory: 500Mi
      controlledValues: RequestsAndLimits

5. Backup and Disaster Recovery: Protect Your Data

The 3-2-1 Backup Rule

3 copies of your data
2 different storage types
1 off-site backup

Best Practice: Automated Backup Strategy

velero-backup.yaml

apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: production-daily-backup
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  template:
    includedNamespaces:
    - production
    includedResources:
    - deployments
    - services
    - configmaps
    - secrets
    - persistentvolumeclaims
    - persistentvolumes
    storageLocation: production-backup
    volumeSnapshotLocations:
    - production-snapshot
    ttl: 720h  # Keep backups for 30 days
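
The schedule refers to a storage location named production-backup, which has to exist as its own Velero object. The sketch below shows what such a definition might look like if the off-site copy lives in an S3-compatible bucket. The provider, bucket, prefix, and region are all assumptions, and the aws provider needs the matching Velero plugin installed.

```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: production-backup   # must match storageLocation in the Schedule
  namespace: velero         # assumes Velero is installed in its default namespace
spec:
  provider: aws             # assumption: an S3-compatible object store
  objectStorage:
    bucket: example-offsite-backups   # hypothetical bucket name
    prefix: production                # hypothetical key prefix
  config:
    region: eu-west-1       # hypothetical; ideally a different region from the cluster
```

Pointing this at storage outside the cluster's own region or provider is what satisfies the off-site part of the 3-2-1 rule.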

Backup Verification Strategy

backup-verification.yaml

apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: backup-verification
spec:
  schedule: "0 4 * * 0"  # Weekly on Sunday at 4 AM
  template:
    includedNamespaces:
    - backup-test
    includedResources:
    - deployments
    - services
    - configmaps
    - secrets
    - persistentvolumeclaims
    storageLocation: production-backup
    volumeSnapshotLocations:
    - production-snapshot
    ttl: 24h
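
A backup is only verified once it has been restored successfully. One possible way to exercise that, sketched here with a hypothetical object name, is a Velero Restore that pulls the latest production backup into the backup-test namespace, where the restored workloads can be inspected without touching production.

```yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: verify-production-restore   # hypothetical name
  namespace: velero                 # assumes Velero is installed in its default namespace
spec:
  # With no backupName set, Velero restores the most recent successful backup from this schedule
  scheduleName: production-daily-backup
  includedNamespaces:
  - production
  # Restore into a scratch namespace instead of overwriting production
  namespaceMapping:
    production: backup-test
```

After the restore completes, check that the pods in backup-test start and that their data is intact, then delete the namespace.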

6. Deployment Strategies: Zero-Downtime Updates

Rolling Updates with Health Checks

rolling-update.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-app
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Start at most one extra pod during the rollout
      maxUnavailable: 0  # Never drop below the desired replica count
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: production-app
  template:
    metadata:
      labels:
        app: production-app
    spec:
      containers:
      - name: app
        image: production-app:v1.2.0
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3

Benefits:

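Zero downtime: With maxUnavailable: 0 and maxSurge: 1, each new pod must pass its readiness probe before an old one is terminated
Rollback capability: kubectl rollout undo reverts the Deployment to its previous revision
Health verification: Only pods that pass their readiness probe receive traffic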
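
Three further Deployment fields bear directly on these benefits. The sketch below adds them to a trimmed copy of the example; the specific values are assumptions to tune, not recommendations from the original configuration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-app
spec:
  replicas: 5
  minReadySeconds: 10            # assumed value: a new pod must stay ready this long before it counts as available
  revisionHistoryLimit: 5        # assumed value: number of old ReplicaSets kept for rollbacks
  progressDeadlineSeconds: 300   # assumed value: report the rollout as failed if it stalls this long
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: production-app
  template:
    metadata:
      labels:
        app: production-app
    spec:
      containers:
      - name: app
        image: production-app:v1.2.0
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
```

Setting revisionHistoryLimit to 0 removes the ability to roll back, so keep at least a few revisions.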

Key Takeaways

Immediate Actions You Can Take

Set resource limits on all your containers today
Implement health checks for every application
Run containers as non-root users
Set up basic monitoring with Prometheus
Configure HPA for your critical applications
Implement automated backups with Velero

Long-term Strategy

Gradually implement security policies
Build comprehensive monitoring dashboards
Test disaster recovery procedures regularly
Optimize resource usage based on monitoring data
Automate everything possible

Most importantly, keep it simple! Overcomplicating your infrastructure will result in unimaginable growing pains.

Remember: Production Kubernetes is a journey, not a destination. Start with these fundamentals and continuously improve based on your specific needs and challenges.

Need help implementing these best practices? Join us on Slack
