Deploy and manage Kubernetes workloads: manifests, RBAC, Helm charts, service mesh, GitOps, and troubleshooting.
references/cost-optimization.md
# Cost Optimization

---

## Resource Right-Sizing

### Analyze Current Usage

```bash
# View actual CPU and memory usage (requires metrics-server)
kubectl top pods -n production

# View configured requests and limits for comparison with actual usage
kubectl get pods -n production -o custom-columns=\
"NAME:.metadata.name,\
CPU_REQ:.spec.containers[*].resources.requests.cpu,\
CPU_LIM:.spec.containers[*].resources.limits.cpu,\
MEM_REQ:.spec.containers[*].resources.requests.memory,\
MEM_LIM:.spec.containers[*].resources.limits.memory"

# Get VPA recommendations (if VPA installed)
kubectl get vpa -n production -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{.status.recommendation.containerRecommendations[*]}{"\n\n"}{end}'
```

### Right-Sized Resource Spec

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
spec:
  template:
    spec:
      containers:
      - name: myapp
        resources:
          requests:
            # Set to average usage + 10-20% buffer
            cpu: 100m
            memory: 128Mi
          limits:
            # CPU: 2-4x requests for burst capacity
            # Memory: 1.5-2x requests (OOM prevention)
            cpu: 500m
            memory: 256Mi
```

## Vertical Pod Autoscaler (VPA)

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    # Off - only provide recommendations
    # Initial - apply only on pod creation
    # Auto - apply on pod creation and during runtime (with restart)
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: myapp
      minAllowed:
        cpu: 50m
        memory: 64Mi
      maxAllowed:
        cpu: 2000m
        memory: 2Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsAndLimits
```

### VPA Recommendation Only

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa-recommender
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Off"
```

## Horizontal Pod Autoscaler (HPA) Tuning

```yaml
# Note: avoid pairing CPU/memory-based HPA scaling with a VPA in Auto mode
# on the same workload; the two controllers will work against each other.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
  # CPU-based scaling
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

  # Memory-based scaling
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

  # Custom metrics (e.g., requests per second)
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 100

  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
      - type: Pods
        value: 2
        periodSeconds: 60
      selectPolicy: Min
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max
```

## Spot/Preemptible Instances

### Node Pool with Spot Instances (GKE)

```yaml
# Illustrative schema only: this is not a built-in Kubernetes API.
# Adapt the fields to your provisioning tool (gcloud, Terraform, Config Connector).
apiVersion: container.google.com/v1
kind: NodePool
metadata:
  name: spot-pool
spec:
  config:
    machineType: e2-standard-4
    # Spot VMs carry the cloud.google.com/gke-spot=true node label used below
    spot: true
    taints:
    - key: cloud.google.com/gke-spot
      value: "true"
      effect: NoSchedule
  autoscaling:
    enabled: true
    minNodeCount: 0
    maxNodeCount: 10
```

### Workload Tolerating Spot Nodes

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
  namespace: production
spec:
  template:
    spec:
      tolerations:
      - key: cloud.google.com/gke-spot
        operator: Equal
        value: "true"
        effect: NoSchedule
      - key: kubernetes.azure.com/scalesetpriority
        operator: Equal
        value: spot
        effect: NoSchedule
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: cloud.google.com/gke-spot
                operator: In
                values: ["true"]
      containers:
      - name: processor
        # ... container spec
```

### Pod Disruption Budget for Spot

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
  namespace: production
spec:
  minAvailable: 2
  # OR maxUnavailable: 1
  selector:
    matchLabels:
      app: myapp
```

## Namespace Quotas

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    persistentvolumeclaims: "10"
    requests.storage: 500Gi
    pods: "50"
    services: "20"
    secrets: "50"
    configmaps: "50"
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-object-counts
  namespace: production
spec:
  hard:
    count/deployments.apps: "20"
    count/statefulsets.apps: "5"
    count/jobs.batch: "10"
```

## LimitRange

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: production-limits
  namespace: production
spec:
  limits:
  # Default limits for containers
  - type: Container
    default:
      cpu: 500m
      memory: 256Mi
    defaultRequest:
      cpu: 100m
      memory: 128Mi
    min:
      cpu: 50m
      memory: 64Mi
    max:
      cpu: 4000m
      memory: 8Gi

  # Pod-level limits
  - type: Pod
    max:
      cpu: 8000m
      memory: 16Gi

  # PVC limits
  - type: PersistentVolumeClaim
    min:
      storage: 1Gi
    max:
      storage: 100Gi
```

## Cluster Autoscaler Configuration

```yaml
# Note: the upstream Cluster Autoscaler reads these settings as command-line
# flags (e.g. --scale-down-unneeded-time=10m, --expander=least-waste).
# Treat this ConfigMap as an illustrative summary of the values to set.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-config
  namespace: kube-system
data:
  config: |
    {
      "scaleDownDelayAfterAdd": "10m",
      "scaleDownDelayAfterDelete": "0s",
      "scaleDownDelayAfterFailure": "3m",
      "scaleDownUnneededTime": "10m",
      "scaleDownUnreadyTime": "20m",
      "scaleDownUtilizationThreshold": "0.5",
      "skipNodesWithLocalStorage": "false",
      "skipNodesWithSystemPods": "true",
      "balanceSimilarNodeGroups": "true",
      "expander": "least-waste"
    }
```

## Cost Monitoring

### Kubecost Deployment

```bash
# Install Kubecost
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace \
  --set kubecostToken="YOUR_TOKEN"
```

### Prometheus Cost Metrics

```yaml
# Pod cost label for attribution
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    cost-center: engineering
    team: platform
    environment: production
spec:
  template:
    metadata:
      labels:
        cost-center: engineering
        team: platform
```

## Scheduled Scaling

```yaml
# Scale down dev environments overnight
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-down-dev
  namespace: development
spec:
  schedule: "0 20 * * 1-5"  # 8 PM Mon-Fri
  jobTemplate:
    spec:
      template:
        spec:
          # This ServiceAccount must be allowed to scale Deployments in the namespace
          serviceAccountName: scaler
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
            command:
            - /bin/sh
            - -c
            - |
              kubectl scale deployment --all --replicas=0 -n development
          restartPolicy: OnFailure
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-up-dev
  namespace: development
spec:
  schedule: "0 8 * * 1-5"  # 8 AM Mon-Fri
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: scaler
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
            command:
            - /bin/sh
            - -c
            - |
              kubectl scale deployment frontend --replicas=2 -n development
              kubectl scale deployment backend --replicas=2 -n development
          restartPolicy: OnFailure
```

## Priority Classes

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "Critical production workloads"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 100
globalDefault: false
preemptionPolicy: Never
description: "Batch jobs that can be preempted"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-job
spec:
  template:
    spec:
      priorityClassName: low-priority
      # ...
```

## Best Practices

1. **Set resource requests** on all containers (enables efficient scheduling)
2. **Use VPA recommendations** to right-size workloads
3. **Tune HPA stabilization** to prevent thrashing
4. **Leverage spot instances** for fault-tolerant workloads
5. **Implement PDBs** to maintain availability during disruptions
6. **Set namespace quotas** to prevent resource hogging
7. **Use LimitRanges** to enforce sensible defaults
8. **Label resources** for cost attribution
9. **Schedule dev environments** to scale down off-hours
10. **Monitor with Kubecost** or cloud cost tools
11. **Use priority classes** to ensure critical workloads run
12. **Review unused resources** regularly (idle deployments, orphaned PVCs)
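
The rules of thumb in the Right-Sized Resource Spec comments (requests at average usage plus a 10-20% buffer, CPU limits at 2-4x requests, memory limits at 1.5-2x requests) can be turned into a quick calculation. The following is a minimal sketch and not part of the original reference: the function name `suggest_resources` and its arguments are hypothetical, and it assumes the low end of each limit range (2x for CPU, 1.5x for memory) with values rounded up.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive starting requests/limits from observed average usage.
# Usage: suggest_resources <avg_cpu_millicores> <avg_mem_mib> [buffer_percent]
# Assumptions: buffer defaults to 20%; CPU limit = 2x request; memory limit = 1.5x request.
suggest_resources() {
  local avg_cpu=$1 avg_mem=$2 buffer=${3:-20}

  # Request = average usage plus buffer, rounded up (integer arithmetic).
  local cpu_req=$(( (avg_cpu * (100 + buffer) + 99) / 100 ))
  local mem_req=$(( (avg_mem * (100 + buffer) + 99) / 100 ))

  # Limits from the low end of the suggested ranges; memory is rounded up.
  local cpu_lim=$(( cpu_req * 2 ))
  local mem_lim=$(( (mem_req * 3 + 1) / 2 ))

  printf 'requests: cpu=%dm memory=%dMi\n' "$cpu_req" "$mem_req"
  printf 'limits: cpu=%dm memory=%dMi\n' "$cpu_lim" "$mem_lim"
}
```

For example, `suggest_resources 80 100 20` (80m CPU and 100Mi memory observed via `kubectl top pods`) suggests requests of 96m / 120Mi and limits of 192m / 180Mi. Treat the result as a starting point and compare it against VPA recommendations before applying it.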