Kubernetes Specialist

Deploy and manage Kubernetes workloads: manifests, RBAC, Helm charts, service mesh, GitOps, and troubleshooting.
# Cost Optimization

---

## Resource Right-Sizing

### Analyze Current Usage

```bash
# View actual CPU and memory usage (requires metrics-server)
kubectl top pods -n production

# List configured requests and limits to compare against actual usage
kubectl get pods -n production -o custom-columns=\
"NAME:.metadata.name,\
CPU_REQ:.spec.containers[*].resources.requests.cpu,\
CPU_LIM:.spec.containers[*].resources.limits.cpu,\
MEM_REQ:.spec.containers[*].resources.requests.memory,\
MEM_LIM:.spec.containers[*].resources.limits.memory"

# Get VPA recommendations (if VPA installed)
kubectl get vpa -n production -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{.status.recommendation.containerRecommendations[*]}{"\n\n"}{end}'
```

### Right-Sized Resource Spec

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: production
spec:
  template:
    spec:
      containers:
        - name: myapp
          resources:
            requests:
              # Set to average usage + 10-20% buffer
              cpu: 100m
              memory: 128Mi
            limits:
              # CPU: 2-4x requests for burst capacity
              # Memory: 1.5-2x requests (OOM prevention)
              cpu: 500m
              memory: 256Mi
```

## Vertical Pod Autoscaler (VPA)

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    # Off - only provide recommendations
    # Initial - apply only on pod creation
    # Auto - apply on pod creation and during runtime (evicts and recreates pods)
    # Note: avoid Auto or Initial on a workload whose HPA scales on CPU or
    # memory utilization; the two autoscalers will work against each other
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: myapp
        minAllowed:
          cpu: 50m
          memory: 64Mi
        maxAllowed:
          cpu: 2000m
          memory: 2Gi
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsAndLimits
```

### VPA Recommendation Only

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa-recommender
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Off"
```

## Horizontal Pod Autoscaler (HPA) Tuning

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    # CPU-based scaling (utilization is measured against requests)
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

    # Memory-based scaling
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

    # Custom metrics (e.g., requests per second);
    # requires a custom metrics adapter such as prometheus-adapter
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: 100

  behavior:
    scaleDown:
      # Wait 5 minutes of sustained low load before removing pods
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
        - type: Pods
          value: 2
          periodSeconds: 60
      # Min: apply whichever policy removes fewer pods
      selectPolicy: Min
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      # Max: apply whichever policy adds more pods
      selectPolicy: Max
```

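## Spot/Preemptible Instances

### Node Pool with Spot Instances (GKE)

```yaml
# Illustrative only: node pools are not a built-in Kubernetes API, so this
# manifest cannot be applied as-is. Create the pool with gcloud, Terraform,
# or Config Connector and map these fields to that tool's schema.
apiVersion: container.google.com/v1
kind: NodePool
metadata:
  name: spot-pool
spec:
  config:
    machineType: e2-standard-4
    # Spot VMs carry the cloud.google.com/gke-spot label; legacy
    # preemptible VMs carry cloud.google.com/gke-preemptible instead
    spot: true
    taints:
      - key: cloud.google.com/gke-spot
        value: "true"
        effect: NoSchedule
  autoscaling:
    enabled: true
    minNodeCount: 0
    maxNodeCount: 10
```
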
### Workload Tolerating Spot Nodes

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-processor
  namespace: production
spec:
  template:
    spec:
      tolerations:
        - key: cloud.google.com/gke-spot
          operator: Equal
          value: "true"
          effect: NoSchedule
        - key: kubernetes.azure.com/scalesetpriority
          operator: Equal
          value: spot
          effect: NoSchedule
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: cloud.google.com/gke-spot
                    operator: In
                    values: ["true"]
      containers:
        - name: processor
          # ... container spec
```

### Pod Disruption Budget for Spot

```yaml
# A PDB limits voluntary evictions (node drains, autoscaler scale-down).
# It cannot block the cloud provider from reclaiming a spot node, so keep
# enough replicas spread across nodes as well.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
  namespace: production
spec:
  minAvailable: 2
  # OR maxUnavailable: 1
  selector:
    matchLabels:
      app: myapp
```

## Namespace Quotas

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    persistentvolumeclaims: "10"
    requests.storage: 500Gi
    pods: "50"
    services: "20"
    secrets: "50"
    configmaps: "50"
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-object-counts
  namespace: production
spec:
  hard:
    count/deployments.apps: "20"
    count/statefulsets.apps: "5"
    count/jobs.batch: "10"
```

## LimitRange

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: production-limits
  namespace: production
spec:
  limits:
    # Default limits for containers
    - type: Container
      default:
        cpu: 500m
        memory: 256Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      min:
        cpu: 50m
        memory: 64Mi
      max:
        cpu: 4000m
        memory: 8Gi

    # Pod-level limits
    - type: Pod
      max:
        cpu: 8000m
        memory: 16Gi

    # PVC limits
    - type: PersistentVolumeClaim
      min:
        storage: 1Gi
      max:
        storage: 100Gi
```

## Cluster Autoscaler Configuration

```yaml
# Cluster Autoscaler reads these settings from command-line flags on its
# own container, not from a ConfigMap. Fragment of its Deployment; the
# image and cloud-provider flags are omitted.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: cluster-autoscaler
          command:
            - ./cluster-autoscaler
            - --scale-down-delay-after-add=10m
            - --scale-down-delay-after-delete=0s
            - --scale-down-delay-after-failure=3m
            - --scale-down-unneeded-time=10m
            - --scale-down-unready-time=20m
            - --scale-down-utilization-threshold=0.5
            - --skip-nodes-with-local-storage=false
            - --skip-nodes-with-system-pods=true
            - --balance-similar-node-groups=true
            - --expander=least-waste
```

## Cost Monitoring

### Kubecost Deployment

```bash
# Install Kubecost
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace \
  --set kubecostToken="YOUR_TOKEN"
```

### Prometheus Cost Metrics

```yaml
# Pod cost label for attribution
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    cost-center: engineering
    team: platform
    environment: production
spec:
  template:
    metadata:
      labels:
        cost-center: engineering
        team: platform
```

## Scheduled Scaling

```yaml
# Scale down dev environments overnight
# Schedules use the controller's time zone (usually UTC) unless
# spec.timeZone is set
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-down-dev
  namespace: development
spec:
  schedule: "0 20 * * 1-5"  # 8 PM Mon-Fri
  jobTemplate:
    spec:
      template:
        spec:
          # The scaler ServiceAccount needs RBAC permission to scale Deployments
          serviceAccountName: scaler
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command:
                - /bin/sh
                - -c
                - |
                  kubectl scale deployment --all --replicas=0 -n development
          restartPolicy: OnFailure
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-up-dev
  namespace: development
spec:
  schedule: "0 8 * * 1-5"  # 8 AM Mon-Fri
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: scaler
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command:
                - /bin/sh
                - -c
                - |
                  kubectl scale deployment frontend --replicas=2 -n development
                  kubectl scale deployment backend --replicas=2 -n development
          restartPolicy: OnFailure
```

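The CronJobs in this section run as a `scaler` ServiceAccount that this file does not define. A minimal sketch of the RBAC they would likely need, assuming the account lives in the `development` namespace; the Role and RoleBinding name `deployment-scaler` is hypothetical:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: scaler
  namespace: development
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployment-scaler  # hypothetical name
  namespace: development
rules:
  # Look up the Deployments to scale (needed for --all)
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list"]
  # Change replica counts through the scale subresource
  - apiGroups: ["apps"]
    resources: ["deployments/scale"]
    verbs: ["get", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployment-scaler  # hypothetical name
  namespace: development
subjects:
  - kind: ServiceAccount
    name: scaler
    namespace: development
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: deployment-scaler
```

A namespaced Role rather than a ClusterRole keeps the scaler from touching workloads outside `development`.
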
## Priority Classes

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "Critical production workloads"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 100
globalDefault: false
# Never: pods in this class do not preempt other pods;
# higher-priority pods can still preempt them
preemptionPolicy: Never
description: "Batch jobs that can be preempted"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-job
spec:
  template:
    spec:
      priorityClassName: low-priority
      # ...
```

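Priority classes can also be tied to quotas, so that low-priority batch work cannot consume more than a fixed share of a namespace. A sketch using a ResourceQuota scoped to the `low-priority` class defined in this section, assuming the `production` namespace; the quota name and the numbers are illustrative, not recommendations:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: low-priority-quota  # hypothetical name
  namespace: production
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    pods: "10"
  # Count only pods whose priorityClassName is low-priority
  scopeSelector:
    matchExpressions:
      - scopeName: PriorityClass
        operator: In
        values: ["low-priority"]
```
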
## Best Practices

1. **Set resource requests** on all containers (enables efficient scheduling)
2. **Use VPA recommendations** to right-size workloads
3. **Tune HPA stabilization** to prevent thrashing
4. **Leverage spot instances** for fault-tolerant workloads
5. **Implement PDBs** to maintain availability during disruptions
6. **Set namespace quotas** to prevent resource hogging
7. **Use LimitRanges** to enforce sensible defaults
8. **Label resources** for cost attribution
9. **Schedule dev environments** to scale down off-hours
10. **Monitor with Kubecost** or cloud cost tools
11. **Use priority classes** to ensure critical workloads run
12. **Review unused resources** regularly (idle deployments, orphaned PVCs)