Scaling using Karpenter
Utilizing Spot Instances at a ~60% discount for scaled nodes
Deployed using a Helm chart (values sketch below)
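Below is a minimal sketch of the Helm values for the install, assuming the official Karpenter chart and IRSA for the controller role. The cluster name, queue name, account ID, and role name are placeholders, and the exact key layout varies between chart versions (older versions nest these keys under `settings.aws`):

```yaml
# values.yaml: sketch for the Karpenter Helm chart; keys vary by chart version.
serviceAccount:
  annotations:
    # IRSA: the cluster's OIDC provider lets the controller assume this role.
    # Account ID and role name below are placeholders.
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/KarpenterControllerRole
settings:
  clusterName: my-eks-cluster            # placeholder cluster name
  interruptionQueue: my-karpenter-queue  # optional SQS queue for Spot interruption events
controller:
  resources:
    requests:
      cpu: "1"
      memory: 1Gi
```

Applied with something like `helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --namespace karpenter --create-namespace -f values.yaml`.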
🔥 Problems before Karpenter:
~EC2 Auto Scaling Groups required predefining instance types and capacities
~Underutilized nodes due to static sizing and slow scaling decisions
~Provisioning delays → new nodes took time to launch and register
~High AWS costs due to over-provisioned capacity during traffic spikes
~Manual management of ASG launch templates for multiple workloads
~Node fragmentation → pods failed to schedule even with idle resources
~Inflexible scaling during rapid CI/CD deployments or workload bursts
~Hard to optimize for Spot + On-Demand mix without complexity
🛠️ Solutions using Karpenter:
✅ Installed Karpenter via Helm with IAM roles and OIDC trust on our EKS cluster
✅ Defined Provisioners based on workload needs (e.g., general, GPU, Spot); sketches follow this list
✅ Enabled automatic node provisioning with real-time pod-driven scaling
✅ Used consolidation to automatically downsize and bin-pack underutilized nodes
✅ Allowed Karpenter to choose optimal instance types and sizes across Spot and On-Demand
✅ Tuned ttlSecondsAfterEmpty for faster scale-down of idle nodes (second sketch below)
✅ Used taints, tolerations, and labels to isolate workloads by team/priority (third sketch below)
✅ Integrated with Prometheus and CloudWatch for cost visibility and scaling metrics
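As referenced in the list above, here is a sketch of a general-purpose Provisioner using the v1alpha5 API (newer Karpenter releases replace Provisioner with NodePool). The name, CPU limit, and providerRef are illustrative; the requirements leave Karpenter free to choose across Spot and On-Demand, and consolidation handles bin-packing:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: general
spec:
  # Prefer Spot but allow On-Demand fallback; Karpenter picks the cheapest fit.
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]
  # Instance types left unconstrained so Karpenter can bin-pack freely.
  limits:
    resources:
      cpu: "1000"        # hard cap on total provisioned vCPU (illustrative)
  consolidation:
    enabled: true        # actively repack and remove underutilized nodes
  providerRef:
    name: default        # AWSNodeTemplate with subnets/security groups (not shown)
```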
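In the v1alpha5 API, consolidation.enabled and ttlSecondsAfterEmpty are mutually exclusive on a single Provisioner, so the empty-node TTL lives on a separate pool; the `batch` Provisioner below is a hypothetical example:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: batch               # hypothetical pool for bursty CI/CD jobs
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
  ttlSecondsAfterEmpty: 30  # reclaim a node ~30s after its last pod leaves
  providerRef:
    name: default
```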
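Finally, a sketch of taint/label isolation: a hypothetical `gpu` Provisioner taints its nodes, and only pods carrying the matching toleration (and node selector) land on them. The instance types and `team` label are illustrative:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: gpu
spec:
  requirements:
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["g4dn.xlarge", "g4dn.2xlarge"]  # illustrative GPU instance types
  labels:
    team: ml                                   # hypothetical team label
  taints:
    - key: nvidia.com/gpu
      value: "true"
      effect: NoSchedule                       # keep non-GPU pods off these nodes
  providerRef:
    name: default
---
# Pod spec fragment: tolerate the taint and pin to the labeled nodes
tolerations:
  - key: nvidia.com/gpu
    operator: Equal
    value: "true"
    effect: NoSchedule
nodeSelector:
  team: ml
```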
Here's a summary of what we achieved:
✅ ~45% cost reduction by leveraging Spot instances dynamically
✅ Improved pod scheduling speed → workloads spin up in under a minute
✅ ~30% better resource utilization via bin-packing and consolidation
✅ Eliminated need for ASG complexity → no static instance mapping
✅ Highly responsive scaling based on actual pod needs
✅ Faster rollout of high-load jobs → CI/CD jobs no longer queue for resources
✅ Simplified ops → no more managing launch templates or capacity planning
✅ Flexible provisioning → different instance types per workload with minimal config