GitOps Meets Auto-Scaling: How ArgoCD and Karpenter Should Be Designed Together on EKS
ArgoCD and Karpenter are usually installed in the same cluster but designed in isolation. This post is about what happens when you stop treating them as two separate concerns and start designing them as one feedback loop. Mental model, anti-patterns, and real numbers from a 40-replica production workload.
Most EKS clusters I have audited in the last two years have ArgoCD installed and Karpenter installed. Almost none of them have those two things designed to work together. ArgoCD is set up by the platform team, Karpenter is set up later by the cost-optimization initiative, and the two run side by side without anyone owning the gap between them.
That gap is where you get the weird Sunday-night incidents. A new app rolls out, ArgoCD marks it as Healthy, Karpenter is busy consolidating the cluster, and 15% of your replicas end up Pending for 90 seconds. Or someone tunes consolidation aggressively to save cost, and the next deploy stalls because Karpenter killed the node that was about to receive the new pods.
This post is the mental model I wish someone had handed me four years ago when I started running both in production.
The setup that makes this post relevant
I run an EKS production cluster for a multi-tenant feedback platform. Until recently, the data plane was 23 m6a.large nodes, fixed size, managed node group, no autoscaler. The main API runs at 40 replicas, with HPA configured to scale between 4 and 150. The cluster runs 31 ArgoCD applications.
Two numbers tell most of the story. First, the deployment strategy was set to maxSurge: 0. That is not a typo. With a fixed node count and 40 large replicas, adding even one extra pod during a rollout would push CPU requests over what the cluster could schedule. So we terminated old pods first, then scheduled new ones, accepting a brief capacity dip on every deploy.
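For concreteness, the strategy block looked roughly like this (a sketch; the name and the maxUnavailable value are illustrative, not copied from the real chart):

```yaml
# Rollout strategy on a fixed-size cluster: never create an extra pod,
# terminate old pods first and backfill, accepting a brief capacity dip.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api            # placeholder name
spec:
  replicas: 40
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 10%   # assumption: some non-zero value so the rollout can progress
```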
Second, the HPA was configured up to 150 replicas, but the realistic ceiling on 23 nodes was around 60 replicas before pods started going Pending. The HPA was theoretically correct and operationally meaningless above the ceiling.
This is the world Karpenter is supposed to fix. But it does not fix it on its own. It fixes it in the context of how ArgoCD is reconciling intent against the cluster.
The mental model: intent, reality, capacity
ArgoCD is a controller for intent. It reads Git, computes the desired state, and writes it to the Kubernetes API. From ArgoCD’s perspective, an app is Healthy when the live state matches the manifest in Git.
Karpenter is a controller for capacity. It watches Pending pods, computes what nodes would satisfy them, and provisions those nodes. From Karpenter’s perspective, the cluster is healthy when no pod is unschedulable.
The Kubernetes scheduler is the reality in between. It binds pods to nodes when nodes exist that satisfy resource requests, taints, affinities, and topology spread. If no node fits, the pod sits Pending and someone needs to do something about it.
Most platform teams design ArgoCD against intent and Karpenter against capacity, separately. The reality layer is implicit. When it works, nobody notices. When it fails, you spend Sunday on Slack.
The shift is to treat the three as one loop:
```mermaid
flowchart LR
    GIT[("Git<br/>desired state")] -->|reconciles| ARGOCD["ArgoCD<br/>(intent)"]
    ARGOCD -->|writes manifests| K8S["Kubernetes API"]
    K8S -->|schedules pods| SCHED["Scheduler<br/>(reality)"]
    SCHED -->|pods Pending?| KARP["Karpenter<br/>(capacity)"]
    KARP -->|provisions nodes| NODES[("EC2 nodes")]
    NODES -->|capacity available| SCHED
    SCHED -->|pods Running| OBS["Healthy app"]
    OBS -.->|metrics| ARGOCD
```
Designing the platform means designing for what happens at every arrow.
Why “install both” is not a design
Both projects have excellent docs for installing themselves. Neither covers what happens when both are installed in the same cluster, because that is your problem.
Here are the questions that usually go unanswered:
- When ArgoCD bumps replica count from 40 to 80, who owns the latency until the new pods are Running?
- When Karpenter consolidates and removes a node, how does ArgoCD know whether to mark the app Degraded or Progressing?
- When the cluster needs an upgrade, do you drain through Karpenter disruption or through ArgoCD sync waves?
- When a NodePool runs out of capacity (spot interruption, account quota, AZ outage), what does the user-facing app see?
Each one of these is a design decision. If nobody made the decision, the system made it for you, badly.
The four interactions you have to design for
I have learned to think about the integration as four specific interactions. Each one needs an explicit answer.
1. Scale-up: how fast does intent become reality
When ArgoCD or HPA pushes more replicas, the new pods need somewhere to land. On a fixed cluster, “somewhere” exists or it does not. With Karpenter, “somewhere” is created on demand, and the question becomes: how long does that take, and what does the user see during the wait.
Karpenter typically takes 30 to 90 seconds to bring a new node online, depending on AMI, instance type, and cloud provider weather. During that window, pods are Pending. ArgoCD will mark the app as Progressing, not Degraded, which is fine. But your end user might see degraded latency if you scaled up because of a traffic spike.
The design decision: do you over-provision so scale-up is instant, or do you accept the lag and make sure your service can absorb it.
In my case, the answer was a small system NodePool with one or two warm nodes always available, plus a general NodePool that grows and shrinks aggressively. New replicas of the API land on warm capacity if it exists, on cold capacity otherwise. The HPA target is set conservatively (70% CPU) so we hit the threshold well before the saturation point, giving Karpenter time to react.
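A minimal sketch of that HPA, assuming the standard autoscaling/v2 API (the target name is a placeholder):

```yaml
# Scale between 4 and 150 replicas, firing at 70% CPU so Karpenter
# has its 30-90 seconds to provision before the service saturates.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 4
  maxReplicas: 150
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```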
The anti-pattern: setting HPA target too high (90%) and assuming Karpenter will catch up. By the time the HPA fires, your nodes are already in trouble, and Karpenter cannot rescue you in 60 seconds.
2. Scale-down: who decides a node dies
This is where most teams lose money or stability, often both.
Karpenter has consolidation. ArgoCD has automated sync. The Kubernetes scheduler has PodDisruptionBudgets. Cluster Autoscaler had --scale-down-delay-after-add. Karpenter v1 has disruption budgets. There are a lot of dials, and they fight each other if you do not set them on purpose.
The design decision: what is the contract between Karpenter and your applications about disruption.
The honest answer for most production workloads is “Karpenter is allowed to remove nodes, but only one per workload at a time, and only if the workload has a PDB”. If you skip the PDB, Karpenter will happily evict every replica of a workload simultaneously when consolidating, and you will discover the gap the hard way, mid-consolidation.
```yaml
# This is the minimum every production Deployment should have.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: api
```
For Karpenter, the matching contract:
```yaml
# Disruption budget at the NodePool level, not the workload level.
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    budgets:
      - nodes: "10%"
```
The combination of those two means: “Karpenter, you can disrupt up to 10% of nodes, and within that, no more than one replica of any given app at a time.” That is a design, not a coincidence.
```mermaid
flowchart TB
    KARP["Karpenter wants<br/>to consolidate"] --> NB{"NodePool budget<br/>nodes: 10%"}
    NB -->|over budget| WAIT["wait"]
    NB -->|within budget| DRAIN["pick a node,<br/>drain it"]
    DRAIN --> POD{"PDB on each<br/>app says..."}
    POD -->|"maxUnavailable: 1<br/>already at limit"| BLOCKED["pod stays,<br/>node not removed"]
    POD -->|"within PDB"| EVICT["evict pod,<br/>scheduler reschedules"]
    EVICT --> DONE["node removed,<br/>workload preserved"]
```
The anti-pattern: enabling consolidation with default settings and no PDBs. Karpenter is doing what you asked. The problem is what you asked for.
3. Sync waves vs node lifecycle
ArgoCD has sync waves: a way to declare ordering across resources during a sync. They are useful for installing CRDs before the things that use them, or for rolling out a database migration before the app that depends on it.
Karpenter has its own lifecycle: nodes come up, get tainted, get untainted, become drainable, become deleted. It does not coordinate with ArgoCD sync waves in any way.
This usually does not matter. It starts to matter in two scenarios.
First, when an ArgoCD app is responsible for installing a DaemonSet that has to be on every node before workload pods can run (think Cilium, the EBS CSI driver, OpenTelemetry collector). If ArgoCD has not synced the DaemonSet to a freshly provisioned Karpenter node before the workload lands, the workload pods race the DaemonSet: you get dropped traces or, worse, broken networking on that node for 30 seconds.
The fix is to keep critical DaemonSets out of ArgoCD entirely, install them via the cluster bootstrap (Helm + Terraform), and let ArgoCD handle the layer above. Karpenter brings up nodes, the bootstrap layer makes them functional, and ArgoCD lands the apps. Three layers, one direction.
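If the race still bites, Karpenter’s startupTaints are the complementary tool: the node comes up tainted, workload pods cannot land, and the critical DaemonSet removes the taint once it is ready. A sketch assuming Cilium, which manages exactly this taint key itself:

```yaml
# NodePool template fragment: fresh nodes stay off-limits to workload pods
# until the Cilium agent is running and removes the taint.
spec:
  template:
    spec:
      startupTaints:
        - key: node.cilium.io/agent-not-ready
          value: "true"
          effect: NoExecute
```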
Second, when you do a major cluster upgrade. ArgoCD has no idea you are upgrading the data plane. Karpenter is happy to roll nodes through expireAfter. If you trust the combination, you can run cluster upgrades without an ArgoCD sync, just by letting Karpenter recycle nodes onto a new AMI. That is a real and underused pattern. But it requires you to have done the design work above first.
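The field doing the work there is expireAfter, which in Karpenter v1 sits on the NodePool template. A sketch:

```yaml
# Any node older than 30 days is drained and replaced. Point the
# EC2NodeClass at a new AMI and recycling doubles as a data-plane upgrade.
spec:
  template:
    spec:
      expireAfter: 720h
```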
4. Failure modes: when one of them is wrong
Both controllers can be wrong. Designs should account for both being wrong simultaneously.
Karpenter can be wrong about capacity. Spot interruption rates spike, an account hits an instance quota, a NodePool’s requirements cannot be satisfied by any instance type in the AZs you allow. Pods sit Pending.
ArgoCD can be wrong about intent. A Helm template change pushes a manifest that requests more memory than any node in any NodePool can supply. ArgoCD reports the app as Progressing, never reaching Healthy, while the cluster has 12 nodes ready and idle.
The design question is: who notices, and how fast.
The cheap answer is to alert on kube_pod_status_phase{phase="Pending"} > 0 for more than three minutes, alongside karpenter_nodepool_usage and an ArgoCD-app-degraded alert. The smart answer is to add a synthetic test that runs end to end against the deployed app, so when ArgoCD says “Healthy” but the user-facing path is broken, your monitoring tells you.
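That Pending alert, written out as a Prometheus rule (threshold and labels are illustrative):

```yaml
# Fires when any pod has been Pending for more than three minutes,
# which means the intent-to-capacity loop is not closing.
groups:
  - name: scheduling
    rules:
      - alert: PodsPendingTooLong
        expr: kube_pod_status_phase{phase="Pending"} > 0
        for: 3m
        labels:
          severity: warning
        annotations:
          summary: "Pods stuck Pending: capacity or scheduling problem"
```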
ArgoCD’s Healthy check is a lifecycle check on the resources, not a behavioral check on the app. Karpenter’s “no Pending pods” check is a scheduling check, not a “the app is serving traffic” check. Neither one is a substitute for synthetic monitoring of the actual user journey.
Anti-patterns I keep seeing
Anti-pattern 1: NodePool per application
Looks tidy on a diagram. Falls apart in operation. Karpenter’s value comes from consolidation across heterogeneous workloads. If every app gets its own NodePool, you give up consolidation, you fragment your spot pool, and you turn a one-controller decision (Karpenter) into a many-team negotiation (who gets to grow when).
The exception is workloads with hard isolation requirements: GPU pools, ARM-only pools, compliance-tainted pools. There the NodePool boundary maps to a real boundary. Otherwise, one or two NodePools with smart requirements blocks beats N NodePools every time.
Anti-pattern 2: Karpenter as a faster Cluster Autoscaler
Cluster Autoscaler is reactive. So is Karpenter, but Karpenter also rebalances. If you treat it as a faster CA and ignore consolidation, you leave most of the savings on the table. Conversely, if you treat consolidation as free, you cause disruption that your application is not designed to tolerate.
The mental shift: Karpenter is a continuous optimizer, not an emergency responder. Design for the steady state, then make sure your apps survive the optimizer doing its job.
Anti-pattern 3: ArgoCD selfHeal with no guardrails on Karpenter resources
ArgoCD’s selfHeal: true is great for application drift. It is dangerous when it tries to reconcile the live state of Karpenter’s CRDs against what is in Git. NodeClaim and Node resources are managed by Karpenter, not by you. If ArgoCD ever tries to delete a NodeClaim because it is “out of sync” with Git, you have a misconfiguration: those resources should not be tracked by ArgoCD in the first place.
Either keep Karpenter’s runtime CRDs out of any ArgoCD app, or scope the ArgoCD app’s source to only the configuration resources (NodePool, EC2NodeClass) and never the dynamically created ones.
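One way to enforce the scoping, as a sketch, is ArgoCD’s resource.exclusions setting in argocd-cm, which stops ArgoCD from tracking the runtime kinds anywhere:

```yaml
# argocd-cm fragment: ArgoCD neither tracks nor prunes Karpenter's
# dynamically created resources.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  resource.exclusions: |
    - apiGroups:
        - karpenter.sh
      kinds:
        - NodeClaim
      clusters:
        - "*"
```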
Anti-pattern 4: HPA inside the same ArgoCD app as the Deployment
HPA writes to replicas on the Deployment. ArgoCD also writes to replicas on the Deployment if you put the value in Git. They will fight, and the symptom is your app oscillating between the manifested replica count and the HPA target every time ArgoCD does a sync.
```mermaid
flowchart LR
    GIT[("Git<br/>replicas: 4")] --> ARGO[ArgoCD]
    HPA["HPA<br/>target: CPU 70%"] -->|writes 12| DEP["Deployment<br/>replicas: ?"]
    ARGO -->|writes 4| DEP
    DEP --> KARP["Karpenter<br/>provisions for 12"]
    KARP -->|then| ARGO2["ArgoCD syncs again,<br/>writes 4"]
    ARGO2 -->|scales down to 4| DEP2["pods Pending<br/>5 min later"]
    DEP2 --> HPA2[HPA scales to 12 again]
    HPA2 --> LOOP["oscillation forever"]
```
The fix is well-known but I still see it skipped: omit replicas from the Helm template (or set it to null), and let HPA own it entirely. ArgoCD has a sync-options: ServerSideApply=true option and per-resource ignoreDifferences that can help. In practice, removing the field from the chart is cleaner.
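If removing the field is genuinely impossible, the ArgoCD-side workaround is a sketch like this on the Application (both fields are real ArgoCD options; whether you need them depends on the chart):

```yaml
# Ignore the replicas field on Deployments so HPA-driven changes never
# register as drift, and tell sync to respect the ignore rule.
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
  syncPolicy:
    syncOptions:
      - RespectIgnoreDifferences=true
```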
Anti-pattern 5: testing the integration only in staging with a static cluster
Staging clusters are usually small enough that Karpenter never has an interesting decision to make. You will not see consolidation pain, you will not see spot interruption pain, you will not see scale-up latency under load. Production is where these emerge.
The fix is twofold. First, make staging large enough or busy enough that Karpenter actually has to think (or use a chaos tool to force decisions). Second, accept that some integration bugs will only surface in production, and have a rollback plan that does not require both controllers to be healthy.
What I would design today, starting from scratch
If I were building this platform from zero in 2026, here is the shape I would aim for.
A small, fixed system NodePool with two m7a.large nodes, dedicated to ArgoCD, the metrics stack, and any DaemonSet that has to exist before workloads can run. These nodes do not get consolidated. They live forever. They are tainted to keep workload pods off.
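The two properties that matter for that pool, sketched against Karpenter v1 (the taint key is a placeholder; any key your system pods tolerate works):

```yaml
# System NodePool fragment: tainted so only tolerating system pods schedule here,
# and a zero-node disruption budget so Karpenter never consolidates it.
spec:
  template:
    spec:
      taints:
        - key: dedicated        # placeholder taint key
          value: system
          effect: NoSchedule
  disruption:
    budgets:
      - nodes: "0"
```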
A general NodePool that runs everything else, with consolidation enabled, mixed on-demand and spot, multi-arch (Graviton + x86 if your images support it), allowed to grow to whatever the application’s HPA could ever require plus 20%.
A batch NodePool, only if you have batch workloads that are spot-tolerant and time-bounded. Otherwise skip it; it is one more thing to maintain.
```mermaid
flowchart TB
    subgraph SYS["system NodePool: fixed, consolidation OFF"]
        direction LR
        S1["node-1<br/>m7a.large"]
        S2["node-2<br/>m7a.large"]
        S1 --- S2
        SP1[ArgoCD]
        SP2[Prometheus]
        SP3[DaemonSets]
    end
    subgraph GEN["general NodePool: elastic, consolidation ON"]
        direction LR
        G1["Karpenter<br/>provisions"]
        G2["mixed on-demand<br/>+ spot"]
        G3["multi-arch<br/>Graviton + x86"]
    end
    APPS["application pods<br/>api, reporter,<br/>ingest, frontend"]
    GEN --> APPS
    SYS -.->|tainted, no workload pods| APPS
```
The matching NodePool YAML for the general pool. Note the disruption.budgets block, which is what makes consolidation safe in production.
```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand", "spot"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    budgets:
      - nodes: "10%"
```
ArgoCD installed via Helm + Terraform at bootstrap, not via ArgoCD itself. The 31-app pattern of “everything is an Application” only works once ArgoCD already exists. Pick what comes before.
Apps in ArgoCD with selfHeal: true, prune: true, and a revisionHistoryLimit that matches your rollback window. PDBs on every Deployment with replicas greater than one. HPA where it makes sense, with replicas removed from the manifest.
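As an Application spec, that paragraph looks roughly like this (repo URL, path, and names are placeholders):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/platform/deploy.git   # placeholder
    path: apps/api
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: api
  revisionHistoryLimit: 10       # match this to your rollback window
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```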
Alerts on Pending pods, on NodePool quota exhaustion, on ArgoCD app degradation, and on the user-facing synthetic. Each one tells you a different thing about whether the loop is closing.
That is the platform. The two controllers do not coordinate directly. They coordinate through the cluster they share, and through the design choices you made when you installed both.
What I would tell my past self
You do not need to understand ArgoCD and Karpenter to use them. You need to understand what they assume about each other. ArgoCD assumes the cluster has capacity for what Git says. Karpenter assumes any pod that is Pending should become a node. Neither one knows the other exists.
Your job, as the platform engineer, is to be the part of the system that knows. Design for the loop, not for the components. The components are well-documented. The loop is yours to draw.
