Node memory hog
Node memory hog is a Kubernetes node-level chaos fault that consumes a configurable share of a target node's memory for a configurable duration. As free memory drops, the kubelet's eviction thresholds trip and pods are evicted in QoS order: BestEffort first, then Burstable (the pods that exceed their request the most), then Guaranteed only as a last resort.
Use this fault to simulate a memory-leak neighbor: a runaway batch process, a JVM heap that grew past its node-level budget, or a container that ignores its memory limit and consumes whatever the kernel gives it.
If you have not configured the chaos infrastructure yet, go to Quickstart to install the chaos infrastructure and run an experiment end to end.
Use cases
Run this fault when you want to answer concrete questions like:
- QoS-based eviction order: When the kubelet starts evicting, do Guaranteed workloads stay protected while BestEffort and over-quota Burstable pods are reclaimed first?
- Pod priority and preemption: Does pod priority correctly influence which workloads survive an eviction sweep?
- HPA and VPA reactions: When the memory footprint of a service grows because the node is under pressure, do autoscaling controllers add capacity in time?
- OOM kill behavior at the container level: Does the application restart cleanly after an OOM kill, or does it leave behind leaked state (open file handles, half-written files, orphaned children)?
- Restart-loop and crash-loop containment: Does a single OOM kill stay contained, or does it cascade into a CrashLoopBackOff because the new pod immediately hits the same constraint?
- Memory-leak detection in monitoring: Do your alerts fire on the right signal (working set growth, eviction rate, OOM count), and at the right severity?
Prerequisites
- Kubernetes version: 1.21 or later. Go to What's supported to confirm distribution support.
- Privileged pods allowed: The cluster lets you schedule privileged pods in the chaos namespace. The fault allocates memory against the host.
- Container runtime access: The chaos infrastructure can reach the container runtime on the target nodes. The default containerd socket path is mounted automatically.
- Node readiness: Target nodes are in Ready state before the fault is launched. The fault reports a precheck failure otherwise.
- Workloads have memory requests and limits: Without memory requests, the kubelet cannot differentiate pods by QoS class. Every pod is treated as BestEffort, and the experiment observes generic eviction noise rather than meaningful prioritization.
- Chaos infrastructure isolation: The target nodes are not single points of failure for the chaos infrastructure itself. If chaos control-plane pods are scheduled on the saturated node and end up evicted, the experiment loses observability.
Supported environments
| Platform | Support status |
|---|---|
| Amazon EKS | Supported |
| Azure AKS | Supported |
| Google GKE | Supported |
| Red Hat OpenShift | Supported |
| Rancher | Supported |
| VMware Tanzu | Supported |
| Self-managed Kubernetes (CNCF-certified) | Supported |
| GKE Autopilot | Not supported (Autopilot does not expose the node-level access this fault requires; only Node Network Loss and Node Network Latency are allowlisted, see Chaos on GKE Autopilot) |
Permissions required
The fault runs under the chaos infrastructure's service account. The account must be able to perform the following operations against the target cluster.
| Resource (apiGroup) | Verbs | Why it is needed |
|---|---|---|
| pods ("") | get, list, create, delete, deletecollection, patch, update | Run the chaos pod that injects memory pressure on the target node |
| pods/log ("") | get, list, watch | Stream chaos pod logs for status and debugging |
| events ("") | get, list, create, patch, update | Record fault progress and any pod evictions as Kubernetes events |
| nodes ("") | get, list | Discover target nodes and validate selectors |
| jobs (batch) | get, list, create, delete, deletecollection | Run the chaos job that drives the fault |
The default Harness chaos infrastructure service account already includes these permissions. You only need to extend it if you are running with a restricted scope.
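If you do need to extend a restricted service account, the table above can be expressed as RBAC rules. The following is a minimal sketch, not the manifest that Harness ships: it builds a ClusterRole as a Python dictionary and prints it as JSON, which kubectl apply -f accepts. The role name node-memory-hog-chaos is a hypothetical placeholder, and binding the role to your service account is left out.

```python
import json

# Rules copied from the permissions table. Nodes are cluster-scoped,
# so this sketch assumes a ClusterRole rather than a namespaced Role.
RULES = [
    {"apiGroups": [""], "resources": ["pods"],
     "verbs": ["get", "list", "create", "delete", "deletecollection", "patch", "update"]},
    {"apiGroups": [""], "resources": ["pods/log"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": [""], "resources": ["events"],
     "verbs": ["get", "list", "create", "patch", "update"]},
    {"apiGroups": [""], "resources": ["nodes"],
     "verbs": ["get", "list"]},
    {"apiGroups": ["batch"], "resources": ["jobs"],
     "verbs": ["get", "list", "create", "delete", "deletecollection"]},
]


def cluster_role(name: str) -> dict:
    """Return a ClusterRole manifest carrying the rules above."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},  # hypothetical name, choose your own
        "rules": RULES,
    }


# Print the manifest so it can be piped to: kubectl apply -f -
print(json.dumps(cluster_role("node-memory-hog-chaos"), indent=2))
```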
Fault tunables
Configure the following fault parameters when you add Node memory hog to an experiment in Chaos Studio. Defaults are shown for reference.
Chaos parameters
| Tunable | Description | Default |
|---|---|---|
| MEMORY_CONSUMPTION_PERCENTAGE | Memory to consume as a percentage of the node's total capacity. When non-zero, it takes precedence over MEMORY_CONSUMPTION_MEBIBYTES. | 0 |
| MEMORY_CONSUMPTION_MEBIBYTES | Memory to consume as an absolute value in MiB. Used when MEMORY_CONSUMPTION_PERCENTAGE is 0 (the default). | 500 |
| NUMBER_OF_WORKERS | Number of VM workers used to allocate memory. More workers reach the target faster but use more CPU. | 1 |
| TOTAL_CHAOS_DURATION | Duration of the fault in seconds. | 30 |
Targeting
| Tunable | Description | Default |
|---|---|---|
| TARGET_NODES | Comma-separated list of node names to target. Go to target multiple nodes to read more. | "" |
| NODE_LABEL | Label selector for choosing target nodes. Go to target nodes with labels to read more. | "" |
| NODES_AFFECTED_PERCENTAGE | Percentage of nodes (matching the selector) to target. 0 means one node. | 0 |
| SEQUENCE | When multiple nodes are targeted, inject parallel (all at once) or serial (one after another). | parallel |
Runtime and helper
| Tunable | Description | Default |
|---|---|---|
| RAMP_TIME | Wait period in seconds before and after the fault. Go to ramp time to read how it is applied. | 0 |
Tunables that apply to every chaos fault are documented in common tunables for all faults.
Only one of MEMORY_CONSUMPTION_PERCENTAGE and MEMORY_CONSUMPTION_MEBIBYTES takes effect in a given run. With the defaults, the fault consumes 500 MiB, because a MEMORY_CONSUMPTION_PERCENTAGE of 0 defers to the absolute value. To consume a percentage of node memory instead, set MEMORY_CONSUMPTION_PERCENTAGE to a non-zero value. Start at 30% to 50% on production-shaped nodes, because higher values cross kubelet eviction thresholds quickly.
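The precedence between the two tunables can be sketched as a small function. This is an illustration of the rule described above, not the fault's actual source: the function name is hypothetical, and rounding the percentage down to whole MiB is an assumption.

```python
def effective_consumption_mib(node_capacity_mib: int,
                              percentage: int = 0,
                              mebibytes: int = 500) -> int:
    """Return the amount of memory (MiB) the fault would target.

    Defaults mirror the tunables table: percentage=0, mebibytes=500.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percentage > 0:
        # A non-zero percentage takes precedence over the absolute value.
        # Assumption: the result is rounded down to whole MiB.
        return node_capacity_mib * percentage // 100
    # Percentage is 0, so the absolute MiB value applies.
    return mebibytes


# Defaults on a 16 GiB node: the absolute value wins.
print(effective_consumption_mib(16384))                  # 500
# 30% of a 16 GiB node: the MiB value is ignored.
print(effective_consumption_mib(16384, percentage=30))   # 4915
```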
Fault execution in brief
The fault allocates the configured amount of the target node's memory (a percentage of capacity or an absolute MiB value) for the configured duration. Workloads sharing the node are evicted by the kubelet once its memory-pressure threshold is crossed, and containers that exceed their own memory limit are OOM-killed by the kernel.
The kubelet eviction manager ranks pods for eviction in this order:
| Eviction order | What is reclaimed |
|---|---|
| BestEffort pods (no memory request or limit) | Reclaimed first. Cheapest cost to the cluster. |
| Burstable pods using more than their memory request | Reclaimed next, ranked by how far over request they are. |
| Guaranteed pods (request = limit for both CPU and memory) | Reclaimed last, only when the kernel itself is about to OOM. |
| OOM killer | Fires inside individual containers that exceed their per-container memory limit. |
The kubelet emits Evicted events naming the evicted pod and the eviction signal. Watch for them with kubectl get events --field-selector reason=Evicted.
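To predict which pods an experiment is likely to reclaim first, the table above can be modeled as a sort. This is a deliberately simplified sketch of the ordering as described here, not the kubelet's algorithm: the real eviction manager also weighs pod priority, and the PodMemory type and sample pods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PodMemory:
    name: str
    qos: str          # "BestEffort", "Burstable", or "Guaranteed"
    request_mib: int  # 0 for BestEffort pods
    usage_mib: int


def eviction_rank(pod: PodMemory) -> tuple:
    """Sort key: lower tuples are reclaimed earlier."""
    over_request = pod.usage_mib - pod.request_mib
    if pod.qos == "BestEffort":
        tier = 0   # reclaimed first
    elif pod.qos == "Burstable" and over_request > 0:
        tier = 1   # reclaimed next
    else:
        tier = 2   # Guaranteed, or Burstable within its request
    # Within a tier, the pod furthest over its request goes first.
    return (tier, -over_request)


def eviction_order(pods: list) -> list:
    return [p.name for p in sorted(pods, key=eviction_rank)]


pods = [
    PodMemory("db", "Guaranteed", 1024, 900),
    PodMemory("api", "Burstable", 256, 400),
    PodMemory("batch", "BestEffort", 0, 300),
    PodMemory("cache", "Burstable", 512, 1200),
]
print(eviction_order(pods))  # ['batch', 'cache', 'api', 'db']
```

Comparing this predicted order with the Evicted events from a real run is a quick way to spot pods whose QoS class is not what you intended.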
Expected behavior during fault execution
- Memory consumed by the fault is added on top of whatever the node was already using. A 30% setting on a node already at 60% utilization pushes the node to 90% and likely trips kubelet eviction thresholds.
- The kubelet evicts whole pods, not individual containers. Once a pod is evicted, the scheduler tries to place it on another node with capacity.
- Application containers that hit their own memory limit are OOM-killed by the kernel and counted in kube_pod_container_status_restarts_total. This is independent of node eviction.
- If NUMBER_OF_WORKERS is high, memory is allocated faster but consumes more CPU. For most experiments, the default 1 is enough; raise it only if you want to reach the target memory consumption in the first few seconds.
- The node almost never flips to NotReady from memory pressure alone. Eviction is the expected outcome, not partition.
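The additive behavior described in the first point above is simple arithmetic, and it can help to run it before picking a percentage. The sketch below assumes utilization is capped at 100% and rounds remaining memory down to whole MiB; the function name and the 16 GiB node are illustrative only.

```python
def projected_memory(node_capacity_mib: int,
                     baseline_used_pct: int,
                     hog_pct: int) -> tuple:
    """Return (projected utilization %, projected available MiB).

    The fault's consumption is added on top of what the node
    already uses, capped at 100% of capacity.
    """
    total_pct = min(baseline_used_pct + hog_pct, 100)
    available_mib = node_capacity_mib * (100 - total_pct) // 100
    return total_pct, available_mib


# The example from the list above: 60% baseline plus a 30% hog.
used_pct, available = projected_memory(16384, 60, 30)
print(used_pct, available)  # 90 1638
```

Compare the projected available memory with the eviction threshold configured on your nodes to judge whether the run is likely to trigger evictions.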
This fault tests how the cluster handles a memory-saturated node. To test how a single pod handles hitting its container memory limit specifically, use Pod memory hog. The mechanisms and observed signals are different.
Signals to watch
A useful experiment captures signals from three layers. Attach resilience probes to assert each layer automatically:
- Cluster state and eviction: Run kubectl top node <name> and kubectl get events --field-selector reason=Evicted -n <namespace> -w to see eviction in real time. Use a Kubernetes probe to validate that critical pods stay scheduled and Running.
- Application service-level indicators: Watch error rate and request availability for the affected workloads. The signal that matters is whether QoS protected the right pods. Use an HTTP probe for direct endpoint health.
- Eviction and OOM metrics: Track kube_pod_status_reason{reason="Evicted"}, node_memory_MemAvailable_bytes, and kube_pod_container_status_restarts_total for OOM-driven restarts. Use a Prometheus probe or an APM probe to fail the experiment when an unexpected pod is evicted or when restart counts spike.
Verify the fault execution effect
While the experiment is running, confirm that memory pressure is reaching the node:
- Check memory usage on the node.
  kubectl top node <target-node>
  Memory usage should rise toward the percentage you configured. If it stays flat, the fault is not driving memory pressure on the expected node.
- Watch for eviction events.
  kubectl get events --field-selector reason=Evicted --all-namespaces -w
  At higher consumption levels you should see Evicted events listing MemoryPressure as the reason. If no evictions occur, either the node had plenty of free memory or MEMORY_CONSUMPTION_PERCENTAGE was set too low to cross the kubelet's eviction threshold.
- Look for OOM kills in pods that breached their own limit.
  kubectl get pods --field-selector spec.nodeName=<target-node> -o wide
  kubectl describe pod <restarted-pod> | grep -A3 'Last State'
  Reason: OOMKilled and Exit Code: 137 indicate a container-level OOM, separate from kubelet eviction.
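Checking pods one at a time gets tedious on a busy node. As a rough sketch, the same check can be scripted against the JSON form of the pod list (the output of kubectl get pods with -o json added). The function name and the sample payload below are illustrative; only the status.containerStatuses[].lastState.terminated fields are taken from the Kubernetes pod schema.

```python
import json


def oom_killed_containers(pod_list_json: str) -> list:
    """Return (namespace, pod, container) for containers whose last
    termination was an OOM kill (reason OOMKilled, exit code 137)."""
    found = []
    for pod in json.loads(pod_list_json).get("items", []):
        meta = pod.get("metadata", {})
        for status in pod.get("status", {}).get("containerStatuses", []):
            terminated = status.get("lastState", {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled" and terminated.get("exitCode") == 137:
                found.append((meta.get("namespace"), meta.get("name"), status.get("name")))
    return found


# Trimmed, made-up sample shaped like a pod list in JSON form.
SAMPLE = json.dumps({"items": [
    {"metadata": {"namespace": "shop", "name": "cart-7d9f"},
     "status": {"containerStatuses": [
         {"name": "cart", "restartCount": 2,
          "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}}]}},
    {"metadata": {"namespace": "shop", "name": "web-5c4b"},
     "status": {"containerStatuses": [
         {"name": "web", "restartCount": 0, "lastState": {}}]}},
]})

print(oom_killed_containers(SAMPLE))  # [('shop', 'cart-7d9f', 'cart')]
```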
Recovery and cleanup
- End of duration: When TOTAL_CHAOS_DURATION elapses, the allocation is freed and node memory returns to baseline within a few seconds.
- Evicted pods reschedule: Pods that were evicted during the fault are scheduled on other Ready nodes by the scheduler. They are not placed back on the recovered node automatically.
- Pods stuck Pending: If your cluster lacks capacity on other nodes, evicted pods may sit in Pending. The cluster autoscaler should add capacity if configured. Otherwise, the pods land back on the recovered node only after another scheduling cycle.
- Container OOM restarts: Containers that were OOM-killed during the fault are restarted by the kubelet. If a container hits its limit again immediately on restart, it can enter CrashLoopBackOff. Investigate and raise the per-container memory limit before re-running.
- If automated cleanup did not complete: Memory is reclaimed as soon as the chaos pod exits. No node-level cleanup is required.
- Abort the experiment early: Stop the experiment from Harness Chaos Studio. Memory is reclaimed once the chaos pod exits.
Limitations
This fault is not appropriate in the following scenarios:
- Serverless Kubernetes (EKS Fargate, ACI virtual nodes, GKE Autopilot): These platforms do not expose real nodes or allow the privileged access this fault needs.
- Windows nodes: This fault is supported on Linux nodes only.
- Single-node clusters or co-located chaos infrastructure: If the chaos infrastructure pods live on the node you are about to saturate, the kubelet may evict them along with everything else, and the experiment loses observability. Schedule chaos infrastructure on a node outside the blast radius.
- Workloads without memory requests: Without requests, every pod is BestEffort and the experiment observes generic eviction rather than meaningful QoS prioritization.
- Very large consumption values on small nodes: Setting MEMORY_CONSUMPTION_PERCENTAGE close to 100% on a node with less headroom than the chaos pod needs can OOM the chaos pod itself before it produces useful signal. Start at 30% to 50% and tune from there.
Troubleshooting
Node memory hog experiment stays Pending or never starts in Harness Chaos Engineering
Inspect the chaos pods in the experiment namespace with kubectl describe pod -n <chaos-namespace>. The most common causes are taints on the target node, insufficient memory available to schedule the chaos pod, or a PodSecurity admission policy blocking privileged pods. Add the required tolerations, free resources on the node, or run the experiment in a namespace with privileged Pod Security level.
Node memory hog runs but kubectl top shows memory usage unchanged on the target node
The chaos pod may be constrained by its own memory limit, or it may be scheduled on a different node than expected. Verify with kubectl get pods -n <chaos-namespace> -o wide that the chaos pod is on the intended target node, and remove or raise its memory limit if it is constrained.
No pods are evicted during node-memory-hog even at high MEMORY_CONSUMPTION_PERCENTAGE
The kubelet's memory eviction threshold (memory.available) may be lower than the memory still free after the hog, so the node never crosses it. Check the kubelet configuration on the node with kubectl get --raw /api/v1/nodes/<node>/proxy/configz | jq .kubeletconfig.evictionHard (the configz response nests the settings under kubeletconfig). Raise the consumption value or raise the memory.available eviction threshold (in a non-production cluster) to reproduce eviction reliably.
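Once you know the threshold, you can estimate how much the fault has to consume before eviction starts. The sketch below assumes the threshold is a string such as 100Mi, 1Gi, or 5%, the forms the kubelet accepts for memory.available. The function names are hypothetical, and the estimate is approximate because the kubelet derives memory.available from cgroup statistics rather than from kubectl top output.

```python
# Conversion factors from quantity suffixes to MiB.
_UNITS_TO_MIB = {"Ki": 1 / 1024, "Mi": 1, "Gi": 1024}


def threshold_mib(value: str, node_capacity_mib: int) -> float:
    """Convert an eviction threshold such as '100Mi' or '5%' to MiB."""
    value = value.strip()
    if value.endswith("%"):
        # Percentages are relative to the node's memory capacity.
        return node_capacity_mib * float(value[:-1]) / 100
    for suffix, factor in _UNITS_TO_MIB.items():
        if value.endswith(suffix):
            return float(value[: -len(suffix)]) * factor
    raise ValueError(f"unsupported threshold format: {value!r}")


def hog_needed_mib(available_now_mib: float, threshold: str,
                   node_capacity_mib: int) -> float:
    """Memory the fault must consume for available memory to drop
    to the threshold. Zero means the node is already at or below it."""
    return max(available_now_mib - threshold_mib(threshold, node_capacity_mib), 0.0)


# A 16 GiB node with 6000 MiB free and a 100Mi threshold.
print(hog_needed_mib(6000, "100Mi", 16384))  # 5900.0
```

If the result is larger than what your MEMORY_CONSUMPTION_PERCENTAGE or MEMORY_CONSUMPTION_MEBIBYTES setting works out to, the run is unlikely to produce evictions.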
Helper pod is killed with OOMKilled during node-memory-hog instead of evicting other pods
The chaos (helper) pod hit its own container memory limit before the kubelet reached its node-level eviction threshold. Lower MEMORY_CONSUMPTION_MEBIBYTES or MEMORY_CONSUMPTION_PERCENTAGE, whichever is in effect, so the chaos pod stays within bounds, or raise the chaos pod's memory limit. Inspect the chaos pod in the experiment namespace with kubectl describe pod -n <chaos-namespace> to confirm the last state shows OOMKilled and Exit Code 137.
Critical Guaranteed pods are evicted during node-memory-hog
Guaranteed pods should be the last to be evicted. If they are reclaimed first, verify their QoS class with kubectl describe pod <name> | grep QoS. The most common cause is that their memory request and limit are not exactly equal, downgrading them to Burstable. Set request == limit for both memory and CPU on critical pods.
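The QoS rules behind this check can be sketched in a few lines, which is handy for auditing manifests before a run. This is a simplified model of how Kubernetes assigns QoS classes, not the kubelet's code: it looks only at CPU and memory, and it compares quantities as plain strings, so 1Gi and 1024Mi would be treated as different values.

```python
def qos_class(containers: list) -> str:
    """Classify a pod from its containers' resources.

    Each container is a dict with optional 'requests' and 'limits'
    maps, for example {"requests": {"memory": "1Gi"}, "limits": {...}}.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # Kubernetes defaults a missing request to the limit.
            request = requests.get(resource, limit)
            if limit is not None or request is not None:
                any_set = True
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# Memory request below the limit and no CPU settings: not Guaranteed.
print(qos_class([{"requests": {"memory": "512Mi"},
                  "limits": {"memory": "1Gi"}}]))  # Burstable
```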
Related faults
- Node CPU hog: Same blast-radius shape but applies CPU pressure instead of memory. Use it to test throttling rather than eviction.
- Node I/O stress: Stresses disk I/O on the node. Use it to test disk-bound workloads.
- Pod memory hog: Scope memory pressure to a single pod rather than the whole node. Use it to test per-container OOM behavior.
- Common node fault tunables: Shared environment variables for selecting target nodes across node faults.