10 docs tagged with "chaos-engineering"

Common node fault tunables

Environment variables shared by node-level chaos faults for selecting target nodes by name, by label, or by percentage.

Kubelet service kill

Stop the kubelet on a Kubernetes node to simulate node loss without rebooting, and test eviction, rescheduling, and recovery behavior.

Node CPU hog

Exhaust CPU on a Kubernetes node to test scheduler behavior, pod eviction under pressure, HPA reactions, and noisy-neighbor isolation.

Node drain

Cordon and drain a Kubernetes node using the Eviction API to test PodDisruptionBudget enforcement, graceful shutdown, and rescheduling behavior.

Node I/O stress

Stress disk I/O on a Kubernetes node to test ephemeral-storage eviction, etcd write tolerance, log shipper backpressure, and noisy-neighbor isolation.

Node memory hog

Exhaust memory on a Kubernetes node to test kubelet eviction order, QoS-based pod prioritization, OOM behavior, and noisy-neighbor isolation.

Node network latency

Inject configurable network latency on a Kubernetes node's interface to test application timeouts, retry tuning, and tail-latency resilience.

Node network loss

Drop a configurable percentage of packets on a Kubernetes node's network interface to test cluster, application, and control-plane resilience.

Node restart

Reboot a Kubernetes node over SSH to test how the cluster handles sudden node loss, pod rescheduling, and stateful recovery.

Node taint

Apply a temporary taint to a Kubernetes node to test toleration correctness, scheduling policies, and NoExecute eviction behavior.