Node faults | Harness Developer Hub

Best Practices for Probe Validation - Node Level Faults

This topic describes the best practices to use with resilience probes in Kubernetes node-level chaos faults.

Common node fault tunables

Fault tunables that are common to all node faults are described here. You can provide these tunables at .spec.experiments[*].spec.components.env in the ChaosEngine.
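To make the location of these tunables concrete, here is a minimal sketch that assembles a ChaosEngine-shaped manifest with tunables placed under the components.env list. The helper name build_chaos_engine, the engine name, the apiVersion value, and the tunable names TARGET_NODE and TOTAL_CHAOS_DURATION are assumptions for illustration; check the individual fault pages for the tunables each fault accepts.

```python
import json


def build_chaos_engine(fault_name, tunables):
    """Return a ChaosEngine-shaped dict with tunables under components.env.

    Hypothetical helper: only the nesting of the env list mirrors the
    path described above; every other field is an assumption.
    """
    env = [{"name": name, "value": str(value)} for name, value in tunables.items()]
    return {
        "apiVersion": "litmuschaos.io/v1alpha1",  # assumed API group/version
        "kind": "ChaosEngine",
        "metadata": {"name": f"{fault_name}-engine"},  # hypothetical name
        "spec": {
            "experiments": [
                {
                    "name": fault_name,
                    "spec": {"components": {"env": env}},
                }
            ]
        },
    }


# Assumed tunable names; env values are always serialized as strings.
engine = build_chaos_engine(
    "node-drain",
    {"TARGET_NODE": "node01", "TOTAL_CHAOS_DURATION": 60},
)
print(json.dumps(engine, indent=2))
```

The manifest is printed as JSON, which Kubernetes tooling generally accepts in place of YAML.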

Kubelet service kill

Kubelet service kill makes the application unreachable because the node becomes unschedulable (it enters the NotReady state).
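A probe for this fault typically needs to tell whether a node is Ready. The sketch below assumes the node object has the shape returned by kubectl get node -o json, where status.conditions holds an entry of type Ready; the function name node_is_ready is hypothetical.

```python
def node_is_ready(node):
    """Return True only if the node reports a Ready condition with status "True".

    A status of "False" or "Unknown", or a missing Ready condition,
    is treated as NotReady.
    """
    for condition in node.get("status", {}).get("conditions", []):
        if condition.get("type") == "Ready":
            return condition.get("status") == "True"
    return False


# Example node objects (trimmed to the fields the check reads).
healthy = {"status": {"conditions": [{"type": "Ready", "status": "True"}]}}
kubelet_down = {"status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}}

print(node_is_ready(healthy), node_is_ready(kubelet_down))
```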

Node CPU hog

Node CPU hog exhausts the CPU resources on a Kubernetes node.

Node drain

Node drain evicts all the resources running on the target node. As a result, services running on that node should be rescheduled onto other nodes.
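One way to validate the rescheduling expectation is to check where the application's pods are running after the drain. This is a rough sketch, assuming pod objects shaped like the output of kubectl get pods -o json (spec.nodeName and status.phase); the function name pods_rescheduled is hypothetical, and a real probe would also need to allow time for rescheduling.

```python
def pods_rescheduled(pods, drained_node):
    """Return True if there is at least one pod and every pod is Running
    on a node other than the drained one."""
    if not pods:
        return False  # no pods found: the service is not running anywhere
    for pod in pods:
        node_name = pod.get("spec", {}).get("nodeName")
        phase = pod.get("status", {}).get("phase")
        if phase != "Running" or node_name is None or node_name == drained_node:
            return False
    return True


pods = [
    {"spec": {"nodeName": "node02"}, "status": {"phase": "Running"}},
    {"spec": {"nodeName": "node03"}, "status": {"phase": "Running"}},
]
print(pods_rescheduled(pods, "node01"))
```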

Node IO stress

Node IO stress causes I/O stress on the Kubernetes node.

Node memory hog

Node memory hog causes memory resource exhaustion on the Kubernetes node.

Node network latency

Node network latency is a Kubernetes node-level chaos fault that induces packet latency across the entire node. Similar to pod network latency, this fault uses traffic control (tc) along with netem rules to inject network latency.

Node network loss

Node network loss is a Kubernetes node-level chaos fault that induces packet loss across the entire node. Similar to pod network loss, this fault uses traffic control (tc) along with netem rules to inject network loss.
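For reference, the kind of tc netem rule that network latency and network loss faults rely on can be sketched as follows. This only builds the command lines and does not run them; the helper names, the eth0 interface, and the sample values are assumptions, and the exact commands the faults issue may differ.

```python
import shlex


def netem_add(interface, *, delay_ms=None, loss_percent=None):
    """Build a `tc qdisc add ... netem` command as an argument list."""
    if delay_ms is None and loss_percent is None:
        raise ValueError("specify delay_ms, loss_percent, or both")
    cmd = ["tc", "qdisc", "add", "dev", interface, "root", "netem"]
    if delay_ms is not None:
        cmd += ["delay", f"{delay_ms}ms"]  # added latency per packet
    if loss_percent is not None:
        cmd += ["loss", f"{loss_percent}%"]  # share of packets dropped
    return cmd


def netem_clear(interface):
    """Build the command that removes the root qdisc, reverting the rule."""
    return ["tc", "qdisc", "del", "dev", interface, "root"]


# Hypothetical interface name and values.
print(shlex.join(netem_add("eth0", delay_ms=2000)))
print(shlex.join(netem_add("eth0", loss_percent=100)))
print(shlex.join(netem_clear("eth0")))
```

Running such commands requires root privileges on the node, which is why the sketch stops at constructing them.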

Node restart

Node restart disrupts the state of the node by restarting it.

Node taint

Node taint applies a taint with the desired effect to the node. Only resources that carry the corresponding toleration can bypass the taint.
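The matching between a toleration and a taint can be sketched as below, following the documented Kubernetes rules: the effect must match (or be empty on the toleration), the key must match (or be empty), and the operator is either Exists or Equal with identical values. The function names are hypothetical, and as a simplification the sketch treats every taint as blocking regardless of its effect.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect", ""):
        return False
    key = toleration.get("key", "")
    if key and key != taint.get("key", ""):
        return False
    operator = toleration.get("operator") or "Equal"  # Equal is the default
    if operator == "Exists":
        return True
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    return False


def bypasses_taints(tolerations, taints):
    """Return True if every taint is matched by at least one toleration."""
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in taints)


# Hypothetical taint and tolerations.
taint = {"key": "dedicated", "value": "chaos", "effect": "NoSchedule"}
matching = [{"key": "dedicated", "operator": "Equal", "value": "chaos", "effect": "NoSchedule"}]
wrong_value = [{"key": "dedicated", "operator": "Equal", "value": "batch", "effect": "NoSchedule"}]

print(bypasses_taints(matching, [taint]), bypasses_taints(wrong_value, [taint]))
```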