Skip to main content

Pod Status Check

Pod status check validates the current state of Kubernetes pods.

Infrastructure type

  • Kubernetes

Use cases

Pod Status Check probe helps you:

  • Verify pods remain in Running state during chaos experiments
  • Validate pod health after failures and restarts
  • Monitor application availability continuously
  • Ensure pods recover to healthy state after disruptions

Overview

This probe validates that pods are in the expected state (typically Running) during chaos experiments. It's one of the most fundamental probes for ensuring application availability.

Probe type

Command Probe

Prerequisites

  • Kubernetes cluster with chaos infrastructure installed
  • Access to target namespace and pods
  • Sufficient RBAC permissions to query pod status

Probe properties

Command

healthchecks -name pod-level

Comparator

TypeCriteriaValue
stringcontains[Pass]

The probe passes when the command output contains [Pass], indicating all specified pods are in a healthy Running state.

Environment variables

VariableDescriptionRequiredDefault
TARGET_LABELSComma-separated list of target labels to filter pods.No-
TARGET_NAMESComma-separated list of target pod names.No-
TARGET_NAMESPACENamespace of the target pods.No-
TARGET_KINDKind of the target resource (e.g., deployment, statefulset, daemonset).Nodeployment
STATUS_CHECK_TIMEOUTMaximum time in seconds to wait for status check.No180
STATUS_CHECK_DELAYDelay in seconds between status checks.No2

Run properties

PropertyDescriptionTypeDefault
timeoutMaximum time to wait for the probe to complete (e.g., 30s, 1m, 5m)String180s
intervalTime between probe executions (e.g., 1s, 5s, 10s)String1s
attemptNumber of retry attempts before marking the probe as failedInteger1
pollingIntervalTime between retry attempts (e.g., 1s, 5s, 10s)String-
initialDelayInitial delay before starting the probe (e.g., 0s, 10s, 30s)String-
stopOnFailureStop the experiment if the probe failsBooleanfalse
verbosityLog verbosity level (info, debug, trace)String-

Probe definition

You can define this probe in your chaos experiment as follows:

Using pod labels

probe:
- name: "pod-health-check"
type: "cmdProbe"
mode: "Continuous"
cmdProbe/inputs:
command: "healthchecks -name pod-level"
comparator:
type: "string"
criteria: "contains"
value: "[Pass]"
env:
- name: TARGET_LABELS
value: "app=nginx,tier=frontend"
- name: TARGET_NAMESPACE
value: "production"
- name: STATUS_CHECK_TIMEOUT
value: "180"
- name: STATUS_CHECK_DELAY
value: "2"
runProperties:
timeout: 180s
interval: 1s
attempt: 1
stopOnFailure: false

Using pod names

probe:
- name: "specific-pod-status-check"
type: "cmdProbe"
mode: "Edge"
cmdProbe/inputs:
command: "healthchecks -name pod-level"
comparator:
type: "string"
criteria: "contains"
value: "[Pass]"
env:
- name: TARGET_NAMES
value: "my-app-pod-1,my-app-pod-2"
- name: TARGET_NAMESPACE
value: "default"
- name: STATUS_CHECK_TIMEOUT
value: "120"
runProperties:
timeout: 150s
interval: 2s
attempt: 3