Skip to main content

Pod Replica Count Check

Pod replica count check validates the current replica count of Kubernetes pods.

Infrastructure type

  • Kubernetes

Use cases

Pod Replica Count Check probe helps you:

  • Verify deployments maintain desired replica count
  • Validate auto-scaling behavior during load chaos
  • Monitor application availability during pod failures
  • Ensure high availability during chaos experiments

Overview

This probe validates that Kubernetes resources (Deployments, StatefulSets, DaemonSets, etc.) maintain the minimum required number of healthy replicas during chaos experiments.

Probe type

Command Probe

Prerequisites

  • Kubernetes cluster with chaos infrastructure installed
  • Access to target namespace and resources
  • Sufficient RBAC permissions to query resource status

Probe properties

Command

healthchecks -name validate-pod-replica

Comparator

TypeCriteriaValue
stringcontains[Pass]

The probe passes when the command output contains [Pass], indicating the resource has the minimum required healthy replicas.

Environment variables

VariableDescriptionRequiredDefault
TARGET_LABELSComma-separated list of target labels to filter resources.No-
TARGET_NAMESComma-separated list of target resource names.No-
TARGET_NAMESPACENamespace of the target resources.No-
TARGET_KINDKind of the target resource (e.g., deployment, statefulset, daemonset).Nodeployment
MINIMUM_HEALTHY_REPLICA_COUNTMinimum healthy replica count for the target.No1
STATUS_CHECK_TIMEOUTMaximum time in seconds to wait for status check.No180
STATUS_CHECK_DELAYDelay in seconds between status checks.No2

Run properties

PropertyDescriptionTypeDefault
timeoutMaximum time to wait for the probe to complete (e.g., 30s, 1m, 5m)String180s
intervalTime between probe executions (e.g., 1s, 5s, 10s)String1s
attemptNumber of retry attempts before marking the probe as failedInteger1
pollingIntervalTime between retry attempts (e.g., 1s, 5s, 10s)String-
initialDelayInitial delay before starting the probe (e.g., 0s, 10s, 30s)String-
stopOnFailureStop the experiment if the probe failsBooleanfalse
verbosityLog verbosity level (info, debug, trace)String-

Probe definition

You can define this probe in your chaos experiment as follows:

Using resource labels

probe:
- name: "replica-count-validation"
type: "cmdProbe"
mode: "Continuous"
cmdProbe/inputs:
command: "healthchecks -name validate-pod-replica"
comparator:
type: "string"
criteria: "contains"
value: "[Pass]"
env:
- name: TARGET_LABELS
value: "app=nginx,tier=frontend"
- name: TARGET_NAMESPACE
value: "production"
- name: TARGET_KIND
value: "deployment"
- name: MINIMUM_HEALTHY_REPLICA_COUNT
value: "3"
runProperties:
timeout: 180s
interval: 1s
attempt: 1
stopOnFailure: false

Using resource names

probe:
- name: "deployment-replica-check"
type: "cmdProbe"
mode: "Edge"
cmdProbe/inputs:
command: "healthchecks -name validate-pod-replica"
comparator:
type: "string"
criteria: "contains"
value: "[Pass]"
env:
- name: TARGET_NAMES
value: "my-deployment,my-statefulset"
- name: TARGET_NAMESPACE
value: "default"
- name: MINIMUM_HEALTHY_REPLICA_COUNT
value: "2"
- name: STATUS_CHECK_TIMEOUT
value: "120"
runProperties:
timeout: 150s
interval: 2s
attempt: 3