Kubernetes Command Probe Templates
Pre-built Command Probe templates for validating Kubernetes resource health and status during chaos experiments. These templates help you quickly set up probes to monitor pods, nodes, containers, and other Kubernetes resources.
Here are Kubernetes probe templates that you can use in your chaos experiments.
Container Restart Check
Container restart check validates the restart count of a container.
Required Environment Variables:
- TARGET_LABELS: Comma-separated list of target labels to filter pods
- TARGET_NAMES: Comma-separated list of target pod names
- TARGET_NAMESPACE: Namespace of the target pods
- TARGET_CONTAINER: Name of the container to check restart count
- CONTAINER_RESTART: Maximum allowed restart count
Use cases
- Verify containers don't restart excessively during chaos experiments
- Monitor container stability during resource stress
- Validate application resilience to failures
- Ensure pods maintain healthy restart counts
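The comparison this probe performs can be sketched in a few lines of Python. This is only an illustration, not the template's actual implementation: it assumes the pod data has the shape returned by `kubectl get pods -o json` (where each entry in `status.containerStatuses` carries a `restartCount`), and it assumes a restart count equal to `CONTAINER_RESTART` still passes. The function name, the sample pods, and the fallback defaults are hypothetical.

```python
import os


def restart_violations(pods, container, max_restarts):
    """Return (pod name, restart count) pairs where the target container
    has restarted more than max_restarts times."""
    violations = []
    for pod in pods:
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for status in statuses:
            if status["name"] == container and status["restartCount"] > max_restarts:
                violations.append((pod["metadata"]["name"], status["restartCount"]))
    return violations


# Hypothetical sample data shaped like items from `kubectl get pods -o json`.
sample_pods = [
    {"metadata": {"name": "web-0"},
     "status": {"containerStatuses": [{"name": "app", "restartCount": 1}]}},
    {"metadata": {"name": "web-1"},
     "status": {"containerStatuses": [{"name": "app", "restartCount": 4}]}},
]

# The defaults below are illustrative fallbacks, not values from the template.
container = os.environ.get("TARGET_CONTAINER", "app")
max_restarts = int(os.environ.get("CONTAINER_RESTART", "2"))
print(restart_violations(sample_pods, container, max_restarts))
```

An empty result would correspond to the probe passing; any entry names a pod whose container exceeded the threshold.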
Node Status Check
Node status check validates the current state of Kubernetes nodes.
Required Environment Variables:
- TARGET_NODE: Comma-separated list of nodes to be checked
- TARGET_NODES: Comma-separated list of nodes to be checked
- NODE_LABEL: Node label to filter nodes (e.g., node-role.kubernetes.io/worker=)
Use cases
- Verify nodes remain healthy during chaos experiments
- Validate node recovery after failures
- Monitor cluster health during node-level chaos
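As a rough sketch of what a node health check might look at, the Python below treats a node as healthy when its `Ready` condition has status `"True"`, which is how the Kubernetes API reports node readiness. Whether the template uses exactly this criterion is an assumption, and the helper names and sample nodes are hypothetical.

```python
def split_csv(raw):
    """Split a comma-separated value such as TARGET_NODES into clean names."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def not_ready_nodes(nodes, target_names):
    """Return names of targeted nodes without a Ready condition of status "True"."""
    not_ready = []
    for node in nodes:
        name = node["metadata"]["name"]
        if name not in target_names:
            continue
        conditions = node.get("status", {}).get("conditions", [])
        is_ready = any(
            c["type"] == "Ready" and c["status"] == "True" for c in conditions
        )
        if not is_ready:
            not_ready.append(name)
    return not_ready


# Hypothetical sample data shaped like items from `kubectl get nodes -o json`.
sample_nodes = [
    {"metadata": {"name": "node-a"},
     "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "node-b"},
     "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]}},
    {"metadata": {"name": "node-c"},
     "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
]

targets = split_csv("node-a, node-b")
print(not_ready_nodes(sample_nodes, targets))  # prints ['node-b']
```

Note that a `Ready` status of `"Unknown"` (for example, when the kubelet stops reporting) is counted as unhealthy in this sketch.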
Pod Replica Count Check
Pod replica count check validates the current replica count of Kubernetes pods.
Required Environment Variables:
- TARGET_LABELS: Comma-separated list of target labels to filter resources
- TARGET_NAMES: Comma-separated list of target resource names
- TARGET_NAMESPACE: Namespace of the target resources
- TARGET_KIND: Kind of the target resource (e.g., deployment, statefulset)
- MINIMUM_HEALTHY_REPLICA_COUNT: Minimum healthy replica count for the target
Use cases
- Verify deployments maintain desired replica count
- Validate auto-scaling behavior during load chaos
- Monitor application availability during pod failures
- Ensure high availability during chaos experiments
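One plausible reading of "healthy replica count" is the workload's `status.readyReplicas` field, which both Deployments and StatefulSets expose. The sketch below compares that field against `MINIMUM_HEALTHY_REPLICA_COUNT`; treating "healthy" as "ready" is an assumption, and the function name and sample object are hypothetical.

```python
def has_minimum_replicas(workload, minimum):
    """Return True when the workload reports at least `minimum` ready replicas.

    The API omits status.readyReplicas when it is zero, so a missing
    field is read as 0.
    """
    ready = workload.get("status", {}).get("readyReplicas", 0)
    return ready >= minimum


# Hypothetical sample shaped like `kubectl get deployment web -o json`.
sample_deployment = {
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "status": {"replicas": 3, "readyReplicas": 2},
}

print(has_minimum_replicas(sample_deployment, 2))  # prints True
print(has_minimum_replicas(sample_deployment, 3))  # prints False
```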
Pod Resource Utilisation Check
Pod resource utilisation check validates the current resource utilisation metrics of Kubernetes pods.
Required Environment Variables:
- TARGET_LABELS: Comma-separated list of target labels to filter pods
- TARGET_NAMES: Comma-separated list of target pod names
- TARGET_NAMESPACE: Namespace of the target pods
- METRIC_TYPE: Metric type to check (cpu or memory)
- CPU_LIMIT: CPU usage limit in millicores
- MEMORY_LIMIT: Memory usage limit in MB
Use cases
- Monitor resource usage during stress chaos experiments
- Verify resource limits are respected
- Validate application performance under load
- Ensure pods don't exceed resource thresholds
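Because the limits are expressed in millicores and MB, any comparison first has to normalise the Kubernetes quantity strings that the metrics API reports (for example `250m` or `128Mi`). The following is a minimal sketch of that conversion and comparison, not the template's own logic. It handles only a subset of quantity suffixes, and it assumes "MB" means mebibytes (1024 × 1024 bytes); all function names are hypothetical.

```python
def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as '250m', '2' or '1500000n' to millicores."""
    divisors = {"n": 1_000_000, "u": 1_000, "m": 1}
    suffix = quantity[-1]
    if suffix in divisors:
        return float(quantity[:-1]) / divisors[suffix]
    return float(quantity) * 1000  # plain number means whole cores


def memory_to_mib(quantity):
    """Convert a memory quantity such as '128Mi', '1Gi' or plain bytes to MiB."""
    factors = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}
    for suffix, factor in factors.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor / (1024 ** 2)
    return float(quantity) / (1024 ** 2)  # no suffix means bytes


def within_limit(usage, metric_type, cpu_limit=None, memory_limit=None):
    """Check one container's usage against the limit selected by metric_type."""
    if metric_type == "cpu":
        return cpu_to_millicores(usage["cpu"]) <= cpu_limit
    if metric_type == "memory":
        return memory_to_mib(usage["memory"]) <= memory_limit
    raise ValueError(f"unsupported METRIC_TYPE: {metric_type}")


# Hypothetical usage entry shaped like a container in a PodMetrics object.
sample_usage = {"cpu": "250m", "memory": "128Mi"}

print(within_limit(sample_usage, "cpu", cpu_limit=500))        # prints True
print(within_limit(sample_usage, "memory", memory_limit=100))  # prints False
```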
Pod Startup Time Check
Pod startup time check validates the startup time of Kubernetes pods.
Required Environment Variables:
- TARGET_LABELS: Comma-separated list of target labels to filter pods
- TARGET_NAMES: Comma-separated list of target pod names
- TARGET_NAMESPACE: Namespace of the target pods
- STARTUP_DURATION_CUTOFF: All pods should start within this duration (in seconds)
Use cases
- Validate pods start within acceptable timeframes
- Monitor deployment performance during rollouts
- Detect slow startup issues during chaos experiments
- Ensure application readiness times are optimal
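The source does not say how startup time is measured. One reasonable interpretation, used in the sketch below purely as an assumption, is the gap between the pod's `metadata.creationTimestamp` and the `lastTransitionTime` of its `Ready` condition. Pods that are not yet Ready are treated as exceeding the cutoff. The function names and sample pods are hypothetical.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a Kubernetes timestamp such as '2024-05-01T10:00:00Z' as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def startup_seconds(pod):
    """Seconds from pod creation to its Ready transition, or None if not Ready."""
    created = parse_timestamp(pod["metadata"]["creationTimestamp"])
    for cond in pod.get("status", {}).get("conditions", []):
        if cond["type"] == "Ready" and cond["status"] == "True":
            ready_at = parse_timestamp(cond["lastTransitionTime"])
            return (ready_at - created).total_seconds()
    return None


def slow_pods(pods, cutoff_seconds):
    """Return names of pods that are not Ready or took longer than the cutoff."""
    slow = []
    for pod in pods:
        elapsed = startup_seconds(pod)
        if elapsed is None or elapsed > cutoff_seconds:
            slow.append(pod["metadata"]["name"])
    return slow


# Hypothetical sample data shaped like items from `kubectl get pods -o json`.
sample_pods = [
    {"metadata": {"name": "api-0", "creationTimestamp": "2024-05-01T10:00:00Z"},
     "status": {"conditions": [{"type": "Ready", "status": "True",
                                "lastTransitionTime": "2024-05-01T10:00:12Z"}]}},
    {"metadata": {"name": "api-1", "creationTimestamp": "2024-05-01T10:00:00Z"},
     "status": {"conditions": [{"type": "Ready", "status": "True",
                                "lastTransitionTime": "2024-05-01T10:01:30Z"}]}},
]

print(slow_pods(sample_pods, cutoff_seconds=30))  # prints ['api-1']
```

A caveat of this approach is that `lastTransitionTime` reflects the most recent transition to Ready, so a pod that lost and regained readiness would appear slower than its first startup really was.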
Pod Status Check
Pod status check validates the current state of Kubernetes pods.
Required Environment Variables:
- TARGET_LABELS: Comma-separated list of target labels to filter pods
- TARGET_NAMES: Comma-separated list of target pod names
- TARGET_NAMESPACE: Namespace of the target pods
Use cases
- Verify pods remain in Running state during chaos experiments
- Validate pod health after failures and restarts
- Monitor application availability continuously
- Ensure pods recover to healthy state after disruptions
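To illustrate how label filtering and a status check could fit together, the sketch below parses a `TARGET_LABELS`-style string, selects the matching pods, and reports any whose `status.phase` is not `Running`. It assumes that multiple labels must all match (the same AND semantics as a `kubectl` label selector) and that `Running` is the only passing phase; neither is confirmed by the source, and the helper names are hypothetical.

```python
def parse_labels(raw):
    """Turn 'app=web,tier=frontend' into {'app': 'web', 'tier': 'frontend'}."""
    labels = {}
    for item in raw.split(","):
        item = item.strip()
        if item:
            key, _, value = item.partition("=")
            labels[key] = value
    return labels


def matches_labels(pod, labels):
    """Return True when the pod carries every requested label with the same value."""
    pod_labels = pod["metadata"].get("labels", {})
    return all(pod_labels.get(key) == value for key, value in labels.items())


def pods_not_running(pods, labels):
    """Return names of selected pods whose phase is anything other than Running."""
    return [
        pod["metadata"]["name"]
        for pod in pods
        if matches_labels(pod, labels)
        and pod.get("status", {}).get("phase") != "Running"
    ]


# Hypothetical sample data shaped like items from `kubectl get pods -o json`.
sample_pods = [
    {"metadata": {"name": "web-0", "labels": {"app": "web"}},
     "status": {"phase": "Running"}},
    {"metadata": {"name": "web-1", "labels": {"app": "web"}},
     "status": {"phase": "Pending"}},
    {"metadata": {"name": "db-0", "labels": {"app": "db"}},
     "status": {"phase": "Failed"}},
]

print(pods_not_running(sample_pods, parse_labels("app=web")))  # prints ['web-1']
```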
Pod Warnings Check
Pod warnings check scans the events of Kubernetes pods for warnings.
Required Environment Variables:
- TARGET_LABELS: Comma-separated list of target labels to filter pods
- TARGET_NAMES: Comma-separated list of target pod names
- TARGET_NAMESPACE: Namespace of the target pods
Use cases
- Monitor pod health indicators during chaos experiments
- Detect configuration issues during experiments
- Validate application behavior under stress
- Identify potential problems before they become critical
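Kubernetes events carry a `type` field that is either `Normal` or `Warning`, along with an `involvedObject` reference to the resource they describe. A warnings check could therefore be approximated as below. This is a sketch under the assumption that any `Warning` event on a targeted pod counts as a finding; the function name and sample events are hypothetical.

```python
def warning_events(events, pod_names):
    """Return (pod, reason, message) for Warning events on the given pods."""
    found = []
    for event in events:
        involved = event.get("involvedObject", {})
        if (
            event.get("type") == "Warning"
            and involved.get("kind") == "Pod"
            and involved.get("name") in pod_names
        ):
            found.append(
                (involved["name"], event.get("reason", ""), event.get("message", ""))
            )
    return found


# Hypothetical sample data shaped like items from `kubectl get events -o json`.
sample_events = [
    {"type": "Normal", "reason": "Scheduled", "message": "Assigned to node-a",
     "involvedObject": {"kind": "Pod", "name": "web-0"}},
    {"type": "Warning", "reason": "BackOff", "message": "Back-off restarting container",
     "involvedObject": {"kind": "Pod", "name": "web-0"}},
    {"type": "Warning", "reason": "NodeNotReady", "message": "Node is not ready",
     "involvedObject": {"kind": "Node", "name": "node-a"}},
]

for pod, reason, message in warning_events(sample_events, ["web-0"]):
    print(f"{pod}: {reason}: {message}")
```

With the sample data above, only the `BackOff` event is reported; the node-level warning is skipped because it does not refer to a pod.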