Skip to main content

AWS EC2 Instance Status Check

Validates the current state of an EC2 instance. This probe checks if the specified EC2 instance(s) are in the expected state. At least one of EC2_INSTANCE_ID or EC2_INSTANCE_TAG must be provided.

Infrastructure type

  • Kubernetes

Use cases

AWS EC2 Instance Status Check probe helps you:

  • Verify EC2 instances remain running during chaos experiments
  • Validate instance state transitions after fault injection
  • Monitor instance health in multi-AZ deployments
  • Ensure auto-scaling groups maintain healthy instances

Overview

This probe uses the AWS CLI to query EC2 instance status and validates that instances are in the expected state. It supports filtering by instance ID or tags, making it flexible for various deployment scenarios.

Probe type

Command Probe

Prerequisites

  • Kubernetes cluster with chaos infrastructure installed
  • AWS credentials configured with appropriate IAM permissions:
    • ec2:DescribeInstances
    • ec2:DescribeInstanceStatus
  • Network connectivity to AWS API endpoints
  • Target EC2 instances should be accessible from the chaos infrastructure

Probe properties

Command

healthchecks -name aws-ec2

Comparator

TypeCriteriaValue
stringcontains[Pass]

The probe passes when the command output contains [Pass], indicating all specified EC2 instances are in the expected state.

Environment variables

VariableDescriptionRequiredDefault
EC2_INSTANCE_IDComma-separated list of EC2 instance IDs to check (e.g., i-1234567890abcdef0,i-9876543210fedcba). Either this or EC2_INSTANCE_TAG must be provided.No*-
EC2_INSTANCE_TAGComma-separated list of EC2 instance tags to filter instances (e.g., Environment:prod,App:web). Either this or EC2_INSTANCE_ID must be provided.No*-
REGIONAWS region where the EC2 instances are located (e.g., us-east-1, eu-west-2).Yes-

Note: At least one of EC2_INSTANCE_ID or EC2_INSTANCE_TAG must be provided.


Run properties

PropertyDescriptionTypeDefault
timeoutMaximum time to wait for the probe to complete (e.g., 30s, 1m, 5m)String300s
intervalTime between probe executions (e.g., 5s, 30s, 1m)String10s
attemptNumber of retry attempts before marking the probe as failedInteger1
pollingIntervalTime between retry attempts (e.g., 1s, 5s, 10s)String-
initialDelayInitial delay before starting the probe (e.g., 0s, 10s, 30s)String-
stopOnFailureStop the experiment if the probe failsBooleanfalse
verbosityLog verbosity level (info, debug, trace)String-

Probe definition

You can define this probe in your chaos experiment as follows:

Using instance IDs

probe:
- name: "ec2-instance-health-check"
type: "cmdProbe"
mode: "Continuous"
cmdProbe/inputs:
command: "healthchecks -name aws-ec2"
comparator:
type: "string"
criteria: "contains"
value: "[Pass]"
env:
- name: EC2_INSTANCE_ID
value: "i-1234567890abcdef0,i-9876543210fedcba"
- name: REGION
value: "us-east-1"
runProperties:
timeout: 300s
interval: 10s
attempt: 1
stopOnFailure: false

Using instance tags

probe:
- name: "ec2-tagged-instances-check"
type: "cmdProbe"
mode: "Edge"
cmdProbe/inputs:
command: "healthchecks -name aws-ec2"
comparator:
type: "string"
criteria: "contains"
value: "[Pass]"
env:
- name: EC2_INSTANCE_TAG
value: "Environment:production,App:backend"
- name: REGION
value: "eu-west-1"
runProperties:
timeout: 60s
interval: 5s
attempt: 3
pollingInterval: 2s