AWS EC2 Instance Status Check

Last updated on Jul 1, 2026

AWS EC2 Instance Status Check is a built-in Command Probe template that validates the current state of one or more Amazon EC2 instances during a chaos experiment. It confirms that the target instances reach the expected running state, so you can assert that compute capacity stays healthy while a fault is injected. You select instances by instance ID or by tag, which makes the probe flexible across static fleets and dynamically scaled deployments.

The probe runs the healthchecks utility bundled in the chaos probe image, queries the Amazon EC2 API, and prints [Pass] when every targeted instance is in the expected state. The comparator marks the probe as passed when the output contains [Pass].

Built-in probe template

This is a built-in Command Probe template that runs on Kubernetes chaos infrastructure. Add it to an experiment from the probe library and customize its inputs. Go to Built-in probe templates to browse the full library, or go to Command probe to understand how command probes work.

Use cases

Use this probe template to:

Verify that EC2 instances stay in the running state while a fault disrupts the workload.
Validate instance state transitions after fault injection and recovery.
Monitor instance health in multi-AZ deployments.
Confirm that auto-scaling groups maintain a healthy set of instances.

How the probe works

The template configures a Command Probe that runs healthchecks -name aws-ec2. The utility resolves the target instances from EC2_INSTANCE_ID or EC2_INSTANCE_TAG in the supplied REGION, calls the Amazon EC2 DescribeInstances API, and prints [Pass] when every resolved instance is in the expected state. The comparator passes the probe when the output contains [Pass], and fails it otherwise.

Prerequisites

Chaos infrastructure: A Kubernetes chaos infrastructure with network access to the Amazon EC2 API endpoints.
AWS credentials: Cloud credentials available to the chaos infrastructure, with the permissions listed below.
Target instances exist: Every value in EC2_INSTANCE_ID or EC2_INSTANCE_TAG resolves to an instance in REGION.

Permissions required

The credentials used by the probe need the following AWS actions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances"
      ],
      "Resource": "*"
    }
  ]
}

The probe uses the AWS credentials available to your chaos infrastructure. Go to AWS IAM integration to set up access through IAM Roles for Service Accounts (IRSA), or go to common policy for all AWS faults to apply a single superset policy.

Probe properties

Command

healthchecks -name aws-ec2

Comparator

Type	Criteria	Value
string	contains	`[Pass]`

The probe passes when the command output contains [Pass], which indicates that every targeted EC2 instance is in the expected state.

Environment variables

Variable	Description	Required	Default
`EC2_INSTANCE_ID`	Comma-separated list of EC2 instance IDs to check (for example, `i-1234567890abcdef0,i-9876543210fedcba`). Provide this or `EC2_INSTANCE_TAG`.	Conditional	-
`EC2_INSTANCE_TAG`	Comma-separated list of EC2 instance tags to filter instances (for example, `Environment:prod,App:web`). Provide this or `EC2_INSTANCE_ID`.	Conditional	-
`REGION`	AWS region where the EC2 instances are located (for example, `us-east-1`, `eu-west-2`).	Yes	-

Instance selection

Provide at least one of EC2_INSTANCE_ID or EC2_INSTANCE_TAG. If you set both, EC2_INSTANCE_ID takes precedence and EC2_INSTANCE_TAG is ignored.

Run properties

Property	Description	Type	Default
`timeout`	Maximum time to wait for the probe to complete (for example, `30s`, `1m`, `5m`).	String	`300s`
`interval`	Time between probe executions (for example, `5s`, `30s`, `1m`).	String	`10s`
`attempt`	Number of retry attempts before the probe is marked as failed.	Integer	`1`
`pollingInterval`	Time between retry attempts (for example, `1s`, `5s`, `10s`).	String	-
`initialDelay`	Initial delay before the probe starts (for example, `0s`, `10s`, `30s`).	String	-
`stopOnFailure`	Stop the experiment if the probe fails.	Boolean	`false`
`verbosity`	Log verbosity level (`info`, `debug`, `trace`).	String	-

Troubleshooting

AWS EC2 Instance Status Check probe fails with an authorization or UnauthorizedOperation error

The credentials available to the chaos infrastructure do not have the required EC2 permissions in the target region. Confirm that the IAM policy attached to the role or IRSA service account includes ec2:DescribeInstances, and that any policy condition allows the region passed in REGION.

AWS EC2 Instance Status Check probe reports the instance was not found

The instance ID or tag did not resolve in the supplied REGION. Verify that the IDs in EC2_INSTANCE_ID are correct, that REGION exactly matches the instance region (for example us-east-1 vs us-east-2), and that the AWS account used by the credentials owns the instances.

AWS EC2 Instance Status Check probe times out before the instance reaches the expected state

The instance did not return to the expected state within the probe timeout. Increase the run-property timeout, raise the attempt count, or confirm that the fault recovery step actually restores the instance. Inspect the chaos pod logs to see the last observed instance state.

AWS ECS Service Status Check: Validate the desired state of an Amazon ECS service.
AWS Load Balancer AZ Check: Validate availability zones in an ALB or CLB.
Built-in probe templates: Browse the full probe template library.

Use cases​

How the probe works​

Prerequisites​

Permissions required​

Probe properties​

Command​

Comparator​

Environment variables​

Run properties​

Troubleshooting​

Related probe templates​