Windows EC2 CPU hog
EC2 windows CPU hog induces CPU stress on the AWS Windows EC2 instances using Amazon SSM Run command. The SSM Run command is executed using SSM documentation that is built into the fault. This fault:
- Causes CPU chaos on the target AWS Windows EC2 instances using the given
EC2_INSTANCE_ID
environment variable for a specific duration.
Use cases
EC2 windows CPU hog:
- Simulates the situation of a lack of CPU for processes running on the instance, which degrades their performance.
- Simulates slow application traffic or exhaustion of the resources, leading to degradation in the performance of processes on the instance.
note
- Kubernetes >= 1.17 is required to execute this fault.
- The EC2 instance should be in a healthy state.
- SSM agent should be installed and running on the target EC2 instance.
- SSM IAM role should be attached to the target EC2 instance(s).
- Kubernetes secret should have the AWS Access Key ID and Secret Access Key credentials in the
CHAOS_NAMESPACE
. Below is a sample secret file:
apiVersion: v1
kind: Secret
metadata:
name: cloud-secret
type: Opaque
stringData:
cloud_config.yml: |-
# Add the cloud AWS credentials respectively
[default]
aws_access_key_id = XXXXXXXXXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- If you change the secret key name (from
experiment.yml
), ensure that you update theAWS_SHARED_CREDENTIALS_FILE
environment variable in the chaos experiment with the new name.
Here is an example AWS policy to execute the fault.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ssm:GetDocument",
"ssm:DescribeDocument",
"ssm:GetParameter",
"ssm:GetParameters",
"ssm:SendCommand",
"ssm:CancelCommand",
"ssm:CreateDocument",
"ssm:DeleteDocument",
"ssm:GetCommandInvocation",
"ssm:UpdateInstanceInformation",
"ssm:DescribeInstanceInformation"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ec2messages:AcknowledgeMessage",
"ec2messages:DeleteMessage",
"ec2messages:FailMessage",
"ec2messages:GetEndpoint",
"ec2messages:GetMessages",
"ec2messages:SendReply"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ec2:DescribeInstanceStatus",
"ec2:DescribeInstances"
],
"Resource": [
"*"
]
}
]
}
- Refer to AWS Named Profile for chaos to use a different profile for AWS faults, and the superset permission/policy to execute all AWS faults.
Fault tunables
Mandatory fields
Variables | Description | Notes |
---|---|---|
EC2_INSTANCE_ID | ID(s) of the target EC2 instances. | For example: i-044d3cb4b03b8af1f . |
REGION | AWS region ID where the EC2 instance has been created. | For example: us-east-1 . |
Optional fields
Variables | Description | Notes |
---|---|---|
TOTAL_CHAOS_DURATION | Duration to insert chaos (in seconds). | Defaults to 30s. |
AWS_SHARED_CREDENTIALS_FILE | Path to the AWS secret credentials. | Defaults to /tmp/cloud_config.yml . |
CPU_CORE | Number of CPU cores to consume. | Defaults to 0 which means all available CPU cores would be consumed. |
SEQUENCE | Sequence of chaos execution for multiple instances. | Default value is parallel. Supports serial and parallel. |
RAMP_TIME | Period to wait before and after injecting chaos (in seconds). | For example, 30s. |
CPU core
It specifies the number of CPU cores to be utilized on the EC2 instance. Tune it by using the CPU_CORE
environment variable. All available CPU cores can be consumed by setting this variable to 0
.
Use the following example to tune the CPU core:
# CPU cores to utilize
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: engine-nginx
spec:
engineState: "active"
chaosServiceAccount: litmus-admin
experiments:
- name: windows-ec2-cpu-hog
spec:
components:
env:
- name: CPU_CORE
VALUE: '2'
# ID of the EC2 instance
- name: EC2_INSTANCE_ID
value: 'instance-1'
# region for the EC2 instance
- name: REGION
value: 'us-east-1'
Multiple EC2 instances
It specifies multiple EC2 instances as comma-separated IDs that are targeted in one chaos run. Tune it by using the EC2_INSTANCE_ID
environment variable.
Use the following example to tune multiple EC2 instances:
# mutilple instance targets
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: engine-nginx
spec:
engineState: "active"
chaosServiceAccount: litmus-admin
experiments:
- name: windows-ec2-cpu-hog
spec:
components:
env:
# ids of the EC2 instances
- name: EC2_INSTANCE_ID
value: 'instance-1,instance-2,instance-3'
# region for the EC2 instance
- name: REGION
value: 'us-east-1'