EC2 CPU hog
EC2 CPU hog disrupts the state of infrastructure resources. It induces CPU stress on the AWS EC2 instance using the Amazon SSM Run Command, which is executed through an SSM document that is built into the fault.
- It causes CPU chaos on the EC2 instance specified by the EC2_INSTANCE_ID environment variable for a specific duration.
Prerequisites
- Kubernetes >= 1.17
- SSM agent is installed and running on the target EC2 instance.
- Create a Kubernetes secret in the CHAOS_NAMESPACE that holds the AWS Access Key ID and Secret Access Key credentials. A sample secret file looks like:
apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  cloud_config.yml: |-
    # Add the cloud AWS credentials respectively
    [default]
    aws_access_key_id = XXXXXXXXXXXXXXXXXXX
    aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- If you change the secret key name (from cloud_config.yml), ensure that you update the AWS_SHARED_CREDENTIALS_FILE environment variable in the chaos experiment with the new name.
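As a minimal sketch of the note above, assume the secret key was renamed to the hypothetical aws_credentials.yml and that the secret is still mounted under /tmp (as the default path /tmp/cloud_config.yml suggests). The fault's environment could then point at the new file roughly like this:

```yaml
# Assumption: the secret key was renamed from cloud_config.yml to the
# hypothetical aws_credentials.yml and is still mounted under /tmp.
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-cpu-hog
    spec:
      components:
        env:
        # path of the renamed credentials file (hypothetical name)
        - name: AWS_SHARED_CREDENTIALS_FILE
          value: '/tmp/aws_credentials.yml'
        # ID of the EC2 instance
        - name: EC2_INSTANCE_ID
          value: 'instance-1'
        # region for the EC2 instance
        - name: REGION
          value: 'us-east-1'
```

The file name and mount directory here are illustrative only; use whatever key name and mount path your secret actually has.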
Permissions required
Here is an example AWS policy to execute the fault.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ssm:GetDocument",
                "ssm:DescribeDocument",
                "ssm:GetParameter",
                "ssm:GetParameters",
                "ssm:SendCommand",
                "ssm:CancelCommand",
                "ssm:CreateDocument",
                "ssm:DeleteDocument",
                "ssm:GetCommandInvocation",
                "ssm:UpdateInstanceInformation",
                "ssm:DescribeInstanceInformation"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2messages:AcknowledgeMessage",
                "ec2messages:DeleteMessage",
                "ec2messages:FailMessage",
                "ec2messages:GetEndpoint",
                "ec2messages:GetMessages",
                "ec2messages:SendReply"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstanceStatus",
                "ec2:DescribeInstances"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
Refer to the superset permission/policy to execute all AWS faults.
Default validations
The EC2 instance should be in a healthy state.
Fault tunables
Mandatory fields
Variables | Description | Notes |
---|---|---|
EC2_INSTANCE_ID | ID of the target EC2 instance | For example: i-044d3cb4b03b8af1f |
REGION | The AWS region ID where the EC2 instance has been created | For example: us-east-1 |
Optional fields
Variables | Description | Notes |
---|---|---|
TOTAL_CHAOS_DURATION | Total duration of chaos injection (in seconds) | Defaults to 30s |
CHAOS_INTERVAL | Interval between successive chaos injections (in seconds) | Defaults to 60s |
AWS_SHARED_CREDENTIALS_FILE | Path to the file that contains the AWS secret credentials | Defaults to /tmp/cloud_config.yml |
INSTALL_DEPENDENCIES | Whether to install the dependencies used to run the CPU chaos. It can be either True or False | Defaults to True |
CPU_CORE | Number of CPU cores to consume | Defaults to 0 |
CPU_LOAD | Percentage of a single CPU core to consume | Defaults to 100 |
SEQUENCE | Sequence of chaos execution for multiple instances | Defaults to parallel. Supports serial and parallel |
RAMP_TIME | Period to wait before and after injecting chaos (in seconds) | For example, 30s |
Fault examples
Fault tunables
Refer to the common attributes to tune the common tunables for all the faults.
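As an illustration of the timing fields from the optional fields table, the sketch below sets TOTAL_CHAOS_DURATION and CHAOS_INTERVAL. It assumes the values are given as plain numbers of seconds, in line with the units listed in the table; the chosen numbers are arbitrary.

```yaml
# chaos duration and interval (values assumed to be plain seconds)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-cpu-hog
    spec:
      components:
        env:
        # total time for which chaos is injected
        - name: TOTAL_CHAOS_DURATION
          value: '60'
        # wait time between successive chaos injections
        - name: CHAOS_INTERVAL
          value: '30'
        # ID of the EC2 instance
        - name: EC2_INSTANCE_ID
          value: 'instance-1'
        # region for the EC2 instance
        - name: REGION
          value: 'us-east-1'
```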
CPU core
It defines the number of CPU cores to be utilised on the EC2 instance. You can tune it using the CPU_CORE environment variable.
Use the following example to tune it:
# CPU cores to utilize
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-cpu-hog
    spec:
      components:
        env:
        - name: CPU_CORE
          value: '2'
        # ID of the EC2 instance
        - name: EC2_INSTANCE_ID
          value: 'instance-1'
        # region for the EC2 instance
        - name: REGION
          value: 'us-east-1'
CPU percentage
It defines the percentage of CPU to be utilised on the EC2 instance. You can tune it using the CPU_LOAD environment variable.
Use the following example to tune it:
# CPU percentage to utilize
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-cpu-hog
    spec:
      components:
        env:
        - name: CPU_LOAD
          value: '50'
        # ID of the EC2 instance
        - name: EC2_INSTANCE_ID
          value: 'instance-1'
        # region for the EC2 instance
        - name: REGION
          value: 'us-east-1'
Multiple EC2 instances
Multiple EC2 instances can be targeted in one chaos run. Provide the instance IDs as a comma-separated list in the EC2_INSTANCE_ID environment variable.
Use the following example to tune it:
# multiple instance targets
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-cpu-hog
    spec:
      components:
        env:
        # IDs of the EC2 instances
        - name: EC2_INSTANCE_ID
          value: 'instance-1,instance-2,instance-3'
        # region for the EC2 instances
        - name: REGION
          value: 'us-east-1'
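When several instances are targeted, the SEQUENCE tunable from the optional fields table controls whether they are stressed in parallel (the default) or serially. The following is a sketch of a serial run, reusing the placeholder instance IDs from the example above:

```yaml
# multiple instance targets, stressed serially rather than in parallel
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-cpu-hog
    spec:
      components:
        env:
        # supported values: serial, parallel
        - name: SEQUENCE
          value: 'serial'
        # IDs of the EC2 instances (placeholders)
        - name: EC2_INSTANCE_ID
          value: 'instance-1,instance-2,instance-3'
        # region for the EC2 instances
        - name: REGION
          value: 'us-east-1'
```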
CPU core with percentage consumption
It defines how many CPU cores to utilise and the percentage of utilisation on the EC2 instance. You can tune it using the CPU_CORE and CPU_LOAD environment variables, respectively.
Use the following example to tune it:
# CPU core with percentage to utilize
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-cpu-hog
    spec:
      components:
        env:
        - name: CPU_CORE
          value: '2'
        - name: CPU_LOAD
          value: '50'
        # ID of the EC2 instance
        - name: EC2_INSTANCE_ID
          value: 'instance-1'
        # region for the EC2 instance
        - name: REGION
          value: 'us-east-1'