EBS loss by tag
EBS loss by tag disrupts the state of EBS volumes by detaching them from the node (EC2 instance) for a certain duration. The target volumes are selected by tag.
- For EBS persistent volumes, the volumes can self-attach, so the re-attachment step can be skipped.
- It tests the deployment sanity (replica availability and uninterrupted service) and recovery workflows of the application pod.
Prerequisites
- Kubernetes > 1.16.
- Adequate AWS access to attach or detach an EBS volume for the instance.
- Create a Kubernetes secret that has the AWS access configuration (key) in the CHAOS_NAMESPACE. A sample secret file looks like:
apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  cloud_config.yml: |-
    # Add the cloud AWS credentials respectively
    [default]
    aws_access_key_id = XXXXXXXXXXXXXXXXXXX
    aws_secret_access_key = XXXXXXXXXXXXXXX
It is recommended to use the same secret name, that is, cloud-secret. Otherwise, you will need to update the AWS_SHARED_CREDENTIALS_FILE environment variable in the fault template, and you may be unable to use the default health check probes. Refer to AWS Named Profile For Chaos to learn how to use a different profile for AWS faults.
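As a rough sketch of the override mentioned above, the environment variable could be set in the fault's env list as shown here. The file path is an assumption: it presumes the secret is mounted under /tmp/ and keeps the cloud_config.yml key from the sample secret. Adjust it to wherever your secret is actually mounted.

```yaml
# Sketch only: the path below is an assumption, not a documented default.
env:
# location of the AWS credentials file inside the fault pod
- name: AWS_SHARED_CREDENTIALS_FILE
  value: '/tmp/cloud_config.yml'
```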
Permissions required
Here is an example AWS policy to execute the fault.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:AttachVolume",
        "ec2:DetachVolume"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:DescribeVolumes",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstanceStatus",
        "ec2:DescribeInstances"
      ],
      "Resource": "*"
    }
  ]
}
Refer to the superset permission/policy to execute all AWS faults.
Default validations
EBS volume is attached to the instance.
Fault tunables
Mandatory fields
Variables | Description | Notes |
---|---|---|
EBS_VOLUME_TAG | Common tag of the target volumes, in the form key:value | For example, team:devops. |
REGION | The region name for the target volumes | For example, us-east-1. |
Optional fields
Variables | Description | Notes |
---|---|---|
VOLUME_AFFECTED_PERC | The percentage of total EBS volumes to target | Defaults to 0 (corresponds to 1 volume). Provide a numeric value only. |
TOTAL_CHAOS_DURATION | The time duration for chaos insertion (sec) | Defaults to 30s. |
CHAOS_INTERVAL | The time duration between the attachment and detachment of the volumes (sec) | Defaults to 30s. |
SEQUENCE | Defines the sequence of chaos execution for multiple volumes | Defaults to parallel. Supports serial and parallel. |
RAMP_TIME | Period to wait before and after injecting chaos (sec) | For example, 30. |
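To illustrate how the optional fields fit together, here is a sketch of a ChaosEngine that sets the timing tunables alongside the mandatory fields. The tag team:devops and the numeric values are illustrative assumptions, not recommendations.

```yaml
# Sketch: timing tunables for ebs-loss-by-tag (values are illustrative)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ebs-loss-by-tag
    spec:
      components:
        env:
        # mandatory: tag and region of the target volumes
        - name: EBS_VOLUME_TAG
          value: 'team:devops'
        - name: REGION
          value: 'us-east-1'
        # total time for chaos insertion (sec)
        - name: TOTAL_CHAOS_DURATION
          value: '60'
        # time between attachment and detachment of the volumes (sec)
        - name: CHAOS_INTERVAL
          value: '30'
        # wait before and after chaos injection (sec)
        - name: RAMP_TIME
          value: '30'
```

All values are passed as quoted strings, matching the env entries used elsewhere in this document.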
Fault examples
Common and AWS-specific tunables
Refer to the common attributes and AWS-specific tunables for the tunables shared by all faults and those specific to AWS faults.
Target single volume
It detaches a single, randomly chosen EBS volume that matches the given EBS_VOLUME_TAG tag in the REGION region.
Use the following example to tune it:
# contains the tags for the EBS volumes
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ebs-loss-by-tag
    spec:
      components:
        env:
        # tag of the EBS volume
        - name: EBS_VOLUME_TAG
          value: 'key:value'
        # region for the EBS volume
        - name: REGION
          value: 'us-east-1'
        # total duration of chaos (sec)
        - name: TOTAL_CHAOS_DURATION
          value: '60'
Target percent of volumes
It detaches the VOLUME_AFFECTED_PERC percentage of EBS volumes that match the given EBS_VOLUME_TAG tag in the REGION region.
Use the following example to tune it:
# target percentage of the EBS volumes with the provided tag
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ebs-loss-by-tag
    spec:
      components:
        env:
        # percentage of EBS volumes filtered by tag
        - name: VOLUME_AFFECTED_PERC
          value: '100'
        # tag of the EBS volume
        - name: EBS_VOLUME_TAG
          value: 'key:value'
        # region for the EBS volume
        - name: REGION
          value: 'us-east-1'
        # total duration of chaos (sec)
        - name: TOTAL_CHAOS_DURATION
          value: '60'
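When more than one volume is targeted, the SEQUENCE tunable listed under the optional fields decides whether the volumes are detached one after another or all at once. The sketch below combines it with VOLUME_AFFECTED_PERC; the 50 percent value and the tag are illustrative assumptions.

```yaml
# Sketch: detach half of the tagged volumes, one at a time
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ebs-loss-by-tag
    spec:
      components:
        env:
        # percentage of tagged EBS volumes to target
        - name: VOLUME_AFFECTED_PERC
          value: '50'
        # serial or parallel (parallel is the default)
        - name: SEQUENCE
          value: 'serial'
        # tag of the EBS volume
        - name: EBS_VOLUME_TAG
          value: 'key:value'
        # region for the EBS volume
        - name: REGION
          value: 'us-east-1'
```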