NLB AZ down
The NLB (Network Load Balancer) AZ (Availability Zone) down fault makes one or more AZs unavailable on a target network load balancer, which can disrupt service delivery. The fault restricts access to the specified availability zones by blocking the subnet ACL (Access Control List) for a defined duration. By simulating this scenario, you can assess the resilience and performance of your system when an AZ is inaccessible.
Use cases
- Evaluates the application's behavior and its ability to handle and recover from a scenario where traffic from a particular AZ is blocked.
- Tests the application by deliberately blocking traffic for a specific AZ on the network load balancer, so that incoming and outgoing traffic for the designated AZ cannot reach the application through the load balancer.
Prerequisites
- Kubernetes >= 1.17
- ECS cluster running with the desired tasks and containers and familiarity with ECS service update and deployment concepts.
- Create a Kubernetes secret that has the AWS access configuration (key) in the CHAOS_NAMESPACE. Below is a sample secret file:
apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  cloud_config.yml: |-
    # Add the cloud AWS credentials respectively
    [default]
    aws_access_key_id = XXXXXXXXXXXXXXXXXXX
    aws_secret_access_key = XXXXXXXXXXXXXXX
HCE recommends that you use the same secret name, that is, cloud-secret. Otherwise, you will need to update the AWS_SHARED_CREDENTIALS_FILE environment variable in the fault template with the new secret name and you won't be able to use the default health check probes.
Below is an example AWS policy to execute the fault.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "elasticloadbalancing:DescribeLoadBalancers",
        "ec2:DescribeSubnets",
        "ec2:CreateNetworkAcl",
        "ec2:CreateNetworkAclEntry",
        "ec2:DescribeNetworkAcls",
        "ec2:ReplaceNetworkAclAssociation",
        "ec2:DeleteNetworkAcl"
      ],
      "Resource": "*"
    }
  ]
}
Mandatory tunables
Tunable | Description | Notes |
---|---|---|
LOAD_BALANCER_ARN | Target load balancer ARN whose AZ should be detached | For example, arn:aws:elasticloadbalancing:us-east-1:11111111111:loadbalancer/net/test-nlb/09121290906ffab7. |
ZONES | Target zones that should be detached from the NLB | For example, us-east-1a. For more information, go to target zones. |
REGION | Region name of the target load balancer | For example, us-east-1. |
Optional tunables
Tunable | Description | Notes |
---|---|---|
TOTAL_CHAOS_DURATION | Duration to insert chaos (in seconds) | Default: 30 s. For more information, go to duration of the chaos. |
CHAOS_INTERVAL | Interval between successive chaos iterations (in seconds) | Default: 30 s. For more information, go to chaos interval. |
AWS_SHARED_CREDENTIALS_FILE | Path to the AWS secret credentials. | Default: /tmp/cloud_config.yml. |
SEQUENCE | Sequence of chaos execution for multiple zones | Default: parallel. Supports serial and parallel. For more information, go to sequence of chaos execution. |
RAMP_TIME | Duration to wait before and after injecting chaos (in seconds) | For example, 30 s. For more information, go to ramp time. |
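As a rough illustration of how the optional tunables can be combined with the mandatory ones, the sketch below sets them as environment variables on the fault. The variable names come from the tables above; the values (60, 30, 10, serial) and the load balancer ARN placeholder are assumptions chosen for illustration, not recommended settings.

```yaml
# illustrative values only; tune them for your own environment
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: nlb-az-down
    spec:
      components:
        env:
        # total duration of the chaos, in seconds (assumed value)
        - name: TOTAL_CHAOS_DURATION
          value: '60'
        # chaos interval, in seconds (assumed value)
        - name: CHAOS_INTERVAL
          value: '30'
        # wait before and after injecting chaos, in seconds (assumed value)
        - name: RAMP_TIME
          value: '10'
        # run chaos on the target zones one after another instead of in parallel
        - name: SEQUENCE
          value: 'serial'
        # mandatory tunables; replace the placeholder with your own ARN
        - name: LOAD_BALANCER_ARN
          value: '<load-balancer-arn>'
        - name: ZONES
          value: 'us-east-1a,us-east-1b'
        - name: REGION
          value: 'us-east-1'
```

Any optional tunable left out of the env list falls back to the default listed in the table.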
Target zones
Comma-separated list of target zones. Tune it by using the ZONES environment variable.
The following YAML snippet illustrates the use of this environment variable:
# contains nlb az down for given zones
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: nlb-az-down
    spec:
      components:
        env:
        # load balancer arn for chaos
        - name: LOAD_BALANCER_ARN
          value: 'arn:aws:elasticloadbalancing:us-east-1:11111111111:loadbalancer/net/test-nlb/09121290906ffab7'
        # target zones for the chaos
        - name: ZONES
          value: 'us-east-1a,us-east-1b'
        # region for chaos
        - name: REGION
          value: 'us-east-1'