EC2 HTTP status code
EC2 HTTP status code injects HTTP chaos that affects the request (or response) by modifying the status code (or the body or the headers). It does so by starting a proxy server and redirecting the traffic through that proxy.
Use cases
EC2 HTTP status code:
- Tests the application's resilience to erroneous HTTP response codes from the application server.
- Simulates unavailability of specific API services (503, 404).
- Simulates unavailability of specific APIs for (or from) a given microservice (TBD or Path Filter) (404).
- Simulates unauthorized requests for third-party services (401 or 403) and API malfunction, that is, internal server errors (50x).
Prerequisites
- Kubernetes >= 1.17
- SSM agent is installed and running in the target EC2 instance.
- The EC2 instance should be in a healthy state.
- You can pass the VM credentials as secrets or as a ChaosEngine environment variable.
- The Kubernetes secret should have the AWS Access Key ID and Secret Access Key credentials in the CHAOS_NAMESPACE. Below is the sample secret file:
apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  cloud_config.yml: |-
    # Add the cloud AWS credentials respectively
    [default]
    aws_access_key_id = XXXXXXXXXXXXXXXXXXX
    aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
HCE recommends that you use the same secret name, that is, cloud-secret. Otherwise, you need to update the AWS_SHARED_CREDENTIALS_FILE environment variable in the fault template with the new secret name, and you won't be able to use the default health check probes.
Below is an example AWS policy to execute the fault.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetDocument",
        "ssm:DescribeDocument",
        "ssm:GetParameter",
        "ssm:GetParameters",
        "ssm:SendCommand",
        "ssm:CancelCommand",
        "ssm:CreateDocument",
        "ssm:DeleteDocument",
        "ssm:GetCommandInvocation",
        "ssm:UpdateInstanceInformation",
        "ssm:DescribeInstanceInformation"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2messages:AcknowledgeMessage",
        "ec2messages:DeleteMessage",
        "ec2messages:FailMessage",
        "ec2messages:GetEndpoint",
        "ec2messages:GetMessages",
        "ec2messages:SendReply"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstanceStatus",
        "ec2:DescribeInstances"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
- To use a different profile for AWS faults, go to AWS named profile for chaos. For a single policy that covers every AWS fault, go to the superset permission/policy to execute all AWS faults.
- To tune the tunables that are common to all the faults, go to common tunables.
Mandatory tunables
Tunable | Description | Notes |
---|---|---|
EC2_INSTANCE_ID | ID of the target EC2 instance. | For example, i-044d3cb4b03b8af1f. For more information, go to EC2 instance ID. |
REGION | The AWS region ID where the EC2 instance has been created. | For example, us-east-1. |
TARGET_SERVICE_PORT | Port of the service to target. | Default: port 80. For more information, go to target service port. |
STATUS_CODE | Modified status code for the HTTP response. | If no value is provided, a random value is selected from the list of supported values. Multiple values can be provided as a comma-separated list, in which case a random value from that list is selected. Supported values: [200, 201, 202, 204, 300, 301, 302, 304, 307, 400, 401, 403, 404, 500, 501, 502, 503, 504]. Default: random status code. For more information, go to modifying the response status code. |
MODIFY_RESPONSE_BODY | Whether to modify the body as per the status code provided. | If true, then the body is replaced by a default template for the status code. Default: True. |
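As a rough illustration, the mandatory tunables can be set together in a single ChaosEngine. This is a minimal sketch, not a verified manifest: it reuses the engine name and service account from the other snippets on this page, the instance ID and region are the placeholder examples from the table, and 503 is just one value picked from the supported list.

```yaml
# Sketch only: instance ID, region, and status code are placeholder values
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-http-status-code
    spec:
      components:
        env:
        # ID of the target EC2 instance (example value from the table)
        - name: EC2_INSTANCE_ID
          value: "i-044d3cb4b03b8af1f"
        # AWS region where the instance was created (example value from the table)
        - name: REGION
          value: "us-east-1"
        # port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
        # status code to return instead of the real one
        - name: STATUS_CODE
          value: "503"
        # replace the body with the default template for the status code
        - name: MODIFY_RESPONSE_BODY
          value: "true"
```

Replace the instance ID and region with those of the instance under test before using a manifest like this.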
Optional tunables
Tunable | Description | Notes |
---|---|---|
TOTAL_CHAOS_DURATION | Duration that you specify, through which chaos is injected into the target resource (in seconds). | Default: 30 s. For more information, go to duration of the chaos. |
CHAOS_INTERVAL | Time interval between two successive chaos iterations (in seconds). | Default: 30 s. For more information, go to chaos interval. |
AWS_SHARED_CREDENTIALS_FILE | Path to the AWS secret credentials file. | Default: /tmp/cloud_config.yml. |
SEQUENCE | Sequence of chaos execution for multiple instances. | Default: parallel. Supports serial and parallel. For more information, go to sequence of chaos execution. |
RAMP_TIME | Period to wait before and after injection of chaos (in seconds). | For example, 30 s. For more information, go to ramp time. |
INSTALL_DEPENDENCY | Whether to install the dependencies used to run the network chaos. It can be either True or False. | If the dependency already exists, you can turn it off. Default: True. |
PROXY_PORT | Port where the proxy will be listening for requests. | Default: 20000. For more information, go to proxy port. |
TOXICITY | Percentage of HTTP requests to be affected. | Default: 100. For more information, go to toxicity. |
NETWORK_INTERFACE | Network interface to be used for the proxy. | Default: eth0. For more information, go to network interface. |
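The optional tunables are set the same way as the mandatory ones, as additional entries under env. The following is a minimal sketch under stated assumptions: the 60-second duration and 30-second ramp time are illustrative values only, and INSTALL_DEPENDENCY is turned off on the assumption that the dependencies are already present on the instance.

```yaml
# Sketch only: the duration and ramp time are illustrative values
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-http-status-code
    spec:
      components:
        env:
        # inject chaos for 60 seconds (illustrative; the default is 30)
        - name: TOTAL_CHAOS_DURATION
          value: "60"
        # wait 30 seconds before and after the injection (illustrative)
        - name: RAMP_TIME
          value: "30"
        # skip the dependency installation (assumes it is already installed)
        - name: INSTALL_DEPENDENCY
          value: "False"
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
```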
Target service port
Port of the target service. Tune it by using the TARGET_SERVICE_PORT environment variable.
The following YAML snippet illustrates the use of this environment variable:
## provide the port of the targeted service
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-http-status-code
    spec:
      components:
        env:
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
Modifying the response status code
Modified status code for the HTTP response. Tune it by using the STATUS_CODE environment variable.
Note: HTTP_CHAOS_TYPE should be provided as status_code.
The following YAML snippet illustrates the use of this environment variable:
## provide the modified status code for the HTTP response
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-http-status-code
    spec:
      components:
        env:
        # modified status code for the http response
        # if no value is provided, a random status code from the supported code list will be selected
        # if multiple comma-separated values are provided, then a random value
        # from the provided list will be selected
        # if an invalid status code is provided, the fault will fail
        # supported status code list:
        # [200, 201, 202, 204, 300, 301, 302, 304, 307, 400, 401, 403, 404, 500, 501, 502, 503, 504]
        - name: STATUS_CODE
          value: "500"
        # whether to modify the body as per the status code provided
        - name: MODIFY_RESPONSE_BODY
          value: "true"
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
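Because STATUS_CODE also accepts a comma-separated list, a variation of the snippet above can pick randomly from several codes. The sketch below assumes the list is written without spaces (the page does not spell out the exact separator format) and leaves the original response body untouched by setting MODIFY_RESPONSE_BODY to false.

```yaml
# Sketch only: assumes a comma-separated list without spaces is accepted
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-http-status-code
    spec:
      components:
        env:
        # a random value from this list is selected
        # every value must be in the supported status code list
        - name: STATUS_CODE
          value: "404,503"
        # keep the original body and change only the status code
        - name: MODIFY_RESPONSE_BODY
          value: "false"
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
```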
Proxy port
Port where the proxy server listens for requests. Tune it by using the PROXY_PORT environment variable.
The following YAML snippet illustrates the use of this environment variable:
## provide the port for proxy server
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-http-status-code
    spec:
      components:
        env:
        # provide the port for proxy server
        - name: PROXY_PORT
          value: "8080"
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
Toxicity
Percentage of the total number of HTTP requests that are affected. Tune it by using the TOXICITY environment variable.
The following YAML snippet illustrates the use of this environment variable:
## provide the toxicity
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-http-status-code
    spec:
      components:
        env:
        # toxicity is the probability of the request to be affected
        # provide the percentage value in the range of 0-100
        # 0 means no requests will be affected and 100 means all requests will be affected
        - name: TOXICITY
          value: "100"
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
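A partial toxicity is often more useful than 100 when testing retry logic, because some requests still succeed. The following sketch uses 50 as an illustrative value, so that roughly half of the requests would receive the modified status code while the rest pass through unchanged.

```yaml
# Sketch only: 50 is an illustrative value in the 0-100 range
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-http-status-code
    spec:
      components:
        env:
        # each request has a 50 percent probability of being affected
        - name: TOXICITY
          value: "50"
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
```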
Network interface
Network interface used for the proxy. Tune it by using the NETWORK_INTERFACE environment variable.
The following YAML snippet illustrates the use of this environment variable:
## provide the network interface for proxy
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-http-status-code
    spec:
      components:
        env:
        # provide the network interface for proxy
        - name: NETWORK_INTERFACE
          value: "eth0"
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
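The primary interface is not named eth0 on every instance, so the default may need to be overridden. The sketch below uses ens5 purely as an assumed example name, not something this page specifies; check the actual interface name on the target instance (for example, with `ip link`) before setting it.

```yaml
# Sketch only: ens5 is an assumed interface name, verify it on the instance
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: ec2-http-status-code
    spec:
      components:
        env:
        # interface that carries the traffic of the targeted service
        - name: NETWORK_INTERFACE
          value: "ens5"
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"
```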