
Chaos Faults

Fault execution is triggered when the chaosengine resource is created (examples are provided under the respective faults). Typically, chaosengines are embedded within the 'steps' of a chaos fault. You can also create a chaosengine manually; the chaos-operator then reconciles the resource and triggers the fault execution.
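
To make the manual path more concrete, here is a minimal sketch in Python that assembles a chaosengine manifest and prints it as JSON. It assumes the LitmusChaos-style `litmuschaos.io/v1alpha1` ChaosEngine schema; the helper `build_chaos_engine`, the engine name `nginx-chaos`, the label `app=nginx`, and the service account `litmus-admin` are hypothetical placeholders, so check the individual fault docs for the fields your fault actually requires.

```python
import json


def build_chaos_engine(name, namespace, app_label, fault, duration_s=30):
    """Return a ChaosEngine manifest as a plain dict.

    Hypothetical helper: field names follow the litmuschaos.io/v1alpha1
    schema as an assumption, not something this page specifies.
    """
    return {
        "apiVersion": "litmuschaos.io/v1alpha1",
        "kind": "ChaosEngine",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # "active" asks the chaos-operator to start the fault.
            "engineState": "active",
            "appinfo": {
                "appns": namespace,
                "applabel": app_label,
                "appkind": "deployment",
            },
            # Placeholder service account; use one with the fault's RBAC.
            "chaosServiceAccount": "litmus-admin",
            "experiments": [
                {
                    "name": fault,
                    "spec": {
                        "components": {
                            "env": [
                                # Env values are strings in the manifest.
                                {
                                    "name": "TOTAL_CHAOS_DURATION",
                                    "value": str(duration_s),
                                }
                            ]
                        }
                    },
                }
            ],
        },
    }


engine = build_chaos_engine("nginx-chaos", "default", "app=nginx", "pod-delete")
manifest_json = json.dumps(engine, indent=2)
print(manifest_json)
```

If the printed JSON is saved to a file, it could be created on the cluster with `kubectl apply -f <file>`, at which point the chaos-operator would be expected to reconcile it as described above.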

The tables below link to the individual fault docs for easy navigation.

Kubernetes Faults

Kubernetes faults disrupt the resources running on a cluster. They can be categorized into pod-level faults and node-level faults.

Pod Chaos

| Fault Name | Description | User Guide |
| --- | --- | --- |
| Container Kill | Kills the container in the application pod | container-kill |
| Disk Fill | Fills up the ephemeral storage of a resource | disk-fill |
| Pod Autoscaler | Scales the application replicas and tests node autoscaling on the cluster | pod-autoscaler |
| Pod CPU Hog Exec | Consumes CPU resources on the application container by invoking a utility within the app container base image | pod-cpu-hog-exec |
| Pod CPU Hog | Consumes CPU resources on the application container | pod-cpu-hog |
| Pod Delete | Deletes the application pods | pod-delete |
| Pod DNS Error | Disrupts DNS resolution in a Kubernetes pod | pod-dns-error |
| Pod DNS Spoof | Spoofs DNS resolution in a Kubernetes pod | pod-dns-spoof |
| Pod IO Stress | Injects IO stress on the application container | pod-io-stress |
| Pod Memory Hog Exec | Consumes memory resources on the application container by invoking a utility within the app container base image | pod-memory-hog-exec |
| Pod Memory Hog | Consumes memory resources on the application container | pod-memory-hog |
| Pod Network Corruption | Injects network packet corruption into the application pod | pod-network-corruption |
| Pod Network Duplication | Injects network packet duplication into the application pod | pod-network-duplication |
| Pod Network Latency | Injects network latency into the application pod | pod-network-latency |
| Pod Network Loss | Injects network loss into the application pod | pod-network-loss |
| Pod HTTP Latency | Injects HTTP latency into the application pod | pod-http-latency |
| Pod HTTP Reset Peer | Injects HTTP reset peer into the application pod | pod-http-reset-peer |
| Pod HTTP Status Code | Injects HTTP status code chaos into the application pod | pod-http-status-code |
| Pod HTTP Modify Body | Injects HTTP modify body into the application pod | pod-http-modify-body |
| Pod HTTP Modify Header | Injects HTTP modify header into the application pod | pod-http-modify-header |

Node Chaos

| Fault Name | Description | User Guide |
| --- | --- | --- |
| Docker Service Kill | Kills the docker service on the application node | docker-service-kill |
| Kubelet Service Kill | Kills the kubelet service on the application node | kubelet-service-kill |
| Node CPU Hog | Exhausts CPU resources on the Kubernetes node | node-cpu-hog |
| Node Drain | Drains the target node | node-drain |
| Node IO Stress | Injects IO stress on the application node | node-io-stress |
| Node Memory Hog | Exhausts memory resources on the Kubernetes node | node-memory-hog |
| Node Restart | Restarts the target node | node-restart |
| Node Taint | Taints the target node | node-taint |

Cloud Infrastructure

Chaos faults that inject chaos into the platform resources of Kubernetes are classified in this category. Because the management of these resources differs significantly from platform to platform, you can maintain chaos charts separately for each platform (for example, AWS, GCP, and Azure).

The following platform chaos faults are available:

AWS

| Fault Name | Description | User Guide |
| --- | --- | --- |
| EC2 Stop By ID | Stops EC2 instances using the instance IDs | ec2-stop-by-id |
| EC2 Stop By Tag | Stops the EC2 instance using the instance tag | ec2-stop-by-tag |
| EBS Loss By ID | Detaches the EBS volume using the volume ID | ebs-loss-by-id |
| EBS Loss By Tag | Detaches the EBS volume using the volume tag | ebs-loss-by-tag |
| EC2 CPU Hog | Injects CPU stress chaos on the EC2 instance | ec2-cpu-hog |
| EC2 Memory Hog | Injects memory stress chaos on the EC2 instance | ec2-memory-hog |
| EC2 IO Stress | Injects IO stress chaos on the EC2 instance | ec2-io-stress |
| EC2 HTTP Latency | Injects HTTP latency for services running on the EC2 instances | ec2-http-latency |
| EC2 HTTP Reset Peer | Injects connection reset for services running on the EC2 instances | ec2-http-reset-peer |
| EC2 HTTP Status Code | Modifies the HTTP response status code for services running on the EC2 instances | ec2-http-status-code |
| EC2 HTTP Modify Body | Modifies the HTTP response body for services running on the EC2 instances | ec2-http-modify-body |
| EC2 HTTP Modify Header | Modifies HTTP request or response headers for services running on the EC2 instances | ec2-http-modify-header |
| EC2 Network Loss | Injects network loss on the target EC2 instance(s) | ec2-network-loss |
| EC2 Network Latency | Injects network latency on the target EC2 instance(s) | ec2-network-latency |
| EC2 DNS Chaos | Injects DNS faults on the target EC2 instance(s) | ec2-dns-chaos |
| ECS Task Stop | Injects task stop chaos on ECS tasks | ecs-task-stop |
| ECS Container CPU Hog | Injects container CPU hog chaos on ECS task containers | ecs-container-cpu-hog |
| ECS Container IO Stress | Injects container IO stress chaos on ECS task containers | ecs-container-io-stress |
| ECS Container Memory Hog | Injects container memory hog chaos on ECS task containers | ecs-container-memory-hog |
| ECS Container Network Latency | Injects container network latency chaos on ECS task containers | ecs-container-network-latency |
| ECS Container Network Loss | Injects container network loss chaos on ECS task containers | ecs-container-network-loss |
| ECS Agent Stop | Injects ECS agent stop chaos on the target ECS cluster | ecs-agent-stop |
| ECS Instance Stop | Injects ECS instance stop chaos on the target ECS cluster | ecs-instance-stop |
| RDS Instance Delete | Injects RDS instance delete chaos on the target RDS instance/cluster | rds-instance-delete |
| RDS Instance Reboot | Injects RDS instance reboot chaos on the target RDS instance/cluster | rds-instance-reboot |
| Lambda Delete Event Source Mapping | Deletes the event source mapping of the target Lambda function | lambda-delete-event-source-mapping |
| Lambda Toggle Event Mapping State | Toggles the event mapping state of the target Lambda function | lambda-toggle-event-mapping-state |
| Lambda Update Function Memory | Updates the Lambda function memory limit | lambda-update-function-memory |
| Lambda Update Function Timeout | Updates the Lambda function timeout value | lambda-update-function-timeout |
| Lambda Update Role Permission | Updates (or changes) the role attached to the Lambda function | lambda-update-role-permission |
| Lambda Delete Function Concurrency | Deletes the reserved concurrency of the Lambda function | lambda-delete-function-concurrency |

GCP

| Fault Name | Description | User Guide |
| --- | --- | --- |
| GCP VM Instance Stop | Stops GCP VM instances using the VM names | gcp-vm-instance-stop |
| GCP VM Disk Loss | Detaches the GCP disk | gcp-vm-disk-loss |
| GCP VM Instance Stop By Label | Stops GCP VM instances using label selectors | gcp-vm-instance-stop-by-label |
| GCP VM Disk Loss By Label | Detaches the GCP disk using label selectors | gcp-vm-disk-loss-by-label |

Azure

| Fault Name | Description | User Guide |
| --- | --- | --- |
| Azure Instance Stop | Stops Azure VM instances | azure-instance-stop |
| Azure Instance CPU Hog | Injects CPU stress chaos on the Azure instance | azure-instance-cpu-hog |
| Azure Instance Memory Hog | Injects memory stress chaos on the Azure instance | azure-instance-memory-hog |
| Azure Instance IO Stress | Injects IO stress chaos on the Azure instance | azure-instance-io-stress |
| Azure Disk Loss | Detaches the Azure disk from the instance | azure-disk-loss |
| Azure Web App Stop | Stops an Azure web app service | azure-web-app-stop |
| Azure Web App Access Restrict | Adds an access restriction for the target web app service | azure-web-app-access-restrict |

VMware

| Fault Name | Description | User Guide |
| --- | --- | --- |
| VMware VM Poweroff | Powers off VMware VMs using the MOIDs | vmware-vmpoweroff |
| VMware DNS Chaos | Injects DNS errors on the VMware VM(s) | vmware-dns-chaos |
| VMware Network Loss | Injects network loss on the target VM(s) | vmware-network-loss |
| VMware Network Latency | Injects network latency on the target VM(s) | vmware-network-latency |
| VMware HTTP Latency | Adds HTTP latency to the services running on the VM(s) | vmware-http-latency |
| VMware HTTP Reset Peer | Simulates connection loss for HTTP requests on the services running on the VM(s) | vmware-http-reset-peer |
| VMware HTTP Modify Response | Modifies the HTTP response on services running on the VM(s) | vmware-http-modify-response |
| VMware VM Process Kill | Kills the processes running in the VMware VMs using the PROCESS_IDS | vmware-process-kill |
| VMware VM CPU Hog | Consumes CPU resources on a Linux-based VMware VM | vmware-cpu-hog |
| VMware VM Memory Hog | Consumes memory resources on a Linux-based VMware VM | vmware-memory-hog |
| VMware VM IO Stress | Causes disk stress on the target VMware VMs | vmware-io-stress |
| VMware VM Service Stop | Stops the target systemd services running on a Linux-based VMware VM | vmware-service-stop |
| VMware VM Disk Loss | Detaches the disks attached to a Linux-based VMware VM | vmware-disk-loss |
| VMware Host Reboot | Reboots a VMware host attached to the vCenter | vmware-host-reboot |

Kube Resilience

| Fault Name | Description | User Guide |
| --- | --- | --- |
| Kubelet Density | Checks kubelet resilience for a specific node | kubelet-density |