Pod network loss

Pod network loss is a Kubernetes pod-level chaos fault that causes packet loss in a specific container by starting a traffic control (tc) process with netem rules to add egress (or ingress) loss.

  • It tests the application's resilience to a lossy (or flaky) network.

Usage

The fault simulates a degraded network with varying percentages of dropped packets between microservices, loss of access to specific third-party (or dependent) services or components, a blackhole against traffic to a given availability zone (AZ failure simulation), and network partitions (split-brain) between peer replicas of a stateful application.

Prerequisites

  • Kubernetes > 1.16.

Default validations

The application pods should be in the running state before and after chaos injection.

Fault tunables

Optional fields

| Variables | Description | Notes |
|---|---|---|
| NETWORK_INTERFACE | Name of the ethernet interface considered for shaping traffic | Defaults to eth0 |
| TARGET_CONTAINER | Name of the container subjected to network loss | Optional. Applicable for containerd and CRI-O runtimes only. Even with these runtimes, if the value is not provided, chaos is injected into the first container of the pod |
| NETWORK_PACKET_LOSS_PERCENTAGE | Packet loss in percentage | Optional. Defaults to 100 |
| CONTAINER_RUNTIME | Container runtime interface for the cluster | Defaults to docker. Supported values: docker, containerd, and crio for the litmus LIB; only docker for the pumba LIB |
| SOCKET_PATH | Path of the containerd/crio/docker socket file | Defaults to `/var/run/docker.sock` |
| TOTAL_CHAOS_DURATION | Duration of chaos injection (in seconds) | Defaults to 60 |
| TARGET_PODS | Comma-separated list of application pod names subjected to pod network loss | If not provided, target pods are selected randomly based on the provided appLabels |
| DESTINATION_IPS | IP addresses of the services or pods, or CIDR blocks (ranges of IPs), to which access is impacted | Comma-separated IPs or CIDRs. If not provided, network chaos is induced for all IPs/destinations |
| DESTINATION_HOSTS | DNS names/FQDNs of the services to which access is impacted | If not provided, network chaos is induced for all IPs/destinations, or for DESTINATION_IPS if already defined |
| PODS_AFFECTED_PERC | Percentage of total pods to target | Defaults to 0 (corresponds to 1 replica). Provide a numeric value only |
| TC_IMAGE | Image used for traffic control in Linux | Defaults to `gaiadocker/iproute2` |
| LIB_IMAGE | Image used to run the netem command | Defaults to `litmuschaos/go-runner:latest` |
| RAMP_TIME | Period to wait before and after chaos injection (in seconds) | For example, 30 |
| SEQUENCE | Sequence of chaos execution for multiple target pods | Defaults to parallel. Supported values: serial, parallel |

Fault examples

Common and pod-specific tunables

Refer to the common attributes and pod-specific tunables for the tunables shared by all faults and those specific to pod-level faults.
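
As a minimal sketch of how some of these tunables fit into a ChaosEngine, the following manifest targets a share of the matching pods one after another, with a wait period before and after injection. The manifest skeleton mirrors the other examples on this page; the values '50', 'serial', and '30' are illustrative assumptions, not recommendations.

```yaml
# illustrative sketch: pod-selection and timing tunables (values are assumptions)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-loss
    spec:
      components:
        env:
        # percentage of matching pods to target (numeric value only)
        - name: PODS_AFFECTED_PERC
          value: '50'
        # inject chaos into the target pods one at a time
        - name: SEQUENCE
          value: 'serial'
        # seconds to wait before and after chaos injection
        - name: RAMP_TIME
          value: '30'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
```

With SEQUENCE left at its default of parallel, all selected pods would be targeted at the same time.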

Network Packet Loss

It defines the network packet loss percentage to be injected in the targeted application. You can tune it using the NETWORK_PACKET_LOSS_PERCENTAGE ENV.

Use the following example to tune it:

# it injects network-loss for the egress traffic
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-loss
    spec:
      components:
        env:
        # network packet loss percentage
        - name: NETWORK_PACKET_LOSS_PERCENTAGE
          value: '100'
        - name: TOTAL_CHAOS_DURATION
          value: '60'

Destination IPs And Destination Hosts

By default, network faults interrupt traffic to all IPs/hosts. To limit the interruption to specific IPs/hosts, set the DESTINATION_IPS and DESTINATION_HOSTS ENVs.

  • DESTINATION_IPS: It contains the IP addresses of the services or pods, or the CIDR blocks (ranges of IPs), to which access is impacted.
  • DESTINATION_HOSTS: It contains the DNS names/FQDNs of the services to which access is impacted.

Use the following example to tune it:

# it injects the chaos for the egress traffic for specific ips/hosts
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-loss
    spec:
      components:
        env:
        # supports comma separated destination ips
        - name: DESTINATION_IPS
          value: '8.8.8.8,192.168.5.6'
        # supports comma separated destination hosts
        - name: DESTINATION_HOSTS
          value: 'nginx.default.svc.cluster.local,google.com'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
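
DESTINATION_IPS is also described as accepting CIDR blocks, which the example above does not show. The following sketch assumes a hypothetical 10.0.0.0/24 subnet and an illustrative 30 percent loss rate; neither value comes from the original documentation.

```yaml
# illustrative sketch: partial packet loss towards a CIDR block (values are assumptions)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-loss
    spec:
      components:
        env:
        # hypothetical subnet; replace with the CIDR of the dependency under test
        - name: DESTINATION_IPS
          value: '10.0.0.0/24'
        # drop roughly 30% of packets instead of the default 100%
        - name: NETWORK_PACKET_LOSS_PERCENTAGE
          value: '30'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
```

Scoping the loss to a single subnet is one way to approximate the availability-zone blackhole scenario, assuming the zone maps to a known IP range.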

Network Interface

It defines the name of the ethernet interface considered for shaping traffic. You can tune it using the NETWORK_INTERFACE ENV. Its default value is eth0.

Use the following example to tune it:

# provide the network interface
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-loss
    spec:
      components:
        env:
        # name of the network interface
        - name: NETWORK_INTERFACE
          value: 'eth0'
        - name: TOTAL_CHAOS_DURATION
          value: '60'

Container runtime and socket path

The CONTAINER_RUNTIME and SOCKET_PATH ENVs set the container runtime and the socket file path.

  • CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is docker.
  • SOCKET_PATH: It contains the path of the docker socket file by default (/var/run/docker.sock). For containerd, specify the path as /var/containerd/containerd.sock. For crio, specify the path as /var/run/crio/crio.sock.

Use the following example to tune it:

## provide the container runtime and socket file path
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-loss
    spec:
      components:
        env:
        # runtime for the container
        # supports docker, containerd, crio
        - name: CONTAINER_RUNTIME
          value: 'docker'
        # path of the socket file
        - name: SOCKET_PATH
          value: '/var/run/docker.sock'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
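
For a non-docker runtime, the same two ENVs change together. The sketch below assumes a CRI-O cluster and uses the crio socket path given above. It also sets TARGET_CONTAINER, which the tunables table lists as applicable to containerd and CRI-O only; the container name 'nginx' is a hypothetical value chosen for illustration.

```yaml
# illustrative sketch: CRI-O runtime with an explicit target container
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-network-loss
    spec:
      components:
        env:
        # runtime for the container
        - name: CONTAINER_RUNTIME
          value: 'crio'
        # crio socket path as given in this section
        - name: SOCKET_PATH
          value: '/var/run/crio/crio.sock'
        # hypothetical container name; if omitted, the first container of the pod is used
        - name: TARGET_CONTAINER
          value: 'nginx'
        - name: TOTAL_CHAOS_DURATION
          value: '60'
```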