Container kill
Container kill is a Kubernetes pod-level chaos fault that causes container failure on specific or random replicas of an application resource.
Use cases
Container kill:
- Tests an application's deployment sanity (replica availability and uninterrupted service) and recovery workflow when certain replicas are not available.
- Tests the recovery of pods that possess sidecar containers.
Permissions required
Below is a sample Kubernetes role that defines the permissions required to execute the fault.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: hce
  name: container-kill
spec:
  definition:
    scope: Cluster # Supports "Namespaced" mode too
permissions:
  # core API group: pods, events, and pod logs
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "delete", "get", "list", "patch", "deletecollection", "update"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "get", "list", "patch", "update"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list", "watch"]
  # workload controllers belong to the "apps" API group
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["get", "list"]
  - apiGroups: ["apps"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["get", "list"]
  # chaos custom resources belong to the "litmuschaos.io" API group
  - apiGroups: ["litmuschaos.io"]
    resources: ["chaosengines", "chaosexperiments", "chaosresults"]
    verbs: ["create", "delete", "get", "list", "patch", "update"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["create", "delete", "get", "list", "deletecollection"]
Prerequisites
- Kubernetes > 1.16
- The application pods should be in the running state before and after injecting chaos.
Optional tunables
| Tunable | Description | Notes |
|---|---|---|
| TARGET_CONTAINER | Name of the container that is killed. | If it is not provided, the fault kills the first container of the pod. For more information, go to kill specific container. |
| CHAOS_INTERVAL | Time interval between two successive container kills (in seconds). | Default: 10 s. For more information, go to chaos interval. |
| TOTAL_CHAOS_DURATION | Duration for which to insert chaos (in seconds). | Default: 20 s. For more information, go to duration of the chaos. |
| PODS_AFFECTED_PERC | Percentage of total pods to target. It takes numeric values only. | Default: 0 (corresponds to 1 replica). For more information, go to pod affected percentage. |
| TARGET_PODS | Comma-separated list of application pod names subject to container kill. | If it is not provided, target pods are selected randomly based on the appLabels provided. For more information, go to target specific pods. |
| NODE_LABEL | Node label used to filter the target node if the TARGET_NODE environment variable is not set. | It is mutually exclusive with the TARGET_NODE environment variable. If both are provided, the fault uses TARGET_NODE. For more information, go to node label. |
| RAMP_TIME | Period to wait before injecting chaos (in seconds). | For example, 30 s. For more information, go to ramp time. |
| SEQUENCE | Sequence of chaos execution for multiple target pods. | Default: parallel. Supports serial and parallel. For more information, go to sequence of chaos execution. |
| SIGNAL | Termination signal used for container kill. | Default: SIGKILL. For more information, go to signal for kill. |
| SOCKET_PATH | Path to the containerd/crio/docker socket file. | Default: /run/containerd/containerd.sock. For more information, go to socket path. |
| CONTAINER_RUNTIME | Container runtime interface for the cluster. | Default: containerd. Supports docker, containerd, and crio. For more information, go to container runtime. |
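Several of the tunables in the table can be combined in one ChaosEngine. The following is a minimal sketch, not a recommended configuration: the values are chosen only for illustration, and the engine name, namespace, label, and service account are assumptions that mirror the other examples on this page.

```yaml
# illustrative combination of optional tunables (values are assumptions)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: container-kill
      spec:
        components:
          env:
            # wait 30 seconds before injecting chaos
            - name: RAMP_TIME
              value: "30"
            # inject chaos for 60 seconds in total
            - name: TOTAL_CHAOS_DURATION
              value: "60"
            # wait 10 seconds between two successive container kills
            - name: CHAOS_INTERVAL
              value: "10"
            # target 50 percent of the pods that match the label
            - name: PODS_AFFECTED_PERC
              value: "50"
            # process the target pods one after another
            - name: SEQUENCE
              value: "serial"
```

All values are quoted because environment variable values in a pod spec must be strings.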
Kill specific container
Name of the target container. Tune it by using the TARGET_CONTAINER environment variable. If the TARGET_CONTAINER environment variable is empty, the fault uses the first container of the target pod.
The following YAML snippet illustrates the use of this environment variable:
# kill the specific target container
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: container-kill
      spec:
        components:
          env:
            # name of the target container
            - name: TARGET_CONTAINER
              value: "nginx"
            - name: TOTAL_CHAOS_DURATION
              value: "60"
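To exercise the sidecar recovery use case, TARGET_CONTAINER can be pointed at a sidecar and combined with TARGET_PODS. The sketch below assumes a sidecar container named log-shipper and two pod names; all three names are hypothetical placeholders and must be replaced with names from your own cluster.

```yaml
# kill a sidecar container in specific pods (container and pod names are hypothetical)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: container-kill
      spec:
        components:
          env:
            # hypothetical sidecar container name
            - name: TARGET_CONTAINER
              value: "log-shipper"
            # hypothetical pod names, comma-separated
            - name: TARGET_PODS
              value: "nginx-pod-a,nginx-pod-b"
            - name: TOTAL_CHAOS_DURATION
              value: "60"
```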
Container runtime and socket path
Use the CONTAINER_RUNTIME and SOCKET_PATH environment variables to set the container runtime and socket file path, respectively.
- CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
- SOCKET_PATH: It contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For docker, specify the path as /var/run/docker.sock. For crio, specify the path as /var/run/crio/crio.sock.
The following YAML snippet illustrates the use of these environment variables:
## provide the container runtime and socket file path
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: container-kill
      spec:
        components:
          env:
            # runtime for the container
            # supports docker, containerd, crio
            - name: CONTAINER_RUNTIME
              value: "containerd"
            # path of the socket file
            - name: SOCKET_PATH
              value: "/run/containerd/containerd.sock"
            - name: TOTAL_CHAOS_DURATION
              value: "60"
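On a cluster that uses a different runtime, only these two values change. As a sketch for crio, using the socket path listed in this section (verify the actual socket location on your nodes, since distributions can place it elsewhere):

```yaml
## sketch: container runtime and socket path for a crio cluster
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: container-kill
      spec:
        components:
          env:
            # runtime for the container
            - name: CONTAINER_RUNTIME
              value: "crio"
            # socket path documented for crio in this section
            - name: SOCKET_PATH
              value: "/var/run/crio/crio.sock"
            - name: TOTAL_CHAOS_DURATION
              value: "60"
```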
Signal for kill
Linux signal passed when killing the container. Its default value is SIGKILL. Tune it by using the SIGNAL environment variable.
The following YAML snippet illustrates the use of this environment variable:
# specific Linux signal passed while killing the container
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: container-kill
      spec:
        components:
          env:
            # signal passed while killing the container
            # defaults to SIGKILL
            - name: SIGNAL
              value: "SIGKILL"
            - name: TOTAL_CHAOS_DURATION
              value: "60"
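In contrast to the snippet above, the following sketch sends SIGTERM, which gives the container's main process a chance to shut down gracefully before it is restarted. It assumes that the application handles SIGTERM and that the configured runtime accepts this signal.

```yaml
# sketch: send SIGTERM to exercise the graceful-shutdown path
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: litmus-admin
  experiments:
    - name: container-kill
      spec:
        components:
          env:
            # signal passed while killing the container
            - name: SIGNAL
              value: "SIGTERM"
            - name: TOTAL_CHAOS_DURATION
              value: "60"
```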