Time chaos
Time chaos is a Kubernetes pod-level fault that introduces controlled time offsets to disrupt the system time of the target pod.
Use cases
Time Chaos:
- Simulate scenarios where TLS certificates expire while the system is in operation. This allows them to assess how their applications, services, or infrastructure handle expired certificates in real-time.
- It is used to identify potential weaknesses in the system's ability to recover and handle time-related faults, leading to improvements in fault-tolerant designs and system resilience.
- It can be used in various simulations to mimic real-world scenarios where time synchronization or manipulation is critical.
Permissions required
Below is a sample Kubernetes role that defines the permissions required to execute the fault.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: hce
name: time-chaos
spec:
definition:
scope: Cluster # Supports "Namespaced" mode too
permissions:
- apiGroups: [""]
resources: ["pods"]
verbs: ["create", "delete", "get", "list", "patch", "deletecollection", "update"]
- apiGroups: [""]
resources: ["events"]
verbs: ["create", "get", "list", "patch", "update"]
- apiGroups: [""]
resources: ["pods/log"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["deployments, statefulsets"]
verbs: ["get", "list"]
- apiGroups: [""]
resources: ["replicasets, daemonsets"]
verbs: ["get", "list"]
- apiGroups: [""]
resources: ["chaosEngines", "chaosExperiments", "chaosResults"]
verbs: ["create", "delete", "get", "list", "patch", "update"]
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["create", "delete", "get", "list", "deletecollection"]
Prerequisites
- Kubernetes > 1.16 is required to execute this fault.
- The application pods should be in the running state before and after injecting chaos.
Mandatory tunables
Tunable | Description | Notes |
---|---|---|
OFFSET | Offset value used to modify the system time | Default: 3600s. For more information, go to offset. |
Optional tunables
Tunable | Description | Notes |
---|---|---|
TOTAL_CHAOS_DURATION | Duration for which to insert chaos (in seconds). | Default: 60 s. For more information, go to duration of the chaos. |
NODE_LABEL | Node label used to filter the target node if TARGET_NODE environment variable is not set. | It is mutually exclusive with the TARGET_NODE environment variable. If both are provided, the fault uses TARGET_NODE . For more information, go to node label. |
CLOCK_IDS | Comma separated clock ids of the target system clock | Default: CLOCK_REALTIME. For more information, go to offset. |
TARGET_CONTAINER | Name of the container subject to time chaos | If this value is not provided, the fault selects the first container of the target pod. For more information, go to target specific container |
CONTAINER_RUNTIME | Container runtime interface for the cluster | Default: containerd. Supports docker, containerd and crio. For more information, go to container runtime . |
SOCKET_PATH | Path of the containerd or crio or docker socket file. | Defaults to /run/containerd/containerd.sock . For more information, go to socket path. |
TARGET_PODS | Comma-separated list of application pod names subject to chaos | If not provided, the fault selects target pods randomly based on provided appLabels. For more information, go to target specific pods. |
PODS_AFFECTED_PERC | Percentage of total pods to target. Provide numeric values. | Default: 0 (corresponds to 1 replica). For more information, go to pod affected percentage. |
SEQUENCE | Sequence of chaos execution for multiple target pods. | Default: parallel. Supports serial and parallel. For more information, go to sequence of chaos execution. |
LIB_IMAGE | Image used to inject chaos. | Default: harness/chaos-go-runner:main-latest . For more information, go to image used by the helper pod. |
RAMP_TIME | Period to wait before and after injecting chaos (in seconds). | For example, 30 s. For more information, go to ramp time. |
Offset and Clock IDs
The OFFSET
and CLOCK_IDS
environment variables set the offset and clock ids, respectively.
OFFSET
: Offset value used to modify the system time. It should match with^(\d+)(ms|s|m|h)$
regex.CLOCK_IDS
: Comma separated clock ids of the target system clock. Refer to 'uapi/linux/time.h' for more details.
The following YAML snippet illustrates the use of this environment variable:
## provide the offset and clock ids
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: engine-nginx
spec:
engineState: "active"
annotationCheck: "false"
appinfo:
appns: "default"
applabel: "app=nginx"
appkind: "deployment"
chaosServiceAccount: litmus-admin
experiments:
- name: time-chaos
spec:
components:
env:
# time offset
- name: OFFSET
value: "3600s"
# clock ids of the target system
- name: CLOCK_IDS
value: "CLOCK_REALTIME"
- name: TOTAL_CHAOS_DURATION
VALUE: "60"
Container runtime and socket path
The CONTAINER_RUNTIME
and SOCKET_PATH
environment variables set the container runtime and socket file path, respectively.
CONTAINER_RUNTIME
: It supportsdocker
,containerd
, andcrio
runtimes. The default value iscontainerd
.SOCKET_PATH
: It contains path ofcontainerd
socket file by default(/run/containerd/containerd.sock
). Fordocker
, specify the path as/var/run/docker.sock
. Forcrio
, specify the path as/var/run/crio/crio.sock
.
The following YAML snippet illustrates the use of this environment variable:
## provide the container runtime and socket file path
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: engine-nginx
spec:
engineState: "active"
annotationCheck: "false"
appinfo:
appns: "default"
applabel: "app=nginx"
appkind: "deployment"
chaosServiceAccount: litmus-admin
experiments:
- name: time-chaos
spec:
components:
env:
# time offset
- name: OFFSET
value: "3600s"
# runtime for the container
# supports docker, containerd, crio
- name: CONTAINER_RUNTIME
value: "containerd"
# path of the socket file
- name: SOCKET_PATH
value: "/run/containerd/containerd.sock"
- name: TOTAL_CHAOS_DURATION
VALUE: "60"