Namespace considerations

Last updated on Jul 10, 2025

This section discusses the steps required to run chaos experiments on specific namespaces.

Kubernetes clusters are vast, multi-layered, and customizable. A simple misconfiguration or vulnerability introduced by libraries poses a threat to the Kubernetes environment. These threats allow users with malicious intent to gather information about your environment without being detected, gain backdoor access to your application, or escalate privileges to steal secrets.

You can allow your chaos experiments to run in specific namespaces, thereby limiting the exposure of all the services (including business-critical and potentially sensitive workflows) of your application.

If your application fails, the above-mentioned technique also helps pin-point the exact cluster, namespace, and services in your Kubernetes environment that failed.

Step 1. Install CE in cluster mode

Install CE in cluster mode with the given installation manifest.
Restrict hce service account to certain target namespaces.

Step 2. Delete `hce` ClusterRole and ClusterRoleBinding

Once CE is up and running in cluster mode, delete the hce ClusterRole and ClusterRoleBinding to restrict the chaos scope in all namespaces.

$> kubectl delete clusterrole hce

clusterrole.rbac.authorization.k8s.io "hce" deleted

$> kubectl delete clusterrolebinding hce

clusterrolebinding.rbac.authorization.k8s.io "hce" deleted

Step 3. Create Role and RoleBinding in all target namespaces

This allows specific namespaces for chaos operations. Create Role and RoleBinding in these specific namespaces (say namespaceA and namespaceB).

The manifest for role-1.yaml in namespaceA will look as shown below.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: namespaceA-chaos
  namespace: litmus
  labels:
    name: namespaceA-chaos
    app.kubernetes.io/part-of: litmus
rules:
  # Create and monitor the experiment & helper pods
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create","delete","get","list","patch","update", "deletecollection"]
  # Performs CRUD operations on the events inside chaos engine and chaos result
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create","get","list","patch","update"]
  # Fetch configmaps details and mount it to the experiment pod (if specified)
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get","list",]
  # Track and get the runner, experiment, and helper pods log
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get","list","watch"]
  # for creating and monitoring liveness services or monitoring target app services during chaos injection
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["create", "get", "list"]
  # for creating and managing to execute commands inside target container
  - apiGroups: [""]
    resources: ["pods/exec", "pods/eviction", "replicationcontrollers"]
    verbs: ["get", "list", "create"]
  # for checking the app parent resources as deployments or sts and are eligible chaos candidates
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["list", "get", "patch", "update"]
  # deriving the parent/owner details of the pod(if parent is deploymentConfig)
  - apiGroups: ["apps.openshift.io"]
    resources: ["deploymentconfigs"]
    verbs: ["list","get"]
  # deriving the parent/owner details of the pod(if parent is deploymentConfig)
  - apiGroups: [""]
    resources: ["replicationcontrollers"]
    verbs: ["get","list"]
  # deriving the parent/owner details of the pod(if parent is argo-rollouts)
  - apiGroups: ["argoproj.io"]
    resources: ["rollouts"]
    verbs: ["list","get"]
  # for configuring and monitor the experiment job by the chaos-runner pod
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["create","list","get","delete","deletecollection"]
  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow
  - apiGroups: ["litmuschaos.io"]
    resources: ["chaosengines","chaosexperiments","chaosresults"]
    verbs: ["create","list","get","patch","update","delete"]
  # for checking the app parent resources as replicasets and are eligible chaos candidates
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["list", "get"]
  # performs CRUD operations on the network policies
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["create","delete","list","get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: namespaceA-chaos
  namespace: litmus
  labels:
    name: namespaceA-chaos
    app.kubernetes.io/part-of: litmus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: namespaceA-chaos
subjects:
- kind: ServiceAccount
  name: litmus-admin
  namespace: litmus

The manifest for role-2.yaml in namespaceB will look as shown below.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: namespaceB-chaos
  namespace: namespaceB
  labels:
    name: namespaceB-chaos
    app.kubernetes.io/part-of: litmus
rules:
  # Create and monitor the experiment & helper pods
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create","delete","get","list","patch","update", "deletecollection"]
  # Performs CRUD operations on the events inside chaos engine and chaos result
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create","get","list","patch","update"]
  # Fetch configmaps details and mount it to the experiment pod (if specified)
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get","list",]
  # Track and get the runner, experiment, and helper pods log
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get","list","watch"]
  # for creating and monitoring liveness services or monitoring target app services during chaos injection
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["create", "get", "list"]
  # for creating and managing to execute commands inside target container
  - apiGroups: [""]
    resources: ["pods/exec", "pods/eviction", "replicationcontrollers"]
    verbs: ["get", "list", "create"]
  # for checking the app parent resources as deployments or sts and are eligible chaos candidates
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["list", "get", "patch", "update"]
  # deriving the parent/owner details of the pod(if parent is deploymentConfig)
  - apiGroups: ["apps.openshift.io"]
    resources: ["deploymentconfigs"]
    verbs: ["list","get"]
  # deriving the parent/owner details of the pod(if parent is deploymentConfig)
  - apiGroups: [""]
    resources: ["replicationcontrollers"]
    verbs: ["get","list"]
  # deriving the parent/owner details of the pod(if parent is argo-rollouts)
  - apiGroups: ["argoproj.io"]
    resources: ["rollouts"]
    verbs: ["list","get"]
  # for configuring and monitor the experiment job by the chaos-runner pod
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["create","list","get","delete","deletecollection"]
  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow
  - apiGroups: ["litmuschaos.io"]
    resources: ["chaosengines","chaosexperiments","chaosresults"]
    verbs: ["create","list","get","patch","update","delete"]
  # for checking the app parent resources as replicasets and are eligible chaos candidates
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["list", "get"]
  # performs CRUD operations on the network policies
  - apiGroups: ["networking.k8s.io"]
    resources: ["networkpolicies"]
    verbs: ["create","delete","list","get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: namespaceB-chaos
  namespace: namespaceB
  labels:
    name: namespaceB-chaos
    app.kubernetes.io/part-of: litmus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: namespaceB-chaos
subjects:
- kind: ServiceAccount
  name: litmus-admin
  namespace: litmus

info

The rolebinding subjects point to the hce service account only (in HCE namespace).

Create the roles

$> kubectl apply -f role-1.yaml

role.rbac.authorization.k8s.io/namespaceA-chaos created rolebinding.rbac.authorization.k8s.io/namespaceA-chaos created

$> kubectl apply -f role-2.yaml

role.rbac.authorization.k8s.io/namespaceB-chaos created rolebinding.rbac.authorization.k8s.io/namespaceB-chaos created

Step 4. Create Role and RoleBinding in all target namespaces

Create a Role and RoleBinding in the chaos namespace as well. This will be used by chaos runner pod to launch experiment.

Step 5. Verify the chaos execution on different namespaces

Run an experiment, say pod-delete in both the namespaces to verify if these namespaces have the permission to run experiments.

For namespaceA:

namespace A

For namespaceA:

namespace B

You are able to run chaos on the target namespaces (namespaceA and namespaceB). Now, you can check to see if namespace test can be used to run chaos experiments.

For test namespace:

The experiment fails to run and experiment pod logs.

time="2022-11-14T06:47:47Z" level=info msg="Experiment Name: pod-delete"
time="2022-11-14T06:47:47Z" level=info msg="[PreReq]: Getting the ENV for the pod-delete experiment"
time="2022-11-14T06:47:49Z" level=info msg="[PreReq]: Updating the chaos result of pod-delete experiment (SOT)"
time="2022-11-14T06:47:51Z" level=info msg="The application information is as follows" Namespace=test Label="app=nginx" Chaos Duration=30
time="2022-11-14T06:47:51Z" level=info msg="[Status]: Verify that the AUT (Application Under Test) is running (pre-chaos)"
time="2022-11-14T06:47:51Z" level=info msg="[Status]: Checking whether application containers are in ready state"
time="2022-11-14T06:50:53Z" level=error msg="Application status check failed, err: Unable to find the pods with matching labels, err: pods is forbidden: User \"system:serviceaccount:litmus:litmus-admin\" cannot list resource \"pods\" in API group \"\" in the namespace \"test\""

You have successfully restricted CE to run chaos on certain namespaces instead of all namespaces. kubectl delete clusterrole hce

Step 1. Install CE in cluster mode​

Step 2. Delete hce ClusterRole and ClusterRoleBinding​

Step 3. Create Role and RoleBinding in all target namespaces​

Create the roles​

Step 4. Create Role and RoleBinding in all target namespaces​

Step 5. Verify the chaos execution on different namespaces​