Pod autoscaler
Pod autoscaler is a Kubernetes pod-level chaos fault that determines whether nodes can accomodate multiple replicas of a given application pod. This fault examines the node auto-scaling feature by determining whether the pods were successfully rescheduled within a specified time frame if the existing nodes are running at the specified limits.
Use cases
Pod autoscaler determines how an application accomodates multiple replicas of a given application pod at unexpected times.
Permissions required
Below is a sample Kubernetes role that defines the permissions required to execute the fault.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: hce
name: pod-autoscaler
spec:
definition:
scope: Cluster # Supports "Namespaced" mode too
permissions:
- apiGroups: [""]
resources: ["pods"]
verbs: ["create", "delete", "get", "list", "patch", "deletecollection", "update"]
- apiGroups: [""]
resources: ["events"]
verbs: ["create", "get", "list", "patch", "update"]
- apiGroups: [""]
resources: ["pods/log"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["deployments, statefulsets"]
verbs: ["get", "list", "patch", "update"]
- apiGroups: [""]
resources: ["chaosEngines", "chaosExperiments", "chaosResults"]
verbs: ["create", "delete", "get", "list", "patch", "update"]
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["create", "delete", "get", "list", "deletecollection"]
Prerequisites
- Kubernetes > 1.16
- The application pods should be in the running state before and after injecting chaos.
Mandatory tunables
Tunable | Description | Notes |
---|---|---|
REPLICA_COUNT | Number of replicas you wish to scale to. | nil . For more information, go to replica counts |
NODE_LABEL | Node label used to filter the target node if TARGET_NODE environment variable is not set. | It is mutually exclusive with the TARGET_NODE environment variable. If both are provided, the fault uses TARGET_NODE . For more information, go to node label. |
Optional tunables
Tunable | Description | Notes |
---|---|---|
TOTAL_CHAOS_DURATION | Duration for which to insert chaos (in seconds). | Default: 60 s. For more information, go to duration of the chaos. |
RAMP_TIME | Period to wait before and after injecting chaos (in seconds). | For example, 30 s. For more information, go to ramp time |
Replica counts
Number of replicas that need to be present in the target application during chaos. Tune it by using the REPLICA_COUNT
environment variable.
The following YAML snippet illustrates the use of this environment variable:
# provide the number of replicas
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
name: engine-nginx
spec:
engineState: "active"
annotationCheck: "false"
appinfo:
appns: "default"
applabel: "app=nginx"
appkind: "deployment"
chaosServiceAccount: litmus-admin
experiments:
- name: pod-autoscaler
spec:
components:
env:
# number of replica, needs to scale
- name: REPLICA_COUNT
value: "3"
- name: TOTAL_CHAOS_DURATION
VALUE: "60"