Create and run chaos experiments

Harness Chaos Engineering (HCE) lets you create elaborate chaos experiments that model complex, real-life failure scenarios against which you can validate your applications. The experiments are also declarative, and you can construct them in the Chaos Studio user interface with no programmatic intervention.

A chaos experiment is composed of chaos faults that are arranged in a specific order to create a failure scenario. The chaos faults target various aspects of an application, including the constituent microservices and underlying infrastructure. You can tune the parameters associated with these faults to impart the desired chaos behavior.

For more information, go to flow of control in a chaos experiment.

Construct a chaos experiment

To add a chaos experiment:

  1. In Harness, navigate to Chaos > Chaos Experiments. Select + New Experiment.

    Chaos Experiments page

  2. In the Experiment Overview, enter the experiment Name and, optionally, a Description and Tags. In Select a Chaos Infrastructure, select the infrastructure where the target resources reside, and then click Next.

    Experiment Overview

Tip: For more information on infrastructure, go to Connect chaos infrastructures.

  3. This takes you to the Experiment Builder tab, where you can choose how to start building your experiment.

    Experiment Builder

  4. Select how you want to build the experiment. The options are:

    • Blank Canvas - Lets you build the experiment from scratch, adding the specific faults you want.
    • Templates from ChaosHubs - Lets you preview and select an experiment from pre-curated experiment templates available in ChaosHubs.
    • Upload YAML - Lets you upload an experiment manifest YAML file.

  5. The Experiment Builder tab is displayed. Click Add to add a fault to the experiment.

Experiment Builder tab with Add button

  6. Select each fault you want to add to the experiment, one at a time.

Select Faults

  7. For each fault you select, tune the fault's properties. The available properties vary from fault to fault.

    • To tune each fault:

      • Specify the target application (only for pod-level Kubernetes faults): This determines which application's pods the fault targets.

      target app

      • Tune fault parameters: Every fault has a set of common parameters, such as the chaos duration and ramp time, and a set of unique parameters that may be customised as needed.

      • Add chaos probes: (Optional) On the Probes tab, you can add chaos probes to automate the chaos hypothesis checks for a fault during the experiment execution. Probes are declarative checks that validate the criteria an experiment must meet to be declared as passed.

      • Tune fault weightage: Set the weight for the fault, which sets the importance of the fault relative to the other faults in the experiment. This is used to calculate the resilience score of the experiment.

      Tune Fault
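As a rough illustration of how fault weights can feed into a resilience score, the sketch below computes a weighted average of per-fault probe success percentages. The formula, the `FaultResult` structure, the fault names, and the sample numbers are assumptions made for illustration; check the HCE resilience score documentation for the exact calculation.

```python
from dataclasses import dataclass


@dataclass
class FaultResult:
    """Outcome of one fault in an experiment run (hypothetical structure)."""
    name: str
    weight: int                # weight assigned when tuning the fault
    probe_success_pct: float   # share of probes that passed, 0 to 100


def resilience_score(results: list[FaultResult]) -> float:
    """Weighted average of probe success percentages (assumed formula)."""
    total_weight = sum(r.weight for r in results)
    if total_weight == 0:
        raise ValueError("at least one fault must have a non-zero weight")
    weighted = sum(r.weight * r.probe_success_pct for r in results)
    return weighted / total_weight


# Hypothetical run: the heavier fault passed all probes, the lighter one half.
results = [
    FaultResult("pod-delete", weight=7, probe_success_pct=100.0),
    FaultResult("pod-network-loss", weight=3, probe_success_pct=50.0),
]
score = resilience_score(results)
print(f"Resilience score: {score:.1f}%")  # Resilience score: 85.0%
```

Under this assumed formula, a fault with a higher weight pulls the score further toward its own probe success percentage, which is why the weight expresses the fault's relative importance.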

Construct the chaos experiment using one of the three options mentioned earlier, and then save it.

Save experiment options

  • Select Save to save the experiment to the Chaos Experiments page. You can add it to a ChaosHub later.
  • Select Add Experiment to ChaosHub to save this experiment as a template in a selected ChaosHub.

Run the experiment

You can either run the experiment right away by selecting Run at the top of the page, or create a recurring schedule for it by selecting the Schedule tab.

Advanced experiment setup options

When creating an experiment for a Kubernetes chaos infrastructure, you can select Advanced Options on the Experiment Builder tab to configure the options described below:

Advanced Options

General options

Node Selector

Specifies the node on which the experiment pods will be scheduled. Provide the node label as a key-value pair.

  • Can be used with node-level faults to avoid the scheduling of the experiment pod on the target node(s).

  • Can be used to limit the scheduling of the experiment pods on nodes that have an unsupported OS.

    Node Selector
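To make the node selector semantics concrete, here is a minimal sketch of the matching rule Kubernetes applies: a pod can be scheduled only on a node that carries every label listed in the selector. The node names and the `chaos-role` label are hypothetical, and the function only models the rule; it is not part of HCE or Kubernetes.

```python
def node_matches_selector(node_labels: dict[str, str],
                          node_selector: dict[str, str]) -> bool:
    """Return True if the node carries every key-value pair in the selector."""
    return all(node_labels.get(k) == v for k, v in node_selector.items())


# Hypothetical cluster: node-a is the target of a node-level fault.
nodes = {
    "node-a": {"kubernetes.io/os": "linux", "chaos-role": "target"},
    "node-b": {"kubernetes.io/os": "linux", "chaos-role": "runner"},
    "node-c": {"kubernetes.io/os": "windows", "chaos-role": "runner"},
}

# Node selector entered in Advanced Options as key-value pairs.
selector = {"kubernetes.io/os": "linux", "chaos-role": "runner"}

eligible = [name for name, labels in nodes.items()
            if node_matches_selector(labels, selector)]
print(eligible)  # ['node-b']
```

In this example the selector keeps the experiment pod off both the target node and the node with a different OS, which mirrors the two use cases listed above.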

Toleration

Specifies the tolerations to apply to the experiment pods so that they can be scheduled on nodes with matching taints. For more information on taints and tolerations, go to the Kubernetes documentation.

  • Can be used with node-level faults to avoid the scheduling of the experiment pod on the target node(s).

  • Can be used to limit the scheduling of the experiment pods on nodes that have an unsupported OS.

    Toleration
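The following sketch models, in simplified form, how Kubernetes decides whether a pod's tolerations allow it onto a tainted node. The taint and toleration values are hypothetical, and the helper functions are illustrative only; the real scheduler handles more cases (for example, `PreferNoSchedule` and `tolerationSeconds`).

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if a single toleration matches a single taint."""
    # An empty effect on the toleration matches every taint effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    if operator == "Exists":
        # An empty key with Exists tolerates every taint.
        return key == "" or key == taint["key"]
    # Equal: both the key and the value must match.
    return (key == taint["key"]
            and toleration.get("value", "") == taint.get("value", ""))


def pod_can_schedule(tolerations: list[dict], taints: list[dict]) -> bool:
    """A pod is repelled unless every NoSchedule/NoExecute taint is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in tolerations)
        for taint in taints
        if taint["effect"] in ("NoSchedule", "NoExecute")
    )


# Hypothetical node reserved for chaos workloads.
taints = [{"key": "dedicated", "value": "chaos", "effect": "NoSchedule"}]
tolerations = [{"key": "dedicated", "operator": "Equal",
                "value": "chaos", "effect": "NoSchedule"}]

print(pod_can_schedule(tolerations, taints))  # True
print(pod_can_schedule([], taints))           # False
```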

Annotations

Specifies the annotations to be added to the experiment pods. Provide the annotations as key-value pairs. For more information on annotations, go to the Kubernetes documentation.

  • Can be used for bypassing network proxies enforced by service mesh tools like Istio.

    Annotations
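As one example, Istio recognizes the pod annotation `sidecar.istio.io/inject` with the value `"false"` to skip sidecar injection for a pod. The sketch below merges user-provided annotations into a pod manifest represented as a dictionary; the `with_annotations` helper and the manifest shape are assumptions for illustration and do not describe how HCE applies annotations internally.

```python
import copy


def with_annotations(pod_manifest: dict, annotations: dict[str, str]) -> dict:
    """Return a copy of the manifest with the annotations merged into metadata."""
    merged = copy.deepcopy(pod_manifest)
    metadata = merged.setdefault("metadata", {})
    metadata.setdefault("annotations", {}).update(annotations)
    return merged


# Hypothetical experiment pod manifest, reduced to the relevant fields.
pod = {"metadata": {"name": "experiment-pod"}, "spec": {}}

annotated = with_annotations(pod, {"sidecar.istio.io/inject": "false"})
print(annotated["metadata"]["annotations"])
```

Annotation values are always strings in Kubernetes, which is why `"false"` is quoted rather than written as a boolean.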

Security options

Enable runAsUser

Specifies the user ID to be used for starting all the processes in the experiment pod containers. By default, user ID 1000 is used.

  • Allows privileged access or restricted access for experiment pods.

    runAsUser

Enable runAsGroup

Specifies the group ID to be used for starting all the processes in the experiment pod containers instead of a user ID.

  • Allows privileged access or restricted access for experiment pods.

    runAsGroup
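In Kubernetes, user and group IDs for container processes are set through the `runAsUser` and `runAsGroup` fields of a security context. The sketch below builds such a fragment from the two options; the `build_security_context` helper is hypothetical, and it is an assumption that HCE maps these options to the fields in exactly this way.

```python
from typing import Optional


def build_security_context(run_as_user: Optional[int] = 1000,
                           run_as_group: Optional[int] = None) -> dict:
    """Build a Kubernetes-style securityContext fragment from the two options."""
    context: dict[str, int] = {}
    for field, value in (("runAsUser", run_as_user),
                         ("runAsGroup", run_as_group)):
        if value is None:
            continue  # option not enabled, leave the field out
        if value < 0:
            raise ValueError(f"{field} must be a non-negative integer")
        context[field] = value
    return context


default_context = build_security_context()
root_context = build_security_context(run_as_user=0, run_as_group=0)

print(default_context)  # {'runAsUser': 1000}
print(root_context)     # {'runAsUser': 0, 'runAsGroup': 0}
```

An ID of 0 corresponds to root, so choosing it gives the experiment pod processes privileged access, while a non-zero ID such as the default 1000 keeps them restricted.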