SLO probe

Service Level Objective (SLO) probes let you validate the error budget of an SLO while the corresponding application is subjected to chaos. The verdict is based on the percentage change in the error budget. The probe uses the API of the Service Reliability Management (SRM) module to fetch the error budget values for the chaos execution time period. The probe succeeds or fails depending on how far the error budget drops, and you set the permitted percentage drop in the probe configuration.

Probe definition

You can define probes at the `.spec.experiments[].spec.probe` path inside the chaos engine.

kind: Workflow
apiVersion: argoproj.io/v1alpha1
spec:
  templates:
    - inputs:
        artifacts:
          - raw:
              data: |
                apiVersion: litmuschaos.io/v1alpha1
                kind: ChaosEngine
                spec:
                  experiments:
                    - spec:
                        probe:
                          ####################################
                          # Probes are defined here
                          ####################################

Schema

Listed below is the probe schema for the SLO probe, with properties shared across all the probes and properties unique to the SLO probe.

Field	Description	Type	Range	Notes
name	Flag to hold the name of the probe	Mandatory	N/A `type: string`	The `name` holds the name of the probe. It can be set based on the use case.
type	Flag to hold the type of the probe	Mandatory	`httpProbe, k8sProbe, cmdProbe, promProbe, datadogProbe, sloProbe`	The `type` holds the type of the probe. For an SLO probe, set it to `sloProbe`.
mode	Flag to hold the mode of the probe	Mandatory	`EOT, Edge, Continuous, OnChaos`	The `mode` supports four modes of probes. The SLO probe supports the EOT mode because the SRM API is called after the chaos execution.
platformEndpoint	Flag to hold the platform endpoint	Mandatory	N/A `type: string`	The `platformEndpoint` stores the NG manager platform endpoint. For example, `https://app.harness.io/gateway/cv/api`.
sloIdentifier	Flag to hold the identifier of the SLO	Mandatory	N/A `type: string`	The `sloIdentifier` field holds the identifier of the SLO for which the error budget is calculated.
evaluationTimeout	Flag to hold the total evaluation time for the probe	Optional	N/A `type: string`	The `evaluationTimeout` is the time period for which the error budget values are fetched. The percentage change is then calculated based on the chaos execution time period.
insecureSkipVerify	Flag to skip certificate checks	Optional	bool	The `insecureSkipVerify` field contains the flag to skip certificate checks.
sloSourceMetadata	SLO source metadata	Mandatory	string	Comprises the identifiers used to fetch details from the SRM module. It includes the `apiTokenSecret`, which is required to authenticate the request, and the scope of the SLO entity.

SLO source metadata

Field	Description	Type	Range	Notes
apiTokenSecret	Flag to hold the API token secret	Mandatory	N/A `type: string`	The `apiTokenSecret` holds the API token secret. Add the secret with `X-API-KEY` as the key, in the same namespace where the experiment is running.
scope	Flag to hold the scope	Mandatory	N/A	Identifiers of the account, organization, and project (`accountIdentifier`, `orgIdentifier`, and `projectIdentifier`).
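
As an illustration of the secret requirement above, here is a minimal sketch of a Kubernetes Secret that stores the token under the `X-API-KEY` key. The secret name `slo-probe-api-token` and the placeholder values are hypothetical. The sketch also assumes that `apiTokenSecret` refers to the secret by name, which the table does not state explicitly.

```yaml
# Hypothetical Secret for the SLO probe; names and values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: slo-probe-api-token            # hypothetical name, assumed to be what apiTokenSecret refers to
  namespace: "<experiment-namespace>"  # same namespace where the experiment runs
type: Opaque
stringData:
  X-API-KEY: "<api-token-value>"       # the key must be X-API-KEY
```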

Comparator

Field	Description	Type	Range	Notes
type	Flag to hold the type of the data used for comparison	Optional	`float`	The `type` contains the type of the data that is compared in the comparison operation.
criteria	Flag to hold the criteria for the comparison	Mandatory	Supports `>=, <=, ==, >, <, !=, oneOf, between` for int and float types, and `equal, notEqual, contains, matches, notMatches, oneOf` for the string type.	The `criteria` contains the criteria that must be fulfilled by the comparison operation.
value	Flag to hold the value for the comparison	Mandatory	N/A `type: string`	The `value` contains the value for the comparison, which is evaluated against the given criteria.
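
To show how the three comparator fields fit together, here is a small sketch of a comparator block. The threshold `5` is a hypothetical value chosen only for illustration. Reading it as the permitted percentage drop in the error budget is an assumption based on the introduction, not something the table confirms.

```yaml
comparator:
  type: float      # data type used for the comparison
  criteria: "<="   # one of the numeric operators listed in the table, quoted for safety
  value: "5"       # hypothetical threshold, passed as a string
```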

Evaluation window

Field	Description	Type	Range	Notes
evaluationStartTime	Flag to hold the evaluation start time of the probe	Optional	positive integer	It represents the start time of the probe evaluation
evaluationEndTime	Flag to hold the evaluation end time of the probe	Optional	positive integer	It represents the end time of the probe evaluation

Run properties

Field	Description	Type	Range	Notes
probeTimeout	Flag to hold the timeout of the probe	Mandatory	N/A `type: string`	The `probeTimeout` represents the time limit for the probe to execute the specified check and return the expected data.
attempt	Flag to hold the attempt of the probe	Mandatory	N/A `type: integer`	The `attempt` contains the number of times a check is run, when previous attempts fail, before the probe status is declared as failed.
interval	Flag to hold the interval of the probe	Mandatory	N/A `type: string`	The `interval` contains the time that the probe waits between subsequent retries.
initialDelaySeconds	Flag to hold the initial delay interval for the probes	Optional	N/A `type: integer`	The `initialDelaySeconds` represents the initial waiting time interval for the probes.
stopOnFailure	Flag to stop or continue the experiment on probe failure	Optional	N/A `type: boolean`	The `stopOnFailure` can be set to true/false to stop or continue the experiment execution after the probe fails.
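
The run properties in this table can be combined as in the sketch below. All values are hypothetical and chosen only to illustrate the field types. The `runProperties` key itself matches the one used in the probe definition.

```yaml
runProperties:
  probeTimeout: 10s        # time limit for a single check (string)
  attempt: 3               # number of times the check is run before the probe is marked failed
  interval: 2s             # wait time between retries (string)
  initialDelaySeconds: 5   # initial wait before the probe starts (integer)
  stopOnFailure: true      # stop the experiment if the probe fails
```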

Definition

In the case of Dedicated Chaos Infrastructure, the following apply:

The mode and type are mandatory fields in the probe schema when you define the entire configuration of the probe in the manifest (for Kubernetes (Legacy), Linux, and Windows infrastructure).
The name, mode, type, and other input properties (depending on the probe) are required to configure the resilience probe correctly. If any of the necessary details are missing, the probe will not execute.

In the case of Harness Delegate, the following apply:

For Kubernetes (Harness Infrastructure) (also known as DDCR), the mandatory fields are mode and probeID, and the type field is derived. These fields are generated and patched in the backend to the same manifest. However, in the UI, you will only see the mode and probeID fields when configuring your experiment. This is because the manifest is minified in the UI.
If you define the entire probe in task.definition.chaos.probes, the entire configuration is required. If you use task.probeRef, you only need to specify the probeID and mode fields.

probe:
  - name: "slo-probe"
    type: "sloProbe"
    sloProbe/inputs:
      platformEndpoint: "<platform-endpoint>"
      sloIdentifier: "<slo-identifier>"
      evaluationTimeout: 5m
      sloSourceMetadata:
        apiTokenSecret: "<api-token>"
        scope:
          accountIdentifier: "<account-identifier>"
          orgIdentifier: "<org-identifier>"
          projectIdentifier: "<project-identifier>"
        comparator:
          type: float
          criteria: <
          value: "0.1"
    mode: "EOT"
    runProperties:
      attempt: 2
      probeTimeout: 1000ms
      stopOnFailure: false
