SLO probe

Service Level Objective (SLO) probes let you validate the error budget of a given SLO while the corresponding application is subjected to chaos, and determine the verdict from the percentage change in the error budget. The probe uses the API of the Service Reliability Management (SRM) module to fetch the error budget values for the chaos execution time period. The probe succeeds or fails based on the percentage drop in the error budget values; you define the acceptable drop in the probe configuration.

Probe definition

You can define the probes at the .spec.experiments[].spec.probe path inside the chaos engine.

kind: Workflow
apiVersion: argoproj.io/v1alpha1
spec:
  templates:
    - inputs:
        artifacts:
          - raw:
              data: |
                apiVersion: litmuschaos.io/v1alpha1
                kind: ChaosEngine
                spec:
                  experiments:
                    - spec:
                        probe:
                          ####################################
                          # Probes are defined here
                          ####################################

Schema

Listed below is the probe schema for the SLO probe, with properties shared across all the probes and properties unique to the SLO probe.

| Field | Description | Type | Range | Notes |
| --- | --- | --- | --- | --- |
| name | Flag to hold the name of the probe | Mandatory | N/A (type: string) | The name of the probe. It can be set based on the use case. |
| type | Flag to hold the type of the probe | Mandatory | httpProbe, k8sProbe, cmdProbe, promProbe, and datadogProbe | The type supports five types of probes: httpProbe, k8sProbe, cmdProbe, promProbe, and datadogProbe. For the SLO probe, set type to sloProbe, as shown in the Definition section. |
| mode | Flag to hold the mode of the probe | Mandatory | EOT, Edge, Continuous, OnChaos | The mode supports four modes of probes. The SLO probe supports EOT mode, because the SRM API is called after the chaos execution. |
| platformEndpoint | Flag to hold the platform endpoint | Mandatory | N/A (type: string) | The platformEndpoint stores the NG manager platform endpoint. For example, https://app.harness.io/gateway/cv/api |
| sloIdentifier | Flag to hold the identifier of the SLO | Mandatory | N/A (type: string) | The sloIdentifier is the identifier of the SLO for which the error budget is calculated. |
| evaluationTimeout | Flag to hold the total evaluation time for the probe | Optional | N/A (type: string) | The evaluationTimeout is the time period for which the error budget values are fetched. The percentage change is calculated based on the chaos execution time period. |
| insecureSkipVerify | Flag to skip certificate checks | Optional | bool | The insecureSkipVerify flag skips certificate checks. |
| sloSourceMetadata | SLO source metadata | Mandatory | string | Comprises the identifiers used to fetch details from the SRM module. It includes apiTokenSecret, which is required to authenticate the request, and the scope of the SLO entity. |

SLO source metadata

| Field | Description | Type | Range | Notes |
| --- | --- | --- | --- | --- |
| apiTokenSecret | Flag to hold the API token secret | Mandatory | N/A (type: string) | The apiTokenSecret contains the API token. The secret should be added with X-API-KEY as the key and should be present in the same namespace where the experiment is running. |
| scope | Flag to hold the scope | Mandatory | N/A | Identifiers such as accountIdentifier, orgIdentifier, and projectIdentifier. |
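
The secret requirement described above can be sketched as a Kubernetes Secret manifest. This is a minimal example under stated assumptions: the secret name slo-probe-api-token and the namespace hce are hypothetical placeholders that do not come from this page. Only the X-API-KEY key name and the same-namespace requirement are taken from the table.

```yaml
# Hypothetical secret holding the API token used by the SLO probe.
apiVersion: v1
kind: Secret
metadata:
  name: slo-probe-api-token   # assumed name, chosen for illustration
  namespace: hce              # assumed; use the namespace where the experiment runs
type: Opaque
stringData:
  X-API-KEY: "<api-token>"    # key name required by the probe; the value is your API token
```

Using stringData lets you supply the token as plain text; Kubernetes stores it base64-encoded under data.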

Comparator

| Field | Description | Type | Range | Notes |
| --- | --- | --- | --- | --- |
| type | Flag to hold the type of the data used for comparison | Optional | float | The type of the data to be compared in the comparison operation. |
| criteria | Flag to hold the criteria for the comparison | Mandatory | Supports >=, <=, ==, >, <, !=, oneOf, between for int and float types, and equal, notEqual, contains, matches, notMatches, oneOf for the string type. | The criteria that must be fulfilled in the comparison operation. |
| value | Flag to hold the value for the comparison | Mandatory | N/A (type: string) | The value used in the comparison, which must satisfy the given criteria. |

Evaluation window

| Field | Description | Type | Range | Notes |
| --- | --- | --- | --- | --- |
| evaluationStartTime | Flag to hold the evaluation start time of the probe | Optional | positive integer | It represents the start time of the probe evaluation. |
| evaluationEndTime | Flag to hold the evaluation end time of the probe | Optional | positive integer | It represents the end time of the probe evaluation. |

Run properties

| Field | Description | Type | Range | Notes |
| --- | --- | --- | --- | --- |
| probeTimeout | Flag to hold the timeout of the probe | Mandatory | N/A (type: string) | The probeTimeout is the time limit for the probe to execute the specified check and return the expected data. |
| attempt | Flag to hold the attempt of the probe | Mandatory | N/A (type: integer) | The attempt is the number of times a check is run, after failures in previous attempts, before the probe status is declared failed. |
| interval | Flag to hold the interval of the probe | Mandatory | N/A (type: string) | The interval is the time the probe waits between subsequent retries. |
| initialDelaySeconds | Flag to hold the initial delay interval for the probe | Optional | N/A (type: integer) | The initialDelaySeconds is the initial waiting time for the probe. |
| stopOnFailure | Flag to stop or continue the experiment on probe failure | Optional | N/A (type: boolean) | The stopOnFailure flag can be set to true or false to stop or continue the experiment execution after the probe fails. When the experiment is stopped because the probe failed, the experiment and its associated faults are aborted. This applies only to chaos experiments that use a Kubernetes infrastructure (dedicated infrastructure or Harness Delegate). |
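
To show how the run properties above fit together, here is an illustrative runProperties block that sets every field from the table. The values are assumptions chosen for the example, not recommended defaults, and the duration strings assume the same unit-suffix style (such as 1000ms and 5m) that appears in the Definition section.

```yaml
runProperties:
  probeTimeout: 10s        # assumed value: time limit for a single check
  attempt: 3               # assumed value: checks run before the probe is marked failed
  interval: 2s             # assumed value: wait between retries
  initialDelaySeconds: 5   # optional, assumed value: wait before the probe starts
  stopOnFailure: true      # optional: abort the experiment and its faults if the probe fails
```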

Definition

probe:
  - name: "slo-probe"
    type: "sloProbe"
    sloProbe/inputs:
      platformEndpoint: "<platform-endpoint>"
      sloIdentifier: "<slo-identifier>"
      evaluationTimeout: 5m
      sloSourceMetadata:
        apiTokenSecret: "<api-token>"
        scope:
          accountIdentifier: "<account-identifier>"
          orgIdentifier: "<org-identifier>"
          projectIdentifier: "<project-identifier>"
      comparator:
        type: float
        criteria: <
        value: "0.1"
    mode: "EOT"
    runProperties:
      attempt: 2
      probeTimeout: 1000ms
      stopOnFailure: false