Manage SLOs using Prometheus metrics data
Background on Service Level Objectives
In technology, the adage that you cannot improve what you do not measure holds very true. Indicators and measurements of how well a system is performing are represented by the Service Level (SLx) commitments: a trio of metrics, SLAs, SLOs, and SLIs, that together paint a picture of the agreement made versus the objectives and actuals needed to meet it. This tutorial focuses on SLOs, or Service Level Objectives: the goals to meet in your system.
Service Level Objectives are goals that need to be met in order to meet Service Level Agreements [SLAs]. Looking at Tom Wilkie’s RED Method can help you come up with good metrics for SLOs: rate (requests per second), errors, and duration. Google’s Four Golden Signals (latency, traffic, errors, and saturation) are also great metrics to base SLOs on; they cover similar ground but add saturation.
For example, the business might define an SLA of “we require 99% uptime”. The SLO to make that happen would be “we need to reply in 1000 ms or less 99% of the time”.
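To make that concrete, here is a minimal sketch of how such an SLI could be computed in PromQL. The metric name below, http_request_duration_seconds, is a hypothetical histogram used only for illustration (it assumes a 1-second bucket exists and is not from the sample app used later in this tutorial):

# Fraction of requests over the last 5 minutes that completed in <= 1 second;
# meeting the SLO means this ratio stays at or above 0.99.
sum(rate(http_request_duration_seconds_bucket{le="1"}[5m]))
  /
sum(rate(http_request_duration_seconds_count[5m]))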
Managing and Measuring Your SLOs
Drawing conclusions can be tricky, especially when data comes from different sources and services. If your organization had one and only one service, the system and business knowledge about that service would be easy to disseminate. That is rarely the case: as the number of services increases, domain expertise no longer stays with a single individual.
A myth about SLOs is that they are static in nature. As technology, capabilities, and features change, SLOs need to adapt with them. In the age of dial-up internet, the example SLO of “we need to reply in 1000 ms or less 99% of the time” would have been impossible. As cloud infrastructure and internet speeds improved over the decades, that SLO became very achievable.
SLIs are used to measure your SLOs; SLO management would not be possible without them. In the response-time example, the SLI is the actual response time, which the SLO tracks against. In the example below, we will set up an SLO and its SLI.
Getting Started with SLO Management
Harness provides a module called Service Reliability Management [SRM] to help with your SLO management. If you have not already, request a Harness SRM Account. Once signed up, the next step to on-ramp you to the Harness Platform is to install a Harness Delegate.
In this example, we will use Prometheus, an open source monitoring solution, to collect metrics from an example application. The Open Observability Group has a sample application that can be deployed to Kubernetes and emits Prometheus metrics.
Install Prometheus
An easy way to install Prometheus on your Kubernetes cluster is to use Helm.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm upgrade --install prometheus prometheus-community/prometheus \
--namespace prometheus --create-namespace
Once installed, there are a few ways to access the Prometheus web UI. Exposing it publicly is not recommended for workloads of substance, but for this example you can expose it via a NodePort.
kubectl expose deployment -n prometheus prometheus-server --type=NodePort --name=prometheus-service
With a NodePort, you access the Service deployed on the cluster via your browser with node_public_ip:nodeport.
You can find a Kubernetes node’s public IP by running:
kubectl get nodes -o wide
Then grab the NodePort:
kubectl get svc -n prometheus
In this case, the node_ip:nodeport combo is http://35.223.10.37:31796.
Note: If you are using a managed Kubernetes offering, e.g. EKS/GKE/AKS, by default your firewall might not allow TCP traffic over the NodePort range. You can open access for each specific NodePort or allow the whole range, TCP ports 30000-32767.
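For example, on GKE you could open the NodePort range with a firewall rule on the default network and then confirm Prometheus answers on its health endpoint (the rule name here is arbitrary, and the address is the node_ip:nodeport from above):

gcloud compute firewall-rules create allow-nodeports --allow tcp:30000-32767
curl -s http://35.223.10.37:31796/-/healthy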
Now you are ready to deploy an application that writes to Prometheus.
Deploying an Application That Writes to Prometheus
Following the Open Observability Group’s Sample Application, you can build from source or use a rendition that we have already built.
kubectl apply -f https://raw.githubusercontent.com/harness-apps/developer-hub-apps/main/applications/prometheus-sample-app/prometheus-sample-app-k8s-deployment.yaml
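You can verify the sample application is running before moving on. This assumes the manifest labels the pod app=prometheus-sample-app (the same label used for filtering later in this tutorial) and deploys to your current namespace:

kubectl get pods -l app=prometheus-sample-app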
With the application installed, you can explore some metrics with Prometheus, then wire those metrics into Harness.
Prometheus Metrics
Prometheus groups metrics in several ways. There are four primitive metric types that Prometheus supports: counters, gauges, histograms, and summaries. Querying these metrics is handled by Prometheus’s query language, PromQL.
If this is your first time delving into Prometheus, or you just want to find out more about what your applications are sending in, this Prometheus Blog is a good resource for exploring your metrics when the metric names are unknown. Below is a PromQL query that lists all available metrics in your Prometheus instance.
group by(__name__) ({__name__!=""})
Running that query, you will notice all four metric types being written by the example application.
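If you prefer the command line, the same list of metric names is available from Prometheus’s HTTP API (substitute your own node_ip:nodeport address):

curl -s http://35.223.10.37:31796/api/v1/label/__name__/values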
The test_gauge0 metric is a good one to take a look at. A Gauge in Prometheus is a metric that represents a single numerical value that can go up and down. The sample application is designed to increase and decrease this gauge over time; if this were a real-life gauge, it could represent something like memory pressure or response time.
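You can graph just this metric in the Prometheus UI by filtering on the app label that the sample application attaches, which is the same filter used later when wiring in the SLI:

test_gauge0{app="prometheus-sample-app"}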
With this metric in hand, you can now start managing an SLO around it.
Getting Started With Your First SLO
Wiring your service metrics/telemetry into Harness SRM as SLOs requires creating a few Harness objects. If you have not already, request to sign up for a Harness SRM Account. If this is your first time leveraging Harness: Harness has a concept of Projects, and the Default Project is more than adequate for wiring in your first SLO.
Install Delegate
You will also need to wire in a Delegate if you have not done so already.
What is a Delegate?
Harness Delegate is a lightweight worker process that is installed on your infrastructure and communicates only via outbound HTTP/HTTPS to the Harness Platform. This enables the Harness Platform to leverage the delegate for executing the CI/CD and other tasks on your behalf, without any of your secrets leaving your network.
You can install the Harness Delegate on either Docker or Kubernetes.
Create New Delegate Token
Log in to the Harness Platform and go to Account Settings -> Account Resources -> Delegates. Click on the Tokens tab. Click +New Token and give your token the name `firstdeltoken`. When you click Apply, a new token is generated for you. Click the copy button to copy and store the token in a temporary file for now. You will provide this token as an input parameter in the delegate installation step below. The delegate will use this token to authenticate with the Harness Platform.

Get Your Harness Account ID
Along with the delegate token, you will also need to provide your Harness accountId as an input parameter to the delegate installation. This accountId is present in every Harness URL. For example, in the URL https://app.harness.io/ng/#/account/6_vVHzo9Qeu9fXvj-AcQCb/settings/overview, the accountId is `6_vVHzo9Qeu9fXvj-AcQCb`.
Now you are ready to install the delegate on either Docker or Kubernetes.
- Docker
- Kubernetes
Prerequisite
Ensure that you have the Docker runtime installed on your host. If not, follow the instructions at https://docs.docker.com/engine/install/ to install Docker.
Now you can install the delegate using the following command.
docker run -d --name="firstdockerdel" --cpus="0.5" --memory="2g" \
-e DELEGATE_NAME=firstdockerdel \
-e NEXT_GEN=true \
-e DELEGATE_TYPE=DOCKER \
-e ACCOUNT_ID=PUT_YOUR_HARNESS_ACCOUNTID_HERE \
-e DELEGATE_TOKEN=PUT_YOUR_DELEGATE_TOKEN_HERE \
-e MANAGER_HOST_AND_PORT=PUT_YOUR_MANAGER_HOST_AND_PORT_HERE \
harness/delegate:22.11.77436
`PUT_YOUR_MANAGER_HOST_AND_PORT_HERE` should be replaced by the Harness Manager Endpoint noted below. For Harness SaaS accounts, you can find your Harness Cluster Location on the Account Overview page under the Account Settings section of the left navigation. For Harness CDCE, the endpoint varies based on the Docker vs. Helm installation options.
Harness Cluster Location | Harness Manager Endpoint on Harness Cluster |
---|---|
SaaS prod-1 | https://app.harness.io |
SaaS prod-2 | https://app.harness.io/gratis |
SaaS prod-3 | https://app3.harness.io |
CDCE Docker | http://<HARNESS_HOST> if Docker Delegate is remote to CDCE or http://host.docker.internal if Docker Delegate is on same host as CDCE |
CDCE Helm | http://<HARNESS_HOST>:7143 where HARNESS_HOST is the public IP of the Kubernetes node where CDCE Helm is running |
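Before returning to the Harness UI, you can tail the delegate container’s logs locally to confirm it started and is registering with the Harness Platform:

docker logs -f firstdockerdel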
Verify Docker Delegate Connectivity
Click Continue and, in a few moments after the health checks pass, your Docker Delegate will be available for you to leverage. Click Done and verify that your new Delegate appears on the list.
Prerequisite
Ensure that you have access to a Kubernetes cluster. For the purposes of this tutorial, we will use minikube.
Install minikube
- On Windows:
choco install minikube
- On macOS:
brew install minikube
Now start minikube with the following config.
minikube start --memory 4g --cpus 4
Validate that you have kubectl access to your cluster.
kubectl get pods -A
Now that you have access to a Kubernetes cluster, you can install the delegate using any of the options below.
- Helm Chart
- Terraform Helm Provider
- Kubernetes Manifest
Download Helm Chart Values YAML
curl -LO https://raw.githubusercontent.com/harness-apps/developer-hub-apps/main/delegate/harness-delegate-values.yaml
Install Helm Chart
As a prerequisite, you should have Helm v3 installed on the machine from which you connect to your Kubernetes cluster.
You can now install the delegate using the Delegate Helm Chart. Let us first add the `harness` Helm chart repo to your local Helm registry.
helm repo add harness https://app.harness.io/storage/harness-download/harness-helm-charts/
helm search repo harness
You can see that there are two Helm charts available. We will use the `harness/harness-delegate-ng` chart in this tutorial.
NAME CHART VERSION APP VERSION DESCRIPTION
harness/harness-delegate 1.0.8 Delegate for Harness FirstGen Platform
harness/harness-delegate-ng 1.0.0 1.16.0 Delegate for Harness NextGen Platform
Now we are ready to install the delegate. The following command installs/upgrades the `firstk8sdel` delegate (which is a Kubernetes workload) in the `harness-delegate-ng` namespace using the `harness/harness-delegate-ng` Helm chart, with the configuration provided in the `harness-delegate-values.yaml` file downloaded in the previous step.
helm upgrade -i firstk8sdel \
--namespace harness-delegate-ng --create-namespace \
harness/harness-delegate-ng \
-f harness-delegate-values.yaml \
--set delegateName=firstk8sdel \
--set accountId=PUT_YOUR_HARNESS_ACCOUNTID_HERE \
--set delegateToken=PUT_YOUR_DELEGATE_TOKEN_HERE \
--set managerEndpoint=PUT_YOUR_MANAGER_HOST_AND_PORT_HERE
`PUT_YOUR_MANAGER_HOST_AND_PORT_HERE` should be replaced by the Harness Manager Endpoint noted below. For Harness SaaS accounts, you can find your Harness Cluster Location on the Account Overview page under the Account Settings section of the left navigation. For Harness CDCE, the endpoint varies based on the Docker vs. Helm installation options.
Harness Cluster Location | Harness Manager Endpoint on Harness Cluster |
---|---|
SaaS prod-1 | https://app.harness.io |
SaaS prod-2 | https://app.harness.io/gratis |
SaaS prod-3 | https://app3.harness.io |
CDCE Docker | http://<HARNESS_HOST> if Docker Delegate is remote to CDCE or http://host.docker.internal if Docker Delegate is on same host as CDCE |
CDCE Helm | http://<HARNESS_HOST>:7143 where HARNESS_HOST is the public IP of the Kubernetes node where CDCE Helm is running |
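You can also confirm from the cluster side that the delegate pod came up in the namespace used by the Helm install:

kubectl get pods -n harness-delegate-ng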
Verify Helm Delegate Connectivity
Click Continue and, in a few moments after the health checks pass, your Harness Delegate will be available for you to leverage. Click Done and verify that your new Delegate appears on the list.
Clone Terraform Module Repo
Harness has created a GitHub repo that stores the Terraform module for the Kubernetes delegate. This module uses the standard Terraform Helm provider to install the Helm chart onto a Kubernetes cluster whose config is stored on the same machine at the `~/.kube/config` path. You can change this path in the `providers.tf` file after cloning.
git clone [email protected]:harness/terraform-kubernetes-harness-delegate.git
Run terraform init, plan and apply
Initialize Terraform. This will download the Terraform Helm provider onto your machine.
terraform init
Run the following step to see exactly the changes terraform is going to make on your behalf.
terraform plan
Finally, run this step to make terraform install the Kubernetes delegate using the Helm provider.
terraform apply \
-var delegate_name="firstk8sdel" \
-var account_id="PUT_YOUR_HARNESS_ACCOUNTID_HERE" \
-var delegate_token="PUT_YOUR_DELEGATE_TOKEN_HERE" \
-var manager_endpoint="PUT_YOUR_MANAGER_HOST_AND_PORT_HERE" \
-var delegate_image="harness/delegate:22.11.77436"
`PUT_YOUR_MANAGER_HOST_AND_PORT_HERE` should be replaced by the Harness Manager Endpoint noted below. For Harness SaaS accounts, you can find your Harness Cluster Location on the Account Overview page under the Account Settings section of the left navigation. For Harness CDCE, the endpoint varies based on the Docker vs. Helm installation options.
Harness Cluster Location | Harness Manager Endpoint on Harness Cluster |
---|---|
SaaS prod-1 | https://app.harness.io |
SaaS prod-2 | https://app.harness.io/gratis |
SaaS prod-3 | https://app3.harness.io |
CDCE Docker | http://<HARNESS_HOST> if Docker Delegate is remote to CDCE or http://host.docker.internal if Docker Delegate is on same host as CDCE |
CDCE Helm | http://<HARNESS_HOST>:7143 where HARNESS_HOST is the public IP of the Kubernetes node where CDCE Helm is running |
When prompted by Terraform whether you want to continue with the apply step, type `yes`, and then you will see output similar to the following.
helm_release.delegate: Creating...
helm_release.delegate: Still creating... [10s elapsed]
helm_release.delegate: Still creating... [20s elapsed]
helm_release.delegate: Still creating... [30s elapsed]
helm_release.delegate: Still creating... [40s elapsed]
helm_release.delegate: Still creating... [50s elapsed]
helm_release.delegate: Still creating... [1m0s elapsed]
helm_release.delegate: Creation complete after 1m0s [id=firstk8sdel]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
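Because the Terraform module drives the same Helm chart, you can confirm the release from the cluster side. This assumes the module installs into the chart’s `harness-delegate-ng` namespace:

helm list -n harness-delegate-ng
kubectl get pods -n harness-delegate-ng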
Verify Helm Delegate Connectivity
Click Continue and, in a few moments after the health checks pass, your Harness Delegate will be available for you to leverage. Click Done and verify that your new Delegate appears on the list.
Download Kubernetes Manifest Template
curl -LO https://raw.githubusercontent.com/harness-apps/developer-hub-apps/main/delegate/harness-delegate.yml
Open the `harness-delegate.yml` file in a text editor and replace `PUT_YOUR_DELEGATE_NAME_HERE`, `PUT_YOUR_HARNESS_ACCOUNTID_HERE`, and `PUT_YOUR_DELEGATE_TOKEN_HERE` with your delegate name (say `firstk8sdel`), your Harness accountId, and your delegate token value, respectively.
`PUT_YOUR_MANAGER_HOST_AND_PORT_HERE` should be replaced by the Harness Manager Endpoint noted below. For Harness SaaS accounts, you can find your Harness Cluster Location on the Account Overview page under the Account Settings section of the left navigation. For Harness CDCE, the endpoint varies based on the Docker vs. Helm installation options.
Harness Cluster Location | Harness Manager Endpoint on Harness Cluster |
---|---|
SaaS prod-1 | https://app.harness.io |
SaaS prod-2 | https://app.harness.io/gratis |
SaaS prod-3 | https://app3.harness.io |
CDCE Docker | http://<HARNESS_HOST> if Docker Delegate is remote to CDCE or http://host.docker.internal if Docker Delegate is on same host as CDCE |
CDCE Helm | http://<HARNESS_HOST>:7143 where HARNESS_HOST is the public IP of the Kubernetes node where CDCE Helm is running |
Apply Kubernetes Manifest
kubectl apply -f harness-delegate.yml
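You can then watch for the delegate pod to reach a Running state. This assumes the manifest creates its resources in the `harness-delegate-ng` namespace; adjust if your edited manifest uses a different one:

kubectl get pods -n harness-delegate-ng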
Verify Kubernetes Manifest Delegate Connectivity
Click Continue and, in a few moments after the health checks pass, your Harness Delegate will be available for you to leverage. Click Done and verify that your new Delegate appears on the list.
You can now route communication to external systems in Harness connectors and pipelines by simply selecting this delegate via a delegate selector.
Creating Your First SLO
In the Harness Platform, head to Service Reliability -> SLOs inside the Default Project.
Click on + Create SLO. You can name the SLO “myslo”. Harness needs to know what you will be monitoring, which is represented by a Monitored Service. In the Monitored Service Name, create a new Monitored Service.
- Service [create in-line]: my-slo-app
- Environment [create in-line]: kubernetes
Click Save. You can also name the journey that a user or system will take by creating a new User Journey object; call it “myjourney”.
Click Continue. Now you are ready to wire in the Service Level Indicator [SLI] that feeds into this SLO. Since the test_gauge0 metric moves up and down over time, it can be used to simulate a latency metric. Now you are ready to configure the Prometheus connection to Harness.
Configure SLI queries -> + New Health Source
To connect Prometheus to Harness, add a new Prometheus Health Source with the name “myprominstance”.
Create a new Prometheus Connector with the name “kubernetesprom”.
In the Credentials section, provide the NodePort address, or however you exposed Prometheus’s port 9090.
Click Next and select a Harness Delegate to perform this connection. Any available Harness Delegate will do.
Click Save and your connection to Prometheus will be validated. Now you are ready to wire in the Prometheus Query as a Health Source.
Click Next after wiring in the Connector. Using Build your Query, the metric we want to focus on is “test_gauge0”, filtering on the “app” field with the example app’s label, “prometheus-sample-app”. We can use the same filter for both Environment and Service, since this is the single metric we want.
- Prometheus Metric: test_gauge0
- Environment Filter: app:prometheus-sample-app
- Service Filter: app:prometheus-sample-app
In the Assign section, select this as an SLI.
Click Save, and now you can pick the metric powering the SLI. Taking a closer look at the sample gauge in Prometheus, 0.60 looks like a good midpoint as this metric moves up and down. We can pretend that this gauge represents some sort of response-time metric, where the lower the score, the better.
When configuring the SLI, you can set this to a Threshold-based metric. The Objective Value we are declaring as “good” is less than or equal to 0.60. If data is missing from our gauge, we also consider that “bad”.
Metric for valid requests: Prometheus Metric [wired in during the connection step above].
- Objective Value: 0.60
- SLI value is good if: <=
- Consider missing data: Bad
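As a rough sanity check outside Harness, you can estimate what fraction of recent samples would count as “good” with a PromQL subquery. This is only an illustrative approximation of the SLI, not how Harness computes it:

# Fraction of 1-minute samples over the last hour at or below the 0.60 objective
avg_over_time((test_gauge0{app="prometheus-sample-app"} <= bool 0.60)[1h:1m])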
Set SLO Target
Click Continue to set up the SLO Target [based on the SLI] and the Error Budget [the amount of time the system can fail] Policy. A goal we can set is that 50% of requests need to be <= our Objective Value, i.e. our SLI. By setting the target at 50% over a rolling 7-day period, we are also stating that the other 50% of the week is our Error Budget, which Harness indicates for you.
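To put numbers on that: a rolling 7-day window contains 10,080 minutes, so a 50% target leaves an Error Budget of 5,040 minutes, while a 99% target would leave roughly 100 minutes (10,080 × 0.01 ≈ 101 minutes).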
Click Save, and you now have the ability to actively monitor and manage your SLOs. With Harness, SLOs can be renegotiated much more easily, without having to recalculate them by hand.
If this SLO is too aggressive or too lenient, Harness can provide the actual service data to help make that determination. In this example, we set the SLO target at 50%, which is not a very good SLO. You can change the target to something more aggressive, for example 99%, via the UI.