K6 loadgen

K6 loadgen is a chaos fault that runs a k6 load-test script against a target endpoint. The script runs from a helper pod inside the chaos infrastructure cluster for TOTAL_CHAOS_DURATION seconds, then stops. The fault reads the script from a Kubernetes secret (SCRIPT_SECRET_NAME/SCRIPT_SECRET_KEY) and runs it with the official Grafana k6 runner image (LOAD_IMAGE), so the script alone determines the load profile (smoke, spike, stress, soak).

Use this fault to test how a target workload behaves under sustained synthetic load: whether application latency stays inside the SLA, whether autoscaling kicks in, whether circuit breakers and rate limiters work as expected, and whether monitoring detects the saturation within the alerting SLA.

Run your first experiment

If you have not configured the chaos infrastructure yet, go to Quickstart to install the chaos infrastructure and run an experiment end to end.


Use cases

Run this fault when you want to answer concrete questions like:

  • API latency under load: When the script drives N virtual users against a REST endpoint, does p95/p99 stay inside the SLA?
  • Autoscaling fidelity: Does HPA, KEDA, or a custom autoscaler add capacity inside the alerting SLA?
  • Rate limiting: Does the rate limiter return 429s correctly under sustained burst traffic without leaking errors downstream?
  • Continuous validation in pipelines: Catch performance regressions early by running k6 scripts as part of a CI/CD pipeline.
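
One way to turn the latency and error-rate questions above into pass/fail criteria is k6's built-in thresholds option. The sketch below builds a thresholds object from SLA numbers. The helper name slaThresholds and the limits (500 ms, 1000 ms, 1%) are illustrative assumptions, not values defined by this fault. It is written as plain JavaScript; in a k6 script the options object would be exported with export const options.

```javascript
// Build a k6 `thresholds` object from SLA limits.
// `slaThresholds` is a hypothetical helper; the limits passed below are example values.
function slaThresholds({ p95Ms, p99Ms, maxErrorRate }) {
  return {
    // Built-in k6 trend metric for total request time, in milliseconds.
    http_req_duration: [`p(95)<${p95Ms}`, `p(99)<${p99Ms}`],
    // Built-in k6 rate metric: share of requests that k6 counts as failed.
    http_req_failed: [`rate<${maxErrorRate}`],
  };
}

// In the k6 script itself this would read `export const options = { ... }`.
const options = {
  vus: 50,
  duration: "60s",
  thresholds: slaThresholds({ p95Ms: 500, p99Ms: 1000, maxErrorRate: 0.01 }),
};

console.log(JSON.stringify(options.thresholds));
```

By default k6 counts responses outside the 200 to 399 range as failed, so a rate-limiting experiment that expects 429s would need a looser error budget or a script-level definition of expected statuses.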

Prerequisites

  • Kubernetes version: 1.21 or later for the cluster running the chaos infrastructure.

  • k6 script secret: A Kubernetes secret in the chaos infrastructure namespace containing the k6 JavaScript file. The default key is script.js. Create it with:

    kubectl create secret generic k6-script \
      --from-file=script.js=<path-to-script>.js \
      -n <chaos-infrastructure-namespace>

  • Target endpoint reachable: The endpoint addressed in the k6 script (http.get('https://...')) is reachable from the chaos infrastructure cluster.

  • Image accessible: LOAD_IMAGE (default ghcr.io/grafana/k6-operator:latest-runner) can be pulled from the cluster. If it cannot, mirror the image to your own registry and override the tunable.


Supported environments

| Platform | Support status |
| --- | --- |
| Self-hosted Kubernetes (1.21+) | Supported |
| Managed Kubernetes (EKS, GKE, AKS, OKE) | Supported |
| OpenShift | Supported |
| Targets running on AWS, Azure, GCP, or any reachable endpoint | Supported |

Permissions required

This fault is classified as a Basic Load fault. The chaos service account needs the following Kubernetes RBAC permissions.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: hce
  name: k6-loadgen
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "delete", "get", "list", "patch", "deletecollection", "update"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "get", "list", "patch", "update"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list"]
  - apiGroups: ["litmuschaos.io"]
    resources: ["chaosengines", "chaosexperiments", "chaosresults"]
    verbs: ["create", "delete", "get", "list", "patch", "update"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["create", "delete", "get", "list", "deletecollection"]
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list", "watch"]

Fault tunables

Configure the following fault parameters when you add K6 loadgen to an experiment in Chaos Studio. Defaults are shown for reference.

Required parameters

| Tunable | Description | Default |
| --- | --- | --- |
| SCRIPT_SECRET_NAME | Name of the Kubernetes secret in the chaos infrastructure namespace that holds the k6 JavaScript file. | (required) |
| SCRIPT_SECRET_KEY | Key inside the secret that points to the k6 script (for example script.js). | (required) |

Chaos parameters

| Tunable | Description | Default |
| --- | --- | --- |
| TOTAL_CHAOS_DURATION | Total duration of the fault, in seconds. k6 runs for this period regardless of the duration declared inside the script. | 60 |
| LOAD_IMAGE | Container image used to run k6 inside the helper pod. | ghcr.io/grafana/k6-operator:latest-runner |
| RAMP_TIME | Wait period in seconds before and after the fault. Go to ramp time to read how it is applied. | 0 |

Tunables that apply to every fault are documented in common tunables for all faults.


Sample k6 script

The contents of the secret are passed straight to k6. Any script supported by the runner image works.

import http from "k6/http";
import { check, sleep } from "k6";

export const options = {
  vus: 50,
  duration: "60s",
};

export default function () {
  const res = http.get("https://api.example.com/health");
  check(res, { "status is 200": (r) => r.status === 200 });
  sleep(0.3);
}

Go to the k6 documentation to read the script reference and the test-type catalog (smoke, load, stress, spike, soak).
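
The test types mentioned above differ mainly in how the number of virtual users changes over time, which k6 expresses with the stages option. As a rough sketch, the helper below builds stages whose durations add up to a given number of seconds, for example the value of TOTAL_CHAOS_DURATION. The name profileStages and the three profile shapes are illustrative assumptions, not part of k6 or of this fault, and the helper assumes a total long enough that no stage rounds down to zero seconds.

```javascript
// Build k6 `stages` for a named load profile.
// `profileStages` and the shapes below are hypothetical, chosen only for illustration.
function profileStages(kind, totalSeconds, peakVus) {
  const baseline = Math.ceil(peakVus / 10);
  // Each entry is [share of the total time, target VUs at the end of the stage].
  const shapes = {
    smoke: [[1.0, 1]],
    stress: [[0.25, peakVus], [0.5, peakVus], [0.25, 0]],
    spike: [[0.4, baseline], [0.1, peakVus], [0.1, baseline], [0.4, baseline]],
  };
  const shape = shapes[kind];
  if (!shape) throw new Error(`unknown profile: ${kind}`);

  let used = 0;
  return shape.map(([share, target], index) => {
    // The last stage takes whatever time is left, so the sum always equals totalSeconds.
    const isLast = index === shape.length - 1;
    const seconds = isLast ? totalSeconds - used : Math.round(share * totalSeconds);
    used += seconds;
    return { duration: `${seconds}s`, target };
  });
}

console.log(JSON.stringify(profileStages("stress", 60, 50)));
```

In a k6 script the result would replace the vus and duration settings, for example export const options = { stages: profileStages("stress", 60, 50) }.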


Fault execution in brief

The fault reads the k6 script from SCRIPT_SECRET_NAME/SCRIPT_SECRET_KEY, mounts it into a helper pod that runs LOAD_IMAGE, executes k6 run for TOTAL_CHAOS_DURATION seconds, and then tears the helper pod down.


Expected behavior during fault execution

  • The target endpoint sees sustained synthetic traffic for TOTAL_CHAOS_DURATION seconds at the rate driven by the script's vus and duration settings.
  • Application metrics on the target (latency, throughput, error rate) shift in line with the load profile.
  • Autoscalers (HPA, KEDA) may add capacity if CPU/RPS thresholds are reached.
  • After the duration ends, the helper pod is deleted; traffic from the fault stops within seconds.
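
To judge whether the traffic seen on the target matches the script, a back-of-the-envelope estimate can help. In a script like the sample, each virtual user sends one request, waits for the response, sleeps, and repeats, so the request rate is roughly the number of virtual users divided by the time one iteration takes. The sketch below encodes that arithmetic; estimateRps is a hypothetical helper and the 0.1 s response time is an assumed value, not a measurement.

```javascript
// Rough request rate for a closed-loop k6 script that sends one request per iteration.
// `estimateRps` is a hypothetical helper; the response time is an assumed input.
function estimateRps(vus, sleepSeconds, responseSeconds) {
  const iterationSeconds = sleepSeconds + responseSeconds;
  if (iterationSeconds <= 0) throw new Error("iteration time must be positive");
  return vus / iterationSeconds;
}

// Sample script values: 50 VUs and sleep(0.3). The 0.1 s response time is an assumption.
const typical = estimateRps(50, 0.3, 0.1);
// Upper bound if responses were instantaneous.
const ceiling = estimateRps(50, 0.3, 0);

console.log(`typical ~${typical.toFixed(0)} req/s, ceiling ~${ceiling.toFixed(0)} req/s`);
```

With these inputs the estimate is roughly 125 requests per second. The rate falls as the target slows down, because each virtual user waits for its response before starting the next iteration.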

When the fault ends

The chaos pod stops k6 run and deletes the helper pod when TOTAL_CHAOS_DURATION elapses. Synthetic traffic stops within seconds; in-flight requests complete naturally.

Signals to watch

Attach resilience probes to assert each layer:

  • Application latency: Use a Prometheus probe on the application's request-duration histogram and assert p95/p99 stays inside the SLA.
  • Error rate: Use a Prometheus probe on the application's 5xx counter and assert it stays below threshold.
  • Autoscaling reaction: Use a command probe running kubectl get hpa <name> to assert replicas grew.

Verify the fault execution effect

  1. Watch the helper pod logs for k6 output.

    kubectl logs -n <chaos-infra-namespace> -l name=k6-load-generator -f

    You should see k6's checks-passed / failed lines and the final summary table.

  2. Inspect target metrics.

    Use your APM tool (Prometheus, Datadog, New Relic) to confirm RPS and latency rose during the chaos window and recovered afterwards.

  3. Confirm the helper pod was cleaned up.

    kubectl get pods -n <chaos-infra-namespace> -l name=k6-load-generator

    The pod should be gone after the experiment ends.


Recovery and cleanup

  • End of duration: The chaos pod stops k6 and deletes the helper pod when TOTAL_CHAOS_DURATION elapses.
  • Abort the experiment: Stopping the experiment from Chaos Studio also stops k6 and cleans up the helper pod.
  • Manual recovery: If the helper pod survives an abort, delete it with kubectl delete pods -n <chaos-infra-namespace> -l name=k6-load-generator.
  • Workload recovery: Application metrics recover as soon as synthetic traffic stops; HPA-driven replicas scale back in over the autoscaler cooldown.

Limitations

  • Targets defined by the script: The fault has no target tunable; the script controls which endpoints receive load. To load several endpoints, the script must address each of them.
  • Cluster network egress: Synthetic traffic leaves the chaos infrastructure cluster; egress costs and per-host rate limits apply.
  • No mid-flight reconfigure: Script changes require re-uploading the secret and re-running the experiment.
  • Image hosted on GHCR: The default LOAD_IMAGE pulls from ghcr.io. If your cluster cannot reach GHCR, mirror the image to your own registry and override LOAD_IMAGE.
  • Script-driven duration: If the script declares a longer duration than TOTAL_CHAOS_DURATION, the fault still terminates at TOTAL_CHAOS_DURATION; align both for accurate reporting.
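
A quick check before uploading the secret can catch a mismatch between the script and the fault. The sketch below sums the duration declared in a k6 options object (either duration or stages) and compares it with TOTAL_CHAOS_DURATION. The helper names are hypothetical, and the parser is deliberately narrow: it accepts only strings built from h, m, and s units, such as "90s" or "1m30s".

```javascript
// Convert a duration string such as "90s" or "1m30s" into seconds.
// Hypothetical helper; only h, m, and s units are handled.
function parseDurationSeconds(text) {
  if (typeof text !== "string") throw new Error(`unsupported duration: ${text}`);
  const unitSeconds = { h: 3600, m: 60, s: 1 };
  let total = 0;
  let matched = "";
  for (const [part, value, unit] of text.matchAll(/(\d+(?:\.\d+)?)(h|m|s)/g)) {
    total += Number(value) * unitSeconds[unit];
    matched += part;
  }
  // Reject anything the pattern did not fully consume, for example "500ms".
  if (matched === "" || matched !== text) throw new Error(`unsupported duration: ${text}`);
  return total;
}

// Total run time declared by a k6 options object that uses `stages` or `duration`.
function scriptSeconds(options) {
  if (Array.isArray(options.stages)) {
    return options.stages.reduce((sum, stage) => sum + parseDurationSeconds(stage.duration), 0);
  }
  return parseDurationSeconds(options.duration);
}

// Positive result: the script is longer than the fault and would be cut short.
function durationGap(options, totalChaosDuration) {
  return scriptSeconds(options) - totalChaosDuration;
}

console.log(durationGap({ vus: 50, duration: "60s" }, 60));
```

A gap of zero means the script and the fault end together. A positive gap is the number of seconds of the script that the fault would cut off.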

Troubleshooting

K6 loadgen secret not found in Harness Chaos Engineering

The secret named in SCRIPT_SECRET_NAME must exist in the chaos infrastructure namespace. Create it with kubectl create secret generic <name> --from-file=script.js=<path> -n <namespace>, and set SCRIPT_SECRET_KEY to the key used in that command (script.js here).

K6 helper pod stuck in ImagePullBackOff

LOAD_IMAGE defaults to ghcr.io/grafana/k6-operator:latest-runner. If your cluster cannot reach ghcr.io, mirror the image to a registry you can reach and set LOAD_IMAGE to that path.

Target endpoint not reachable from the chaos infrastructure cluster

Confirm the URL inside the k6 script is reachable from a pod in the chaos infrastructure namespace: kubectl run debug --image=alpine --rm -it --restart=Never -n <chaos-infra-namespace> -- wget -qO- <url>. If the request fails, adjust network policies, security groups, or egress rules to allow traffic from the chaos infra namespace.

K6 exited with 'unknown protocol'

The script must use a supported k6 module (k6/http, k6/net/grpc, k6/ws). Validate the script locally with k6 run script.js before pushing it as a secret.


Related faults

  • Locust loadgen: Generate load with Locust (Python-based) instead of k6.
  • Pod HTTP latency: Inject latency on the server side instead of driving load from outside.
  • Pod CPU hog: Stress server CPU instead of driving traffic.