Flaky Test Detection
A flaky test is one that passes and fails inconsistently - without any code changes. Harness automatically detects these tests and marks them with a FLAKY badge so your team knows which tests are unreliable.
How Detection Works
Harness automatically detects flaky tests by analyzing test results across pipeline executions. A test is identified by its suite name + class name + repository + test name.
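For illustration only, that identity can be pictured as a composite key along these lines (a minimal sketch; the field names are assumptions, not Harness's internal schema):

```python
from typing import NamedTuple

class TestIdentity(NamedTuple):
    """Composite key that tracks a single test across pipeline executions."""
    repository: str   # e.g. "https://github.com/your-org/your-repo.git"
    suite_name: str   # e.g. "payments"  (hypothetical suite name)
    class_name: str   # e.g. "com.example.PaymentTest"
    test_name: str    # e.g. "testRefundTimeout"
```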
Detection Criteria
A test is marked flaky when both conditions are met:
- Same-commit inconsistency: The test has both passed AND failed on the same commit
- Within the observation window: The inconsistency occurred within the last 14 days
Example: If a build for commit abc123 ran twice, once where the test passed and once where it failed, the test is marked flaky.
The 14-day window defines how far back Harness looks when detecting flaky tests. Test behavior older than 14 days is ignored. This ensures flaky detection reflects recent, relevant behavior—not issues that were fixed weeks ago.
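To make the rule concrete, here is a minimal sketch of the detection logic, assuming a simplified result record of (commit SHA, passed flag, timestamp); the actual Harness data model is not exposed:

```python
from datetime import timedelta

OBSERVATION_WINDOW = timedelta(days=14)

def is_flaky(results, now):
    """A test is flagged as flaky if, within the observation window, any single
    commit has produced both a passing and a failing run."""
    recent = [r for r in results if now - r[2] <= OBSERVATION_WINDOW]
    outcomes_by_commit = {}
    for commit_sha, passed, _timestamp in recent:
        outcomes_by_commit.setdefault(commit_sha, set()).add(passed)
    # Same-commit inconsistency: one commit saw both True (pass) and False (fail).
    return any(outcomes == {True, False} for outcomes in outcomes_by_commit.values())
```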
Auto-Recovery
Flaky status is automatically cleared in two ways:
- 5 consecutive passes: After 5 successful runs with no flaky occurrences, the status clears immediately
- Time expiry: If no same-commit inconsistency occurs for 14 days, the flaky status naturally expires
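As a rough sketch of these two recovery paths (thresholds taken from the parameters table below; the counters themselves are hypothetical):

```python
def recovered_status(consecutive_passes, days_since_last_inconsistency):
    """Returns "stable" when either auto-recovery condition is met, otherwise "flaky"."""
    if consecutive_passes >= 5:
        return "stable"   # cleared immediately after 5 clean runs
    if days_since_last_inconsistency > 14:
        return "stable"   # flaky status expires with the 14-day window
    return "flaky"
```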
Detection Parameters
| Parameter | Value |
|---|---|
| Observation window | 14 days |
| Flaky trigger | Pass + fail on same commit |
| Auto-recovery | 5 consecutive passes |
| Scope | Per repository |
View Flaky Tests
In the Harness UI
- Open your pipeline execution
- Click the Tests tab
- Look for the FLAKY badge next to test names
- Use Filter → Flaky to show only flaky tests
Via CLI
List all flaky tests for a repository:
hcli test-management flaky get \
--account-id="$HARNESS_ACCOUNT_ID" \
--repo="https://github.com/your-org/your-repo.git" \
--api-key="$HARNESS_API_KEY" \
--endpoint="https://app.harness.io/gateway/ti-service"
Example output:
Found 6 flaky test(s):
- com.example.PaymentTest::testRefundTimeout
- com.example.ApiTest::testWebhookRetry
- tests.integration.test_api::test_concurrent_requests
Manually Mark a Test
Sometimes you know a test is flaky before automatic detection catches it. Mark it manually:
hcli test-management flaky set \
--account-id="$HARNESS_ACCOUNT_ID" \
--repo="https://github.com/your-org/your-repo.git" \
--api-key="$HARNESS_API_KEY" \
--endpoint="https://app.harness.io/gateway/ti-service" \
--class-name="com.example.PaymentTest" \
--test-name="testRefundTimeout" \
--marking=true
Marking Options
| --marking value | Effect |
|---|---|
| true | Force mark as flaky (overrides auto-detection) |
| false | Force mark as stable (overrides auto-detection) |
| unset | Remove manual marking, let auto-detection decide |
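For intuition, the override semantics in the table resolve roughly like this sketch (not actual Harness code):

```python
def effective_flaky_status(manual_marking, auto_detected):
    """manual_marking is "true", "false", or "unset" (the --marking values above);
    auto_detected is the boolean result of automatic detection."""
    if manual_marking == "true":
        return True        # forced flaky; auto-detection is ignored
    if manual_marking == "false":
        return False       # forced stable; auto-detection is ignored
    return auto_detected   # unset: fall back to automatic detection
```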
Flaky vs Quarantine
| | Flaky | Quarantine |
|---|---|---|
| Test runs? | Yes | Yes |
| Failure blocks pipeline? | Yes (unless quarantined) | No |
| Purpose | Track unreliable tests | Unblock deployments |
| Recovery | Automatic (5 passes) | Manual removal or automatic via policies |
A test can be both flaky AND quarantined. When quarantined, the flaky test runs but doesn't block the pipeline.
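In code terms, the gating difference amounts to something like this (illustrative only):

```python
def failure_blocks_pipeline(test_failed, is_quarantined):
    """Both flaky and quarantined tests still run; only quarantine stops a
    failure from blocking the pipeline, regardless of flaky status."""
    return test_failed and not is_quarantined
```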
Common Causes of Flaky Tests
| Cause | Example | Fix |
|---|---|---|
| Race conditions | Test depends on thread timing | Add synchronization or waits |
| External services | Test calls real APIs | Mock external dependencies |
| Shared state | Tests don't clean up after themselves | Isolate test data |
| Time sensitivity | Test checks "now" vs. expected time | Use fixed test clocks |
| Resource contention | Tests compete for ports/files | Use unique resources per test |
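As one example of these fixes, the time-sensitivity row usually means injecting a clock the test controls instead of reading the real time. A minimal illustration (the FixedClock helper is hypothetical, not part of Harness):

```python
from datetime import datetime

class FixedClock:
    """Test double that always returns the same instant, removing time sensitivity."""
    def __init__(self, instant):
        self._instant = instant

    def now(self):
        return self._instant

def is_expired(expiry, clock):
    # Production code takes the clock as a dependency instead of calling datetime.now().
    return clock.now() >= expiry

# The test's outcome no longer depends on when it runs:
clock = FixedClock(datetime(2024, 1, 1, 12, 0, 0))
assert is_expired(datetime(2024, 1, 1, 11, 59, 59), clock)
```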
Automate with Policies
Instead of manually managing flaky tests, use policies to automate quarantine:
[
{
"when": ["test is flaky", "test failed"],
"action": ["mark quarantine"]
}
]
This policy automatically quarantines any flaky test that fails, preventing it from blocking your pipeline while still tracking its status.
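As a mental model, a rule fires when every condition in its when list holds, and then its action list is applied. The sketch below mirrors that shape; it is not the actual policy engine:

```python
def evaluate_policies(rules, test_state):
    """rules follow the JSON shape above; test_state maps condition strings
    such as "test is flaky" and "test failed" to booleans."""
    actions = []
    for rule in rules:
        if all(test_state.get(condition, False) for condition in rule["when"]):
            actions.extend(rule["action"])
    return actions

rules = [{"when": ["test is flaky", "test failed"], "action": ["mark quarantine"]}]
print(evaluate_policies(rules, {"test is flaky": True, "test failed": True}))
# ['mark quarantine']
```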
Next Steps
- Quarantine flaky tests to unblock deployments
- Set up policies to automate test management