Cache Intelligence

Modern continuous integration systems execute pipelines inside ephemeral environments that are provisioned solely for pipeline execution and are not reused from prior pipeline runs. As builds often require downloading and installing many library and software dependencies, caching these dependencies for quick retrieval at runtime can save a significant amount of time.

There are several ways to configure caching in Harness CI, such as Cache Intelligence, Save and Restore Cache steps, and mounting volumes. Save and Restore Cache steps and mounted volumes require you to manage the cache. With Cache Intelligence, Harness automatically caches and restores common dependencies. Cache Intelligence doesn't require you to bring your own storage, because the cache is stored in the Harness-managed environment, Harness Cloud.

Prerequisites

Review these requirements for using Cache Intelligence.

Supported build infrastructures

Currently, Cache Intelligence is only available for Linux and Windows platforms on Harness Cloud, the Harness-managed build environment.

For other build infrastructures, you can use Save and Restore Cache steps, such as Save and Restore Cache from S3, to include caching in your CI pipelines.

Supported tools and paths

Cache Intelligence fully supports Bazel, Maven, Gradle, Yarn, Go, and Node build tools, if the dependencies are stored in the default location for that tool.

For other build tools or non-default cache locations, you can leverage Harness Cloud's cache storage by enabling Cache Intelligence and providing custom cache paths.

Cache storage

Cache Intelligence stores the cache in Harness Cloud, the Harness-managed environment, so you don't need to bring your own storage.

Harness Cloud allows up to 2GB of cache storage per account. All pipelines in the account use the same cache storage, and each build tool has a unique cache key that is used to restore the appropriate cache data at runtime.

The cache retention window is 15 days, which resets whenever the cache is updated.

Enable Cache Intelligence

  1. To enable Cache Intelligence in the YAML editor, add the following lines to the stage.spec:

    caching:
      enabled: true
  2. If you're using an unsupported build tool, a non-default cache location, or a Windows platform, you must add custom cache paths. For a list of supported tools, go to Supported tools and paths.

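  3. You can also customize cache paths, customize cache keys, define the cache policy, and enable cache override, as described in the sections that follow.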

Here's a YAML example with Cache Intelligence enabled (enabled: true) and the cache policy and cache override settings configured:

- stage:
    name: Build
    identifier: Build
    type: CI
    spec:
      caching:
        enabled: true
        policy: pull-push
        override: true
      cloneCodebase: true

Customize cache paths

Cache Intelligence stores the data to be cached in the /harness directory by default. You can use paths to specify a list of locations to be cached. This is useful if:

  • Cache Intelligence is not supported for your build tool.
  • You have customized cache locations, such as with yarn config set cache-folder.
  • You're using a Windows platform.

In the YAML editor, add a list of paths to cache under stage.spec.caching, for example:

- stage:
    name: Build
    identifier: Build
    type: CI
    spec:
      caching:
        enabled: true
        paths:
          - /harness/node_modules ## On a Windows platform, the path would be 'C:\harness\node_modules'.
      cloneCodebase: true
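As a sketch of the customized-location case, the following stage points Yarn at a non-default cache folder and lists the same folder under paths. The folder name /harness/.yarn-cache, the step name, and the step identifier are hypothetical, and the Run step assumes a Linux stage on Harness Cloud; adapt them to your pipeline.

```yaml
- stage:
    name: Build
    identifier: Build
    type: CI
    spec:
      caching:
        enabled: true
        paths:
          - /harness/.yarn-cache # Hypothetical folder; must match the cache-folder set in the Run step.
      cloneCodebase: true
      platform:
        os: Linux
        arch: Amd64
      runtime:
        type: Cloud
        spec: {}
      execution:
        steps:
          - step:
              type: Run
              name: Install dependencies # Hypothetical step name.
              identifier: install_dependencies
              spec:
                shell: Sh
                command: |-
                  # Redirect the Yarn cache to the folder listed under caching.paths.
                  yarn config set cache-folder /harness/.yarn-cache
                  yarn install
```

Because this folder is inside the /harness directory, it doesn't need to be declared as a shared path.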

If a cache path is outside the /harness directory, you must also specify this as a shared path. In the YAML editor, add a list of sharedPaths under stage.spec, for example:

- stage:
    name: Build
    identifier: Build
    type: CI
    spec:
      caching:
        enabled: true
        paths:
          - /harness/node_modules
          - /my_cache_directory/module_cache1
      cloneCodebase: true
      platform:
        os: Linux
        arch: Amd64
      runtime:
        type: Cloud
        spec: {}
      sharedPaths:
        - /my_cache_directory/module_cache1

Customize cache keys

Harness generates a cache key from a hash of the build lock file (such as pom.xml, build.gradle, or package.json) that Harness detects. If Harness detects multiple tools or multiple lock files, Harness combines the hashes to create the cache key.

Define a custom cache key if you don't want the default cache key naming behavior, or when your scenario requires one, such as caching in parallel stages.

To customize the cache key in the YAML editor, add key: CUSTOM_KEY_VALUE under stage.spec.caching. You can use fixed values, runtime inputs, and expressions for the key value.

The following YAML example uses runtime input (<+input>), which prompts the user to supply a cache key value at runtime.

- stage:
    name: Build
    identifier: Build
    type: CI
    spec:
      caching:
        enabled: true
        key: <+input>
      cloneCodebase: true
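The other two kinds of key value, fixed values and expressions, might look like the following sketch. The key my-fixed-cache-key and the pipeline variable cacheKey are hypothetical names chosen for illustration; the expression variant assumes you have defined a pipeline variable with that name.

```yaml
- stage:
    name: Build
    identifier: Build
    type: CI
    spec:
      caching:
        enabled: true
        key: my-fixed-cache-key # Hypothetical fixed value; the same key is used on every run.
        # Alternatively, use an expression that resolves at runtime.
        # This assumes a pipeline variable named cacheKey exists:
        # key: <+pipeline.variables.cacheKey>
      cloneCodebase: true
```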

Cache Intelligence in parallel stages

If you have multiple stages that run in parallel, you must use custom cache keys for each stage that uses Cache Intelligence. This prevents conflicts when the parallel stages attempt to save or retrieve caches concurrently.
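For illustration, here is a minimal sketch of two parallel Build (CI) stages that each set their own cache key. The stage names and key values are hypothetical, and the execution sections are omitted for brevity.

```yaml
stages:
  - parallel:
      - stage:
          name: buildFrontend # Hypothetical stage name.
          identifier: buildfrontend
          type: CI
          spec:
            cloneCodebase: true
            platform:
              os: Linux
              arch: Amd64
            runtime:
              type: Cloud
              spec: {}
            caching:
              enabled: true
              key: cache-frontend # Unique to this stage.
      - stage:
          name: buildBackend # Hypothetical stage name.
          identifier: buildbackend
          type: CI
          spec:
            cloneCodebase: true
            platform:
              os: Linux
              arch: Amd64
            runtime:
              type: Cloud
              spec: {}
            caching:
              enabled: true
              key: cache-backend # Differs from the other stage's key, so the two stages don't conflict.
```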

If your stage uses a matrix or repeat looping strategy that generates multiple stage instances, you can use a Harness expression to generate unique cache keys, such as key: cachekey-<+strategy.iteration>. The <+strategy.iteration> expression references the stage's iteration index. Each instance of the stage generated by the matrix or repeat strategy has a different iteration index, starting from 0.
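A minimal sketch of this pattern, assuming a repeat strategy defined at the stage level (the repeat count of 3 is arbitrary, and the rest of the stage is omitted):

```yaml
- stage:
    name: Build
    identifier: Build
    type: CI
    strategy:
      repeat:
        times: 3 # Generates three stage instances with iteration indexes 0, 1, and 2.
    spec:
      caching:
        enabled: true
        # Each instance gets its own key: cachekey-0, cachekey-1, cachekey-2.
        key: cachekey-<+strategy.iteration>
      cloneCodebase: true
```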

Define cache policy

The cache policy defines how you use caching in a stage.

For example, if your pipeline has two stages, you might want to restore the cache in the first stage and then save the cache in the second stage, rather than both saving and restoring the cache in both stages.

Currently, you can configure Cache Intelligence cache policies in the YAML editor only.

To configure the cache policy, add policy: pull | push | pull-push to stage.spec.caching.

  • policy: pull - Only restore cache.
  • policy: push - Only save cache.
  • policy: pull-push - Save and restore cache. This is the default setting.

For example, here is a pipeline with two Build (CI) stages using Cache Intelligence. The first stage's cache policy is set to pull only, and the second stage's cache policy is set to push only. When this pipeline runs, the first stage restores the build cache, and the second stage saves the cache at the end of the build.

stages:
  - stage:
      name: buildStage1
      identifier: buildstage1
      description: ""
      type: CI
      spec:
        cloneCodebase: true
        platform:
          os: Linux
          arch: Amd64
        runtime:
          type: Cloud
          spec: {}
        caching:
          enabled: true
          policy: pull
        execution:
          steps:
            ...
  - stage:
      name: buildStage2
      identifier: buildstage2
      description: ""
      type: CI
      spec:
        cloneCodebase: true
        platform:
          os: Linux
          arch: Amd64
        runtime:
          type: Cloud
          spec: {}
        caching:
          enabled: true
          policy: push
        execution:
          steps:
            ...

Enable cache override

The cache override allows you to force push the cache even if the cache key hasn't changed.

Currently, you can configure Cache Intelligence cache override in the YAML editor only.

To configure the cache override, add override: true | false to stage.spec.caching.

  • override: true - Always save the cache. Currently, this is the default setting.
  • override: false - Only save the cache if there are changes.

For example:

- stage:
    name: Build
    identifier: Build
    type: CI
    spec:
      caching:
        enabled: true
        override: false
      cloneCodebase: true

Cache Intelligence API

You can use the Cache Intelligence API to get information about the cache or delete the cache.

API key authentication is required. You need a Harness API key with core_account_edit permission. For more information about API keys, go to Manage API keys. For more information about authentication, go to the Harness API documentation.

Get cache metadata

Get metadata about the cache, such as the size and path.

curl --location --request GET "https://app.harness.io/gateway/ci/cache/info?accountIdentifier=$YOUR_HARNESS_ACCOUNT_ID" \
--header 'Accept: application/json' \
--header "X-API-KEY: $API_KEY"

Delete cache

Delete the entire cache, or use the optional path parameter to delete a specific subdirectory in the cache.

curl --location --request DELETE "https://app.harness.io/gateway/ci/cache?accountIdentifier=$YOUR_HARNESS_ACCOUNT_ID&path=/path/to/deleted/directory" \
--header 'Accept: application/json' \
--header "X-API-KEY: $API_KEY"

Troubleshoot caching

Go to the CI Knowledge Base for questions and issues related to caching, data sharing, dependency management, workspaces, shared paths, and more. For example: