Semgrep step configuration

Last updated on Oct 28, 2025

You can scan your code repositories using Semgrep and ingest the results into STO.

For a quick introduction, go to the SAST code scans using Semgrep tutorial.

Important notes for running Semgrep scans in STO

This integration uses the Semgrep Engine, which is open-source and licensed under LGPL 2.1.

To run scans using a licensed version of Semgrep Code, add your Semgrep token in the Access token field.
STO Semgrep steps include the following rulesets by default:
- auto
- bandit
- brakeman
- eslint
- findsecbugs
- flawfinder
- gosec
- phpcs-security-audit
- security-code-scan
Some rulesets include Pro rules that are available only with a paid version of Semgrep. For more information, go to the Semgrep Registry.
If you want to add trusted certificates to your scan images at runtime, you need to run the scan step with root access.

You can set up your STO scan images and pipelines to run scans as non-root and establish trust for your proxies using custom certificates. For more information, go to Configure your pipeline to use STO images from private registry.
The following topics contain useful information for setting up scanner integrations in STO:

Set-up workflows

Orchestration scans

To scan a code repository, you need Harness Code Repository or a Harness connector to your Git service.

Add the Semgrep scanner

Do the following:

Add a Build or Security stage to your pipeline.
Configure the stage to point to the codebase you want to scan.
Add a Semgrep step to the stage.

Set up the Semgrep scanner

Required settings

Scan mode = Orchestration
Target and Variant Detection = Auto

Optional settings

Fail on Severity — Stop the pipeline if the scan detects any issues at a specified severity or higher
Log Level — Useful for debugging

Scan the repository

Save your pipeline and then select Run.

The pipeline scans your code repository and then shows the results in the Vulnerabilities tab.

Ingestion scans

Add a shared path for your scan results

Add a Build or Security stage to your pipeline.
In the stage Overview, add a shared path such as /shared/scan_results.

Copy scan results to the shared path

There are two primary workflows to do this:

Add a Run step that runs a Semgrep scan from the command line and then copies the results to the shared path.
Copy results from a Semgrep scan that ran outside the pipeline.

For more information and examples, go to Ingestion scans.
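As a rough sketch of the first workflow, a Run step along these lines could run Semgrep from the command line and write the results to the shared path. The step name, identifier, connector placeholder, and the semgrep/semgrep image are illustrative assumptions rather than values from this page. Adjust the output file name and format to match the file you plan to ingest.

```yaml
- step:
    type: Run
    name: run_semgrep_scan                     # hypothetical step name
    identifier: run_semgrep_scan               # hypothetical identifier
    spec:
      connectorRef: YOUR_DOCKER_CONNECTOR_ID   # placeholder for your container registry connector
      image: semgrep/semgrep                   # assumed public Semgrep image
      shell: Sh
      command: |
        # Scan the cloned codebase in /harness and write JSON results
        # to the shared path so a later Semgrep step can ingest them.
        semgrep scan --config auto --json \
          --output /shared/scan_results/semgrep-scan.json /harness
```

The output path here matches the shared path /shared/scan_results added to the stage, so the results remain available to the steps that follow.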

Set up the Semgrep scanner

Add a Semgrep step to the stage and set it up as follows.

Required settings

Scan mode = Ingestion
Target name — Usually the repo name
Target variant — Usually the scanned branch. You can also use a runtime input and specify the branch at runtime.
Ingestion file — For example, /shared/scan_results/semgrep-scan.json

Optional settings

Fail on Severity — Stop the pipeline if the scan detects any issues at a specified severity or higher
Log Level — Useful for debugging
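The settings above might look roughly like this in the pipeline YAML. This is a sketch modeled on the orchestration example later on this page. The step name and identifier are hypothetical, and the ingestion.file key is an assumption you should confirm in the YAML editor after configuring the step in the visual editor.

```yaml
- step:
    type: Semgrep
    name: Semgrep_ingest                # hypothetical step name
    identifier: Semgrep_ingest          # hypothetical identifier
    spec:
      mode: ingestion
      config: default
      target:
        type: repository
        name: YOUR_GIT_REPO_NAME        # usually the repo name
        variant: <+input>               # runtime input; enter the scanned branch when you run the pipeline
      ingestion:
        file: /shared/scan_results/semgrep-scan.json   # assumed key for the Ingestion file setting
      advanced:
        log:
          level: info
```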

Scan the repository

Save your pipeline and then select Run.

The pipeline ingests the scan results and then shows them in the Vulnerabilities tab.

Semgrep step configuration

The recommended workflow is to add a Semgrep step to a Security Tests or CI Build stage and then configure it as described below.

Scan

Scan Mode

Orchestration: Configure the step to run a scan and then ingest, normalize, and deduplicate the results.

Ingestion: Configure the step to read scan results from a data file and then ingest, normalize, and deduplicate the data.

Scan Configuration

You can use this setting to select the set of Semgrep rulesets to include in your scan:

Default: Include the default rulesets listed in Important notes for running Semgrep scans in STO.
No default CLI flags: Run the semgrep scanner with no additional CLI flags. This setting is useful if you want to specify a custom set of rulesets in Additional CLI flags.
p/default: Run the scan with the default ruleset configured for the Semgrep scanner.
Auto only: Run the scan with the recommended rulesets specific to your project.
Auto and Ported security tools: Include the following rulesets:
- auto
- brakeman
- eslint
- findsecbugs
- flawfinder
- gitleaks
- gosec
- phpcs-security-audit
- security-code-scan
Auto and Ported security tools except p/gitleaks: Include the same rulesets as Auto and Ported security tools, but without gitleaks.

Target

Type

Repository: Scan a codebase repo.

In most cases, you specify the codebase using a code repo connector that connects to the Git account or repository where your code is stored. For information, go to Configure codebase.

Target and variant detection

When Auto is enabled for code repositories, the step detects these values using git:

To detect the target, the step runs git config --get remote.origin.url.
To detect the variant, the step runs git rev-parse --abbrev-ref HEAD. The default assumption is that the HEAD branch is the one you want to scan.
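To preview the values that Auto would detect, you could run the same two git commands in a Run step placed before the Semgrep step. The following is a minimal sketch. The step name and identifier are hypothetical, and it assumes the codebase is already cloned into the step's working directory.

```yaml
- step:
    type: Run
    name: show_detected_target          # hypothetical step name
    identifier: show_detected_target    # hypothetical identifier
    spec:
      shell: Sh
      command: |
        # Target: the URL of the remote named "origin"
        git config --get remote.origin.url
        # Variant: the branch that HEAD currently points to
        git rev-parse --abbrev-ref HEAD
```

If the repository is checked out in a detached HEAD state, git rev-parse --abbrev-ref HEAD prints HEAD rather than a branch name, which is one reason to set the variant manually.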

Note the following:

Auto is not available when the Scan Mode is Ingestion.
By default, Auto is selected when you add the step. You can change this setting if needed.

Name

The identifier for the target, such as codebaseAlpha or jsmith/myalphaservice. Descriptive target names make it much easier to navigate your scan data in the STO UI.

It is good practice to specify a baseline for every target.

Variant

The identifier for the specific variant to scan. This is usually the branch name, image tag, or product version. Harness maintains a historical trend for each variant.

Workspace

The workspace path on the pod running the scan step. The workspace path is /harness by default.

You can override this if you want to scan only a subset of the workspace. For example, suppose the pipeline publishes artifacts to a subfolder /tmp/artifacts and you want to scan these artifacts only. In this case, you can specify the workspace path as /harness/tmp/artifacts.

You can also scan an individual file. For instance, to scan only /tmp/iac/infra.tf, specify the workspace path as /harness/tmp/iac/infra.tf.
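As an illustration, a manually defined target with a narrowed workspace might appear in the step spec roughly as follows. The detection: manual value and the workspace key are assumptions based on the orchestration example on this page, and the variant is a hypothetical branch name. Check the YAML that the visual editor generates for your step.

```yaml
spec:
  mode: orchestration
  config: default
  target:
    type: repository
    detection: manual                   # assumed value that turns off Auto detection
    name: jsmith/myalphaservice         # descriptive target name
    variant: main                       # hypothetical branch name
    workspace: /harness/tmp/artifacts   # assumed key; scan only this subfolder
```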

Ingestion File

The path to your scan results when running an Ingestion scan, for example /shared/scan_results/myscan.latest.sarif.

The data file must be in a supported format for the scanner.
The data file must be accessible to the scan step. It's good practice to save your results files to a shared path in your stage. In the visual editor, go to the stage where you're running the scan. Then go to Overview > Shared Paths. You can also add the path to the YAML stage definition like this:
```
    - stage:
        spec:
          sharedPaths:
            - /shared/scan_results
```

Access Token

The access token to log in to the scanner. This is usually a password or an API key.

You should create a Harness text secret with your encrypted token and reference the secret using the format <+secrets.getValue("my-access-token")>. For more information, go to Add and Reference Text Secrets.
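In the step YAML, the secret reference might look like the following sketch. The auth.access_token key is an assumption about how the Access Token field is stored, and my-access-token is the example secret identifier used above.

```yaml
spec:
  auth:
    # Reference the Harness text secret. Never paste the token itself here.
    access_token: <+secrets.getValue("my-access-token")>
```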

Log Level

The minimum severity of the messages you want to include in your scan logs. You can specify one of the following:

DEBUG
INFO
WARNING
ERROR

Additional CLI flags

Use this field to run the semgrep scanner with flags such as:

--severity=ERROR --use-git-ignore

With these flags, semgrep reports only findings from rules with ERROR severity and skips files that .gitignore excludes.
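In the pipeline YAML, these flags might be stored as shown in this sketch. The advanced.args.cli key is an assumption, so confirm it against the YAML generated when you fill in Additional CLI flags in the visual editor.

```yaml
spec:
  advanced:
    args:
      # Assumed key for the Additional CLI flags field
      cli: "--severity=ERROR --use-git-ignore"
```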

Caution

Passing additional CLI flags is an advanced feature. Harness recommends the following best practices:

Test your flags and arguments thoroughly before you use them in your Harness pipelines. Some flags might not work in the context of STO.
Don't add flags that are already used in the default configuration of the scan step.

To check the default configuration, go to a pipeline execution where the scan step ran with no additional flags. Check the log output for the scan step. You should see a line like this:

Command [ scancmd -f json -o /tmp/output.json ]

In this case, don't add -f or -o to Additional CLI flags.

Fail on Severity

Every STO scan step has a Fail on Severity setting. If the scan finds any vulnerability with the specified severity level or higher, the pipeline fails automatically. You can specify one of the following:

CRITICAL
HIGH
MEDIUM
LOW
INFO
NONE — Do not fail on severity
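For example, to fail the pipeline on any HIGH or CRITICAL issue, the step spec might include something like the following. The fail_on_severity key and its lowercase value are assumptions modeled on the log level setting in the pipeline example on this page.

```yaml
spec:
  advanced:
    log:
      level: info
    fail_on_severity: high   # assumed key; fails the pipeline on HIGH or CRITICAL issues
```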

Settings

You can use this field to specify environment variables for your scanner.

Additional Configuration

The fields under Additional Configuration vary by infrastructure type, so some fields might not appear in your settings. The available fields are:

Override Security Test Image
- Container Registry
- Image Tag
Privileged
Image Pull Policy
Run as User
Set Container Resources
Timeout

Advanced settings

In the Advanced settings, you can use the following options:

Configure Semgrep as a Built-in Scanner

The Semgrep scanner is available as a built-in scanner in STO. Configuring it as a built-in scanner enables the step to automatically perform scans using the free version without requiring any licenses. Follow these steps to set it up:

Search for SAST in the step palette or navigate to the Built-in Scanners section and select the SAST step.
Expand the Additional CLI Flags section if you want to configure optional CLI flags.
Click Add Scanner to save the configuration.

The scanner automatically uses the free version and detects scan targets. To configure it further, select the step at any time.

Proxy settings

This step supports private network connectivity if you're using Harness Cloud infrastructure. For information on connectivity options, see Private network connectivity options. When using proxy configurations, the HTTPS_PROXY and HTTP_PROXY variables are automatically set to route traffic through the secure tunnel. If there are specific addresses that you want to bypass the proxy, you can define those in the NO_PROXY variable. This can be configured in the Settings of your step.

If you need to configure a different proxy, you can manually set the HTTPS_PROXY, HTTP_PROXY, and NO_PROXY variables in the Settings of your step.

Definitions of Proxy variables:

HTTPS_PROXY: The proxy server for HTTPS requests, for example https://sc.internal.harness.io:30000
HTTP_PROXY: The proxy server for HTTP requests, for example http://sc.internal.harness.io:30000
NO_PROXY: A comma-separated list of domains that should bypass the proxy. Use this to exclude specific traffic from being routed through the proxy.
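If you set these variables manually, the Settings of the step might translate to YAML along these lines. This sketch assumes that Settings are stored as key-value pairs under a settings key. The proxy addresses are the examples above, and the NO_PROXY entries are hypothetical.

```yaml
spec:
  settings:
    HTTPS_PROXY: https://sc.internal.harness.io:30000
    HTTP_PROXY: http://sc.internal.harness.io:30000
    # Hypothetical hosts and domains that should bypass the proxy
    NO_PROXY: localhost,127.0.0.1,.example.internal
```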

YAML pipeline example

The following pipeline example illustrates an orchestration workflow. It consists of a Semgrep step that scans a code repository and then ingests, normalizes, and deduplicates the results.

pipeline:
  name: semgrep-orch-test
  identifier: semgreporchtest
  projectIdentifier: default
  orgIdentifier: default
  tags: {}
  properties:
    ci:
      codebase:
        connectorRef: YOUR_GIT_CONNECTOR_ID
        repoName: YOUR_GIT_REPO_NAME
        build: <+input>
  stages:
    - stage:
        name: semgrep-orch
        identifier: semgreporch
        description: ""
        type: SecurityTests
        spec:
          cloneCodebase: true
          platform:
            os: Linux
            arch: Amd64
          runtime:
            type: Cloud
            spec: {}
          execution:
            steps:
              - step:
                  type: Semgrep
                  name: Semgrep_1
                  identifier: Semgrep_1
                  spec:
                    mode: orchestration
                    config: default
                    target:
                      type: repository
                      detection: auto
                    advanced:
                      log:
                        level: info
