
Semgrep with self-hosted Ubuntu runners in Azure Pipelines

Semgrep provides a sample configuration for Azure-hosted runners. If you use self-hosted Ubuntu Linux runners, you have significantly more control over their configuration, but as a result, they require additional preparation and configuration to run Semgrep.

This guide describes two approaches to configuring self-hosted runners that use Ubuntu (the default self-hosted option for Azure DevOps Linux runners): installing Semgrep with pipx and installing it with uv.

Both pipx and uv install Semgrep into an isolated environment, which avoids conflicts between the system-managed Python and a user-installed Python.

Using pipx

pipx installs standalone Python applications into isolated environments. This is the recommended approach for installing Semgrep on a self-hosted runner.

Prepare your runner

Access the runner and execute the following commands:

$ sudo apt update
$ sudo apt install pipx
$ pipx ensurepath

After completing the commands:

  1. Start a new shell session, so that the changes from pipx ensurepath are available.
  2. Ensure the Azure DevOps agent is set up and running.
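The agent may run with a different environment than your interactive shell, so it can help to confirm that the directory pipx installs into is on the PATH. The following is a minimal sketch, assuming pipx uses its default binary directory of ~/.local/bin (this differs if, for example, PIPX_BIN_DIR is set). The helper name path_contains is illustrative and not part of pipx or the Azure DevOps agent.

```shell
# path_contains DIR: succeed if DIR is one of the entries in $PATH.
# Hypothetical helper for illustration only.
path_contains() {
  case ":${PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumption: pipx uses its default binary directory, ~/.local/bin.
if path_contains "${HOME}/.local/bin"; then
  echo "pipx bin directory is on PATH"
else
  echo "pipx bin directory is missing from PATH; run 'pipx ensurepath' and start a new shell"
fi
```

Running the same check as the user account that runs the agent gives the most representative result.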

Create your configuration

  1. Follow the steps provided in the sample configuration for Azure-hosted runners.
  2. Add the following snippet to the azure-pipelines.yml for the repository.

variables:
  - group: Semgrep_Variables

pool:
  name: Default

steps:
  - checkout: self
    clean: true
    fetchDepth: 20
    persistCredentials: true
  - script: |
      pipx install semgrep
      if [ $(Build.SourceBranchName) = "master" ]; then
        echo "Semgrep full scan"
        semgrep ci
      elif [ $(System.PullRequest.PullRequestId) -ge 0 ]; then
        echo "Semgrep diff scan"
        git fetch origin master:origin/master
        export SEMGREP_PR_ID=$(System.PullRequest.PullRequestId)
        export SEMGREP_BASELINE_REF='origin/master'
        semgrep ci
      fi
    env:
      SEMGREP_APP_TOKEN: $(SEMGREP_APP_TOKEN)

Customizing the configuration
  • If your self-hosted runner agent pool has a different name, update the name key under pool to match the desired agent pool.
  • If your default branch is not called master, update the references to master to match the name of your default branch.
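The branch logic in the script step can be sketched as a small shell function, which keeps the default branch name in one place when you customize it. This is an illustration rather than Semgrep's sample: the function name scan_mode and its three arguments are hypothetical. In a pipeline, the first two values would come from $(Build.SourceBranchName) and $(System.PullRequest.PullRequestId).

```shell
# scan_mode BRANCH PR_ID DEFAULT_BRANCH
# Prints "full" for a build of the default branch, "diff" when a numeric
# pull request ID is present, and "skip" otherwise.
# Hypothetical helper for illustration only.
scan_mode() {
  if [ "$1" = "$3" ]; then
    echo "full"
  else
    case "$2" in
      ''|*[!0-9]*) echo "skip" ;;   # empty or non-numeric: not a pull request build
      *) echo "diff" ;;
    esac
  fi
}

scan_mode "main" "" "main"         # prints "full"
scan_mode "feature-x" "42" "main"  # prints "diff"
scan_mode "feature-x" "" "main"    # prints "skip"
```

Unlike the -ge 0 comparison in the snippet, the case pattern treats an empty or non-numeric pull request ID as "not a pull request" instead of producing a shell error.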

Set environment variables in Azure Pipelines

Semgrep minimally requires the variable SEMGREP_APP_TOKEN in order to report results to the platform, and other variables may be helpful as well. To set these variables in Azure Pipelines:

  1. Set up a variable group called Semgrep_Variables.
  2. Set SEMGREP_APP_TOKEN in the variable group, following the steps for secret variables. The variable is mapped into the env in the provided config.
  3. Optional: Add the following environment variables to the group if you aren't seeing hyperlinks to the code that generated a finding, or if you are not receiving PR or MR comments. Review the use of these variables at Environment variables for creating hyperlinks in Semgrep AppSec Platform. These variables are not sensitive and do not need to be secret variables.
    • SEMGREP_REPO_NAME
    • SEMGREP_REPO_URL
    • SEMGREP_BRANCH
    • SEMGREP_COMMIT
    • SEMGREP_JOB_URL
  4. Set variables for diff-aware scanning. The provided config sets SEMGREP_PR_ID to the system variable System.PullRequest.PullRequestId and SEMGREP_BASELINE_REF to origin/master within the script section of the config. The value of SEMGREP_BASELINE_REF is typically your trunk or default branch, such as main or master, so if you use a different branch than master, update the name accordingly.
    • If you prefer not to implement diff-aware scanning, you can skip setting these variables and remove the elif section of the script step.
  5. For diff-aware scans: add a build validation policy. Adding and enabling a branch policy for build validation is required to trigger Azure Pipelines on pull requests.
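A short pre-flight check can surface a missing variable before the scan starts. The sketch below is an illustration under stated assumptions: the helper name is_set is hypothetical, and only SEMGREP_APP_TOKEN is treated as required, following the description above. The remaining variables are reported but do not stop anything.

```shell
# is_set NAME: succeed if the environment variable NAME is set and non-empty.
# Hypothetical helper for illustration only.
is_set() {
  eval "value=\${$1:-}"
  [ -n "$value" ]
}

# Required: without the token, results cannot be reported to the platform.
if ! is_set SEMGREP_APP_TOKEN; then
  echo "SEMGREP_APP_TOKEN is not set; results cannot be reported" >&2
fi

# Optional: these only affect hyperlinks and PR or MR comments.
for name in SEMGREP_REPO_NAME SEMGREP_REPO_URL SEMGREP_BRANCH SEMGREP_COMMIT SEMGREP_JOB_URL; do
  is_set "$name" || echo "optional variable $name is not set"
done
```

The check prints variable names only, never values, so the secret token does not appear in the log.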

Using uv

uv is a fast Python package and project manager. Its uv tool install command installs standalone Python applications into isolated environments, similar to pipx.

Prepare your runner

Access the runner and install uv following Astral's installation instructions, for example:

$ curl -LsSf https://astral.sh/uv/install.sh | sh

After installing, ensure the Azure DevOps agent is set up and running.
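Before relying on the pipeline, you may want to confirm that the commands resolve for the account that runs the agent. The following is a minimal sketch; the helper name have_command is hypothetical. If uv's tool directory turns out not to be on the PATH, uv provides the uv tool update-shell command to add it.

```shell
# have_command NAME: succeed if NAME resolves to a command in this shell.
# Hypothetical helper for illustration only.
have_command() {
  command -v "$1" >/dev/null 2>&1
}

# semgrep is expected to be missing until "uv tool install semgrep" has run.
for tool in uv semgrep; do
  if have_command "$tool"; then
    echo "$tool found"
  else
    echo "$tool not found on PATH"
  fi
done
```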

Create your configuration

Add the following snippet to the azure-pipelines.yml for the repository.

variables:
  - group: Semgrep_Variables

pool:
  name: Default

steps:
  - checkout: self
    clean: true
    fetchDepth: 20
    persistCredentials: true
  - script: |
      uv tool install semgrep
      if [ $(Build.SourceBranchName) = "master" ]; then
        echo "Semgrep full scan"
        semgrep ci
      elif [ $(System.PullRequest.PullRequestId) -ge 0 ]; then
        echo "Semgrep diff scan"
        git fetch origin master:origin/master
        export SEMGREP_PR_ID=$(System.PullRequest.PullRequestId)
        export SEMGREP_BASELINE_REF='origin/master'
        semgrep ci
      fi
    env:
      SEMGREP_APP_TOKEN: $(SEMGREP_APP_TOKEN)

Customizing the configuration
  • If your self-hosted runner agent pool has a different name, update the name key under pool to match the desired agent pool.
  • If your default branch is not called master, update the references to master to match the name of your default branch.

Set environment variables in Azure Pipelines

Semgrep minimally requires the variable SEMGREP_APP_TOKEN in order to report results to the platform, and other variables may be helpful as well. To set these variables in Azure Pipelines:

  1. Set up a variable group called Semgrep_Variables.
  2. Set SEMGREP_APP_TOKEN in the variable group, following the steps for secret variables. The variable is mapped into the env in the provided config.
  3. Optional: Add the following environment variables to the group if you aren't seeing hyperlinks to the code that generated a finding, or if you are not receiving PR or MR comments. Review the use of these variables at Environment variables for creating hyperlinks in Semgrep AppSec Platform. These variables are not sensitive and do not need to be secret variables.
    • SEMGREP_REPO_NAME
    • SEMGREP_REPO_URL
    • SEMGREP_BRANCH
    • SEMGREP_COMMIT
    • SEMGREP_JOB_URL
  4. Set variables for diff-aware scanning. The provided config sets SEMGREP_PR_ID to the system variable System.PullRequest.PullRequestId and SEMGREP_BASELINE_REF to origin/master within the script section of the config. The value of SEMGREP_BASELINE_REF is typically your trunk or default branch, such as main or master, so if you use a different branch than master, update the name accordingly.
    • If you prefer not to implement diff-aware scanning, you can skip setting these variables and remove the elif section of the script step.
  5. For diff-aware scans: add a build validation policy. Adding and enabling a branch policy for build validation is required to trigger Azure Pipelines on pull requests.
