AI SRE Onboarding Guide for Administrators

This guide introduces Harness AI SRE, which helps you proactively manage and resolve incidents through real-time insights, alerts, and integrations with your existing tools.

When you configure AI SRE in Harness, you orchestrate intelligent incident detection, automated response workflows, and collaborative resolution processes across your monitoring and communication tools.

Prerequisites

Before beginning the walkthroughs in this guide, ensure you have:

Item | Details / Link
Harness account | AI SRE Feature flag enabled (contact your sales representative or reach out to the team at ai-sre-support@harness.io)
Monitoring tools | Integration with monitoring systems like Datadog, New Relic, or Grafana
Communication platforms | Slack, Microsoft Teams, or Zoom for incident collaboration
On-call management | PagerDuty, OpsGenie, or similar on-call scheduling tools (optional)
Supported tools & platforms

Go to What's supported with Harness AI SRE for a full list of supported monitoring & observability tools, communication & collaboration platforms, and on-call & escalation management tools.


Onboarding Steps

Follow these steps to get started with AI SRE:

  1. Integrate Tools: Connect your collaboration and monitoring tools
  2. Set Up Incident Types: Define incident types and severity levels
  3. Configure Webhooks: Enable external tools to create alerts
  4. Create Runbooks: Automate incident response workflows
  5. Expression Languages: Learn about CEL and Mustache for dynamic content
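
To give a feel for what "dynamic content" means in steps 3 and 5, here is a minimal Python sketch of Mustache-style variable interpolation applied to an alert payload. It is an illustration only: the payload fields (`title`, `severity`, `service`) and the helpers `lookup` and `render` are hypothetical and are not taken from the Harness AI SRE schema or API. The sketch covers only plain `{{name}}` tags with dotted names; real Mustache also HTML-escapes values and supports sections, which are omitted here.

```python
import json
import re

# Hypothetical alert payload, as a monitoring tool might send it to a webhook.
# Field names are illustrative, not the Harness AI SRE schema.
alert = {
    "title": "High error rate",
    "severity": "SEV2",
    "service": {"name": "checkout", "env": "prod"},
}

# Matches simple Mustache variable tags such as {{severity}} or {{ service.name }}.
TAG = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def lookup(context, dotted_name):
    """Resolve a dotted name such as 'service.name' against nested dicts."""
    value = context
    for key in dotted_name.split("."):
        if not isinstance(value, dict) or key not in value:
            return ""  # Mustache renders missing variables as empty strings
        value = value[key]
    return str(value)


def render(template, context):
    """Replace each {{name}} tag with its value from the context."""
    return TAG.sub(lambda m: lookup(context, m.group(1)), template)


template = "[{{severity}}] {{title}} on {{service.name}} ({{service.env}})"
message = render(template, alert)
body = json.dumps(alert)  # JSON body a tool could POST to a webhook endpoint

print(message)  # [SEV2] High error rate on checkout (prod)
```

The same payload could also drive a condition (for example, "severity equals SEV2") written in CEL; consult the Expression Languages step for the syntax that AI SRE actually accepts.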

Next Steps

After completing the onboarding steps, enhance your incident response capabilities: