AI SRE Onboarding Guide for Administrators
This guide introduces the capabilities of Harness AI SRE, which helps you proactively manage and resolve incidents using real-time insights, alerts, and integrations with your existing tools.
When you configure AI SRE in Harness, you orchestrate intelligent incident detection, automated response workflows, and collaborative resolution processes across your monitoring and communication tools.
Prerequisites
Before beginning the walkthroughs in this guide, ensure you have:
| Item | Details / Link |
|---|---|
| Harness account | AI SRE feature flag enabled (contact your sales representative or email ai-sre-support@harness.io) |
| Monitoring tools | Integration with monitoring systems like Datadog, New Relic, or Grafana |
| Communication platforms | Slack, Microsoft Teams, or Zoom for incident collaboration |
| On-call management | PagerDuty, OpsGenie, or similar on-call scheduling tools (optional) |
Go to What's supported with Harness AI SRE for a full list of supported monitoring & observability tools, communication & collaboration platforms, and on-call & escalation management tools.
Onboarding Steps
Follow these steps to get started with AI SRE:
- Integrate Tools: Connect your collaboration and monitoring tools.
- Set Up Incident Types: Define incident types and severity levels.
- Configure Webhooks: Enable external tools to create alerts.
- Create Runbooks: Automate incident response workflows.
- Expression Languages: Learn about CEL and Mustache for dynamic content.
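To give a rough idea of what "dynamic content" means in the last step, the sketch below shows Mustache-style substitution, where `{{name}}` tags in a template are replaced with values from an incoming alert. It is a minimal illustration only, not how Harness AI SRE evaluates templates: the `render` helper and the alert field names (`service`, `severity`, `summary`) are hypothetical, and real Mustache adds features this sketch omits, such as sections and HTML escaping.

```python
import re

# Hypothetical alert payload; the field names are illustrative assumptions,
# not a Harness AI SRE schema.
alert = {
    "service": "checkout-api",
    "severity": "SEV2",
    "summary": "p99 latency above threshold",
}


def render(template: str, context: dict) -> str:
    """Replace each {{name}} tag with context[name].

    Names missing from the context render as empty text, which matches
    Mustache's default behavior for unknown variables.
    """
    def substitute(match: re.Match) -> str:
        return str(context.get(match.group(1), ""))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


message = render("[{{severity}}] {{service}}: {{summary}}", alert)
print(message)  # [SEV2] checkout-api: p99 latency above threshold
```

A template like this could, for example, form the text of a notification that a runbook posts to a chat channel. For the exact syntax and fields available in AI SRE, refer to the Expression Languages page.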
Next Steps
After completing the onboarding steps, enhance your incident response capabilities:
- Go to Managing Incidents in Slack to use Slack slash commands for incident management.
- Go to Advanced Runbooks to build sophisticated automation workflows.
- Go to Integration Library to connect with ServiceNow, Jira, and other ITSM tools.
- Go to AI Scribe Agent to leverage AI-powered documentation and insights.