Notification Channels¶
Architecture reference
For the CRD specification, delivery orchestration, and retry internals, see Architecture: Notification Pipeline.
Kubernaut sends notifications at key points in the remediation lifecycle: when human approval is required, when a remediation fails, when manual review is needed, and when a remediation completes. Notifications are routed through configurable channels using an AlertManager-style routing configuration.
Cluster context: Message bodies include the cluster display name and UUID near the top (Cluster: <name> (<uuid>)) on every channel, including timeout notifications.
Channel Overview¶
| Channel | Status | Description |
|---|---|---|
| Console | Implemented | Writes to controller-runtime log output (stdout) |
| File | Implemented | Writes notification JSON/YAML to files (E2E/testing) |
| Log | Implemented | Structured JSON Lines to stdout for log aggregation |
| Slack | Implemented | Sends Block Kit messages via Incoming Webhooks |
| PagerDuty | Implemented (v1.4) | Events API v2 delivery with circuit breaker |
| Microsoft Teams | Implemented (v1.4) | Adaptive Card delivery with circuit breaker |
| | Schema-defined | Not yet implemented |
| SMS | Schema-defined | Not yet implemented |
| Webhook | Schema-defined | Not yet implemented |
Workflow name enrichment
Notification bodies automatically resolve workflow UUIDs to human-readable workflow names (e.g., "RollbackDeployment" instead of a UUID) when the workflow exists in the catalog. If resolution fails, the original UUID is preserved. See Architecture: Notification Enrichment for details.
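The fallback behaviour described above can be sketched as follows. This is a minimal illustration, not Kubernaut's implementation: `enrich_body`, the regex-based scan of the body, and the in-memory `catalog` dictionary are all assumptions made for the example.

```python
import re

# Matches the canonical 8-4-4-4-12 hexadecimal UUID layout.
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)


def enrich_body(body: str, catalog: dict[str, str]) -> str:
    """Replace workflow UUIDs found in the catalog with their names.

    `catalog` is a hypothetical mapping of workflow UUID -> workflow name.
    UUIDs that are not in the catalog are left untouched.
    """

    def replace(match: re.Match) -> str:
        workflow_id = match.group(0)
        return catalog.get(workflow_id, workflow_id)

    return UUID_RE.sub(replace, body)
```

Because unknown UUIDs pass through unchanged, a failed lookup never removes information from the message.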
Routing Configuration¶
Notifications are routed using a ConfigMap with an AlertManager-style route + receivers structure.
ConfigMap: notification-routing-config
When neither notification.routing.content nor notification.routing.existingConfigMap is set, the chart generates a default routing config based on Helm values:
Default when notification.slack.secretName is set (Slack + console):
route:
  receiver: slack-and-console
receivers:
  - name: slack-and-console
    consoleConfigs:
      - enabled: true
    slackConfigs:
      - channel: "#kubernaut-alerts"
        credentialRef: slack-webhook
Default when notification.slack.secretName is empty (console only):
The catch-all receiver routes all notification types to the configured channels. New types added in future releases are automatically covered. Avoid matching specific types unless you intentionally want to suppress certain notifications from a channel. See Notification Routing ConfigMap for the complete reference.
Match Fields¶
Routes match on notification attributes:
| Match Key | Source | Example Values |
|---|---|---|
| type | Notification type | Escalation, Simple, StatusUpdate, Approval, ManualReview, Completion |
| severity | Signal severity | critical, high, medium, low |
| priority | Notification priority | Critical, High, Medium, Low |
| phase | Remediation phase | signal-processing, ai-analysis, executing |
| environment | Namespace environment | production, staging, development |
| review-source | Why review was triggered | WorkflowResolutionFailed, ExhaustedRetries |
Match key naming
Routing match keys use kebab-case (e.g., review-source) in the YAML routing configuration. The architecture reference documents the same attributes using their Go struct field names (e.g., reviewSource). Both refer to the same underlying attribute.
Routing Logic¶
- First matching route wins (depth-first evaluation)
- Child routes are evaluated before the parent
- The default receiver is used when no route matches
- Routing configuration supports hot-reload via fsnotify FileWatcher (#244): when the kubelet syncs the projected ConfigMap volume (~60s), the watcher detects the change and reloads the routing rules without a pod restart
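The evaluation order in the list above can be sketched as a small recursive function. This is an illustrative model only: the nested `routes` key, exact-equality matching, and the name `resolve_receiver` are assumptions, and the real controller may differ in detail (for example, in how a child route without its own receiver is handled).

```python
from typing import Optional


def matches(route: dict, attrs: dict) -> bool:
    """A route matches when every key in its `match` block equals the attribute."""
    return all(attrs.get(key) == value for key, value in route.get("match", {}).items())


def resolve_receiver(route: dict, attrs: dict) -> Optional[str]:
    """Depth-first resolution: children are tried in order before the parent."""
    if not matches(route, attrs):
        return None
    for child in route.get("routes", []):
        receiver = resolve_receiver(child, attrs)
        if receiver is not None:
            return receiver  # first matching child wins
    # No child matched: fall back to this route's own receiver.
    return route.get("receiver")


# The root route has no `match` block, so it matches everything and its
# receiver acts as the default.
root = {
    "receiver": "slack-and-console",
    "routes": [
        {
            "match": {"type": "ManualReview", "review-source": "AIAnalysis"},
            "receiver": "workflow-authors-channel",
        },
        {"match": {"type": "ManualReview"}, "receiver": "ops-failures-channel"},
    ],
}
```

In this model, a ManualReview notification from AIAnalysis resolves to `workflow-authors-channel`, any other ManualReview resolves to `ops-failures-channel`, and everything else falls through to the default receiver. Ordering more specific routes before broader ones matters because evaluation stops at the first match.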
Per-Channel Setup¶
Console¶
Console delivery writes notifications to the controller log via controller-runtime. Enabled by default for local development.
Helm configuration:
Routing receiver:
File¶
File delivery writes the full NotificationRequest content to files. Primarily useful for E2E testing and debugging.
Helm configuration:
Files are written atomically (temp file then rename) with the naming pattern:
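The temp-file-then-rename step can be sketched as below. This is a generic illustration of the technique rather than the controller's code; `write_atomically` is a hypothetical helper, and the target file name is supplied by the caller instead of following Kubernaut's naming pattern.

```python
import os
import tempfile


def write_atomically(path: str, content: str) -> None:
    """Write `content` to `path` so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one
    # filesystem, which is what makes the final step atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # atomically replaces any existing file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A test harness that polls the output directory therefore sees either the previous file or the complete new one, never a half-written document.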
Log¶
Log delivery sends structured notifications to stdout as JSON Lines, suitable for ingestion by Loki, Elasticsearch, or similar log aggregation systems.
Helm configuration:
JSON output format:
{
  "timestamp": "2026-03-04T12:34:56Z",
  "notification_name": "approval-required-rr-12345",
  "notification_namespace": "kubernaut-system",
  "type": "Approval",
  "priority": "Critical",
  "subject": "Human approval required for OOMKilled remediation",
  "body": "...",
  "metadata": {"environment": "production"},
  "phase": "ai-analysis"
}
Text format: [timestamp] namespace/name subject: body
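As a rough illustration of the two formats, the sketch below renders one record both ways. The field names come from the JSON example above; the function names are hypothetical, and the compact separators are an assumption about the output rather than a documented guarantee.

```python
import json


def to_json_line(record: dict) -> str:
    """Serialize one notification as a single JSON Lines record.

    json.dumps escapes newlines inside strings, so the result always fits
    on one line, which is what line-oriented log shippers expect.
    """
    return json.dumps(record, separators=(",", ":"))


def to_text_line(record: dict) -> str:
    """Render the text format: [timestamp] namespace/name subject: body."""
    return (
        f"[{record['timestamp']}] "
        f"{record['notification_namespace']}/{record['notification_name']} "
        f"{record['subject']}: {record['body']}"
    )
```

Keeping each notification on one line lets Loki or Elasticsearch pipelines parse entries without multi-line stitching rules.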
Slack¶
Slack delivery sends Block Kit messages via Incoming Webhooks.
Message format:
- Header block -- Priority emoji + subject (e.g., :rotating_light: Human approval required for OOMKilled remediation)
- Section block -- Notification body (Markdown converted to Slack mrkdwn)
- Context block -- *Priority:* Critical | *Type:* Approval
The priority emoji varies by level (Critical, High, Medium, Low); the header example above shows the Critical emoji, :rotating_light:.
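The three-block layout can be sketched as a Block Kit payload. The block types (`header`, `section`, `context`) are standard Block Kit; the function name and its parameters are hypothetical, the Markdown-to-mrkdwn conversion is assumed to have happened already, and the emoji is passed in by the caller rather than looked up from a priority table.

```python
def build_slack_payload(
    subject: str,
    body_mrkdwn: str,
    priority: str,
    notification_type: str,
    emoji: str,
) -> dict:
    """Assemble header, section, and context blocks for an Incoming Webhook."""
    return {
        "blocks": [
            {
                # Header blocks only accept plain_text; emoji=True lets Slack
                # render shortcodes such as :rotating_light:.
                "type": "header",
                "text": {"type": "plain_text", "text": f"{emoji} {subject}", "emoji": True},
            },
            {
                "type": "section",
                "text": {"type": "mrkdwn", "text": body_mrkdwn},
            },
            {
                "type": "context",
                "elements": [
                    {
                        "type": "mrkdwn",
                        "text": f"*Priority:* {priority} | *Type:* {notification_type}",
                    }
                ],
            },
        ]
    }
```

The resulting dictionary would be sent as the JSON body of a POST to the webhook URL stored in the referenced credential.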
Routing receiver:
receivers:
  - name: slack-alerts
    slackConfigs:
      - channel: "#kubernaut-alerts"
        credentialRef: slack-webhook  # References a mounted credential
| Field | Description |
|---|---|
| channel | Slack channel to post to |
| credentialRef | Name of the credential file containing the webhook URL |
| username | Optional bot username override |
| iconEmoji | Optional icon emoji override |
Credential Management¶
Notification credentials (webhook URLs, API tokens) are managed via Kubernetes Secrets mounted as projected volumes.
How It Works¶
- Create a Kubernetes Secret with the credential value
- Configure the Helm chart to project the Secret into the notification pod
- Reference the credential name in the routing configuration
Step-by-Step: Slack Webhook¶
1. Create the Secret:
kubectl create secret generic slack-webhook \
  --namespace kubernaut-system \
  --from-literal=webhook-url="https://hooks.slack.com/services/T.../B.../xxx"
2. Configure Helm values:
notification:
  routing:
    content: ""  # Or provide via --set-file notification.routing.content=routing.yaml
  credentials:
    - name: slack-webhook        # Credential name (used in routing credentialRef)
      secretName: slack-webhook  # Kubernetes Secret name
      secretKey: webhook-url     # Key within the Secret
3. Reference in routing:
receivers:
  - name: slack-alerts
    slackConfigs:
      - channel: "#kubernaut-alerts"
        credentialRef: slack-webhook  # Matches credential name above
Directory Structure¶
Credentials are mounted at /etc/notification/credentials/:
Each credential is a single file where the filename is the credential name and the content is the secret value.
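Reading that layout is straightforward: one file per credential, keyed by filename. The sketch below is an assumption-laden illustration; `load_credentials` is a hypothetical helper, and stripping surrounding whitespace from the value is a choice made for the example, not documented behaviour.

```python
from pathlib import Path


def load_credentials(directory: str) -> dict[str, str]:
    """Map each credential name (filename) to its secret value (file content)."""
    credentials = {}
    for entry in Path(directory).iterdir():
        # Kubernetes volume mounts contain hidden bookkeeping entries such as
        # ..data; skip those and anything that is not a regular file.
        if entry.name.startswith(".") or not entry.is_file():
            continue
        credentials[entry.name] = entry.read_text().strip()
    return credentials
```

With this shape, a `credentialRef: slack-webhook` in the routing configuration corresponds to a dictionary lookup on the key `slack-webhook`.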
Hot-Reload¶
Credentials support hot-reload via fsnotify. When a Secret is updated, the kubelet syncs the projected volume (~60s), and the file watcher detects the change and reloads the credential cache. No pod restart required.
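The reload-on-change idea can be approximated without a file watcher. The sketch below swaps fsnotify for a simple check of file metadata on each read, so it illustrates the caching behaviour only; `ReloadingCredential` is a hypothetical class, not part of Kubernaut.

```python
import os


class ReloadingCredential:
    """Caches one credential file and re-reads it when the file changes on disk.

    This polls file metadata on every access instead of subscribing to
    filesystem events, which is a simplification of the fsnotify approach.
    """

    def __init__(self, path: str) -> None:
        self.path = path
        self._stamp = None
        self._value = None

    def get(self) -> str:
        stat = os.stat(self.path)  # follows symlinks, as projected volumes use them
        stamp = (stat.st_mtime_ns, stat.st_size, stat.st_ino)
        if stamp != self._stamp:
            with open(self.path) as handle:
                self._value = handle.read().strip()
            self._stamp = stamp
        return self._value
```

Either way, the effect for the operator is the same: rotating the Secret changes the value used for the next delivery once the kubelet has synced the volume.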
Retry Policy¶
The notification controller retries deliveries with exponential backoff and applies circuit breaker logic. See Architecture: Notification Pipeline for retry defaults and error classification.
Per-Notification Override¶
The retry policy can be overridden per NotificationRequest via the spec.retryPolicy field:
spec:
  retryPolicy:
    maxAttempts: 3
    initialBackoffSeconds: 10
    backoffMultiplier: 2
    maxBackoffSeconds: 120
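To see how these four fields interact, the sketch below computes the wait before each retry. It assumes that `maxAttempts` counts the initial attempt, that each delay is the previous one multiplied by `backoffMultiplier` and capped at `maxBackoffSeconds`, and that no jitter is applied; consult the architecture reference for the controller's actual behaviour.

```python
def backoff_schedule(
    max_attempts: int,
    initial_backoff_seconds: float,
    backoff_multiplier: float,
    max_backoff_seconds: float,
) -> list[float]:
    """Return the delay (in seconds) waited before each retry.

    With N attempts there are N - 1 waits, one between each pair of
    consecutive attempts.
    """
    delays = []
    delay = initial_backoff_seconds
    for _ in range(max_attempts - 1):
        delays.append(min(delay, max_backoff_seconds))
        delay *= backoff_multiplier
    return delays
```

Under these assumptions, the policy above yields waits of 10 and 20 seconds; the 120-second cap would only take effect if `maxAttempts` were raised to 6 or more.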
Routing Block and Terminal Failure Notifications¶
Since v1.3, Kubernaut emits notifications for block reasons and terminal failures that were previously silent. Operators should ensure their routing rules cover these NotificationRequest (NR) types.
Escalation NRs (block reasons and terminal failures)¶
Match type: Escalation to capture:
- Block-reason escalations (nr-block-consecutivefailures-*, nr-block-unmanagedresource-*) -- persistent blocks requiring operator investigation
- Terminal failure escalations (nr-escalation-*) -- failure paths that previously had no notification
These are High priority. Route to your primary ops investigation channel.
StatusUpdate NRs (transient blocks)¶
Match type: StatusUpdate with priority: Low to capture transient block notifications (DuplicateInProgress, ResourceBusy, RecentlyRemediated, ExponentialBackoff). These are informational -- route to a low-priority channel or suppress in high-traffic environments.
ManualReview NRs by source¶
ManualReview NRs can be further distinguished by review-source to route different failure types to different teams:
| review-source | Meaning | Suggested action |
|---|---|---|
| WorkflowExecution | Execution failure | Ops investigation |
| AIAnalysis | AI couldn't recommend a workflow | Catalog update / workflow authoring |
| RoutingEngine | Repeated ineffective remediations | Root cause investigation |
routes:
  - match:
      type: ManualReview
      review-source: WorkflowExecution
    receiver: ops-failures-channel
  - match:
      type: ManualReview
      review-source: AIAnalysis
    receiver: workflow-authors-channel
  - match:
      type: ManualReview
      review-source: RoutingEngine
    receiver: ops-escalation-channel
See Notification Pipeline: NR Naming Conventions for the complete NR naming catalog.
Enabling Slack: End-to-End Walkthrough¶
1. Create a Slack Incoming Webhook in your workspace (Apps > Incoming Webhooks > Add to Channel)
2. Create the Kubernetes Secret, as shown in Step-by-Step: Slack Webhook above.
3. Provide a routing config (via --set-file or in a values file):
   helm upgrade kubernaut charts/kubernaut \
     --namespace kubernaut-system \
     --set-file notification.routing.content=charts/kubernaut/examples/notification-routing.yaml \
     -f values.yaml
   Ensure your values file includes the credential mount (the credentials entry shown in Step-by-Step: Slack Webhook).
4. Verify that the notification pod has the credential mounted under /etc/notification/credentials/.
Slack notifications will now be sent for any route that uses a slackConfigs receiver.
PagerDuty Setup (v1.4)¶
PagerDuty delivery uses the Events API v2 to create incidents from Kubernaut notifications. The adapter includes a circuit breaker for graceful degradation under PagerDuty outages.
Configuration¶
1. Create the routing key secret:
2. Add the credential mount in Helm values:
3. Add a PagerDuty receiver to the routing config:
4. Route notifications to PagerDuty:
Microsoft Teams Setup (v1.4)¶
Microsoft Teams delivery sends Adaptive Card messages via incoming webhooks. The adapter includes a circuit breaker matching the pattern used by Slack and PagerDuty.
Configuration¶
1. Create the webhook URL secret:
2. Add the credential mount in Helm values:
3. Add a Teams receiver to the routing config:
4. Route notifications to Teams:
The teams channel is available as an audit channel enum value for audit event routing.
Next Steps¶
- Architecture: Notification Pipeline -- CRD specification, delivery orchestration, and retry internals
- Configuration Reference -- Full operator configuration reference
- Human Approval -- The approval notification flow
- Rego Policies -- Policies that influence notification triggers