Custom Resources (CRDs)

Kubernaut API reference for all Custom Resource Definitions.

API Group: kubernaut.ai/v1alpha1

AIAnalysis

AIAnalysis is the Schema for the aianalyses API.

Field Type Description
apiVersion string kubernaut.ai/v1alpha1
kind string AIAnalysis
metadata ObjectMeta Refer to the Kubernetes API documentation for fields of metadata.
spec AIAnalysisSpec
status AIAnalysisStatus

AIAnalysisReason

Underlying type: string

AIAnalysisReason represents the umbrella failure or completion reason.

Appears in: - AIAnalysisStatus

Validation: - Enum: [AnalysisCompleted WorkflowResolutionFailed WorkflowNotNeeded NoWorkflowSelected RegoEvaluationError TransientError APIError]

Value Description
AnalysisCompleted
WorkflowResolutionFailed
WorkflowNotNeeded
NoWorkflowSelected
RegoEvaluationError
TransientError
APIError

AIAnalysisSpec

AIAnalysisSpec defines the desired state of AIAnalysis.

Spec Immutability: AIAnalysis represents an immutable event (AI investigation). Once created by RemediationOrchestrator, the spec cannot be modified. This ensures:
- Audit trail integrity (AI investigation matches original RCA request)
- No tampering with RCA targets post-HAPI validation
- No workflow selection modification after AI recommendation

To re-analyze, delete and recreate the AIAnalysis CRD.

Appears in: - AIAnalysis

Field Type Description
remediationRequestRef ObjectReference Reference to parent RemediationRequest CRD for audit trail
remediationId string Remediation ID for audit correlation
analysisRequest AnalysisRequest Complete analysis request with structured context
timeoutConfig AIAnalysisTimeoutConfig Optional timeout configuration for this analysis.
Replaces the deprecated annotation-based timeout (security + validation).
Passed through from RR.Status.TimeoutConfig.AIAnalysisTimeout by RO (moved to Status).
If nil, the AIAnalysis controller uses defaults (Investigating: 60s, Analyzing: 5s).

AIAnalysisStatus

AIAnalysisStatus defines the observed state of AIAnalysis.

Appears in: - AIAnalysis

Field Type Description
observedGeneration integer ObservedGeneration is the most recent generation observed by the controller.
Used to prevent duplicate reconciliations and ensure idempotency.
Standard pattern for all Kubernetes controllers.
phase string Phase tracking (no "Approving" or "Recommending" phase - simplified 4-phase flow)
message string
reason AIAnalysisReason Reason provides the umbrella failure or completion category.
subReason string SubReason provides specific failure cause within the Reason category
Maps to needs_human_review triggers from HolmesGPT-API
Added InvestigationInconclusive, ProblemResolved for new investigation outcomes
startedAt Time Timestamps
completedAt Time
rootCause string Identified root cause
rootCauseAnalysis RootCauseAnalysis Root cause analysis details
selectedWorkflow SelectedWorkflow Selected workflow for execution (populated when phase=Completed)
alternativeWorkflows AlternativeWorkflow array ALTERNATIVE WORKFLOWS
Alternative workflows considered but not selected.
INFORMATIONAL ONLY - NOT for automatic execution.
Helps operators make informed approval decisions and provides audit trail.
Per HolmesGPT-API team: Alternatives are for CONTEXT, not EXECUTION.
approvalRequired boolean True if approval is required (confidence < 80% or policy requires)
approvalReason string Reason why approval is required (when ApprovalRequired=true)
approvalContext ApprovalContext Rich context for approval notification
needsHumanReview boolean Set by HAPI when AI cannot produce reliable result
True if human review required (HAPI decision: RCA incomplete/unreliable)
Triggers NotificationRequest creation in RO
BR-496 v2: Set when root_owner missing (rca_incomplete) or validation/confidence issues.
humanReviewReason string Reason why human review needed (when NeedsHumanReview=true)
Maps to HAPI's human_review_reason enum values
actionability string #388: LLM's assessment of whether the alert warrants action.
Empty when not yet assessed (pre-investigation or error paths).
"Actionable" when the LLM determines the alert warrants action (default for all processed alerts).
"NotActionable" when the LLM determines the alert is benign (e.g., orphaned PVCs).
investigationId string HolmesGPT investigation ID for correlation
investigationTime integer Investigation duration in seconds
warnings string array Non-fatal warnings from HolmesGPT-API (e.g., low confidence)
validationAttemptsHistory ValidationAttempt array ValidationAttemptsHistory contains complete history of all HAPI validation attempts
HAPI retries up to 3 times with LLM self-correction.
This field provides audit trail for operator notifications and debugging
degradedMode boolean DegradedMode indicates if the analysis ran with degraded capabilities
(e.g., Rego policy evaluation failed, using safe defaults)
totalAnalysisTime integer TotalAnalysisTime is the total duration of the analysis in seconds
consecutiveFailures integer ConsecutiveFailures tracks retry attempts for exponential backoff
Reset to 0 on success, increment on transient failure
Used for retry logic with jitter.
investigationSession InvestigationSession InvestigationSession tracks the async HAPI session for the submit/poll pattern.
postRCAContext PostRCAContext Runtime-computed cluster characteristics from HAPI
PostRCAContext holds data computed by HAPI after RCA (e.g., DetectedLabels).
Immutable once set — use CEL validation on the PostRCAContext type.
conditions Condition array Conditions
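
The approvalRequired field above is described as true when "confidence < 80% or policy requires". A minimal sketch of that rule follows, assuming confidence is expressed on the 0.0-1.0 scale used elsewhere in this reference. The function name and the `policy_requires` flag are hypothetical and are not part of the Kubernaut API.

```python
# Threshold taken from the field description ("confidence < 80%").
APPROVAL_CONFIDENCE_THRESHOLD = 0.80


def approval_required(confidence: float, policy_requires: bool) -> bool:
    """Return True when an operator must approve before execution.

    Hypothetical helper that mirrors the documented rule; it is not the
    controller's actual implementation.
    """
    return policy_requires or confidence < APPROVAL_CONFIDENCE_THRESHOLD
```

Under this sketch a confidence of exactly 0.80 does not trigger approval on its own, which follows from the strict "<" in the description.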

AIAnalysisTimeoutConfig

AIAnalysisTimeoutConfig defines timeout settings for AIAnalysis phases

Appears in: - AIAnalysisSpec

Field Type Description
investigatingTimeout Duration Timeout for Investigating phase (HolmesGPT-API call)
Default: 60s if not specified
analyzingTimeout Duration Timeout for Analyzing phase (Rego policy evaluation)
Default: 5s if not specified
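
The default handling described above (60s for Investigating, 5s for Analyzing, applied when the config or an individual field is unset) could be sketched as follows. The class and function names are hypothetical illustrations, not the controller's code.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional, Tuple

# Defaults quoted in the field descriptions above.
DEFAULT_INVESTIGATING = timedelta(seconds=60)
DEFAULT_ANALYZING = timedelta(seconds=5)


@dataclass
class TimeoutConfig:
    """Hypothetical stand-in for AIAnalysisTimeoutConfig."""
    investigating_timeout: Optional[timedelta] = None
    analyzing_timeout: Optional[timedelta] = None


def effective_timeouts(cfg: Optional[TimeoutConfig]) -> Tuple[timedelta, timedelta]:
    """Return (investigating, analyzing) timeouts, filling in defaults."""
    if cfg is None:
        return DEFAULT_INVESTIGATING, DEFAULT_ANALYZING
    investigating = (cfg.investigating_timeout
                     if cfg.investigating_timeout is not None
                     else DEFAULT_INVESTIGATING)
    analyzing = (cfg.analyzing_timeout
                 if cfg.analyzing_timeout is not None
                 else DEFAULT_ANALYZING)
    return investigating, analyzing
```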

ActionLink

ActionLink represents an external service action link

Appears in: - NotificationRequestSpec

Field Type Description
service ActionLinkServiceType Service name (github, grafana, prometheus, kubernetes-dashboard, etc.)
url string Action link URL
label string Human-readable label for the link

ActionLinkServiceType

Underlying type: string

Appears in: - ActionLink

Value Description
grafana
prometheus

ActionType

ActionType is the Schema for the actiontypes API. Kubernetes-native action type taxonomy definition.

Field Type Description
apiVersion string kubernaut.ai/v1alpha1
kind string ActionType
metadata ObjectMeta Refer to the Kubernetes API documentation for fields of metadata.
spec ActionTypeSpec
status ActionTypeStatus

ActionTypeDescription

ActionTypeDescription provides structured information about an action type.

Appears in: - ActionTypeSpec

Field Type Description
what string What describes what this action type concretely does.
whenToUse string WhenToUse describes conditions under which this action type is appropriate.
whenNotToUse string WhenNotToUse describes specific exclusion conditions.
preconditions string Preconditions describes conditions that must be verified before use.

ActionTypeSpec

ActionTypeSpec defines the desired state of ActionType. ActionType CRD lifecycle management.

Appears in: - ActionType

Field Type Description
name string Name is the PascalCase action type identifier (e.g., RestartPod, ScaleReplicas).
Immutable after creation.
description ActionTypeDescription Description provides structured information about the action type.
Only this field is mutable after creation.

ActionTypeStatus

ActionTypeStatus defines the observed state of ActionType.

Appears in: - ActionType

Field Type Description
registered boolean Registered indicates whether the action type has been successfully registered in the DS catalog.
registeredAt Time RegisteredAt is the timestamp of initial registration in the catalog.
registeredBy string RegisteredBy is the identity of the registrant (K8s SA or user).
previouslyExisted boolean PreviouslyExisted indicates if this action type was re-enabled after being disabled.
activeWorkflowCount integer ActiveWorkflowCount is the number of active RemediationWorkflows referencing this action type.
Best-effort, updated asynchronously by the RW admission webhook handler.
catalogStatus CatalogStatus CatalogStatus reflects the DS catalog lifecycle state.

AlternativeApproach

AlternativeApproach describes an alternative approach with pros/cons

Appears in: - ApprovalContext

Field Type Description
approach string Approach description
prosCons string ProsCons analysis

AlternativeWorkflow

AlternativeWorkflow contains alternative workflows considered but not selected. INFORMATIONAL ONLY - NOT for automatic execution. Helps operators understand AI reasoning during approval decisions.

Appears in: - AIAnalysisStatus

Field Type Description
workflowId string Workflow identifier (catalog lookup key)
executionBundle string Execution bundle OCI reference (digest-pinned) - resolved by HolmesGPT-API
confidence float Confidence score (0.0-1.0) - shows why it wasn't selected
rationale string Rationale explaining why this workflow was considered

AnalysisContext

AnalysisContext captures AI analysis results.

Appears in: - NotificationContext

Field Type Description
approvalReason string ApprovalReason explains why approval was required.
rootCause string RootCause is the AI-determined root cause summary.
outcome string Outcome is the remediation outcome (e.g., "Success", "Failed").

AnalysisRequest

AnalysisRequest contains the structured analysis request. Self-contained context for AIAnalysis.

Appears in: - AIAnalysisSpec

Field Type Description
signalContext SignalContextInput Signal context from SignalProcessing enrichment
analysisTypes AnalysisType array Analysis types to perform

AnalysisType

Underlying type: string

AnalysisType represents a type of analysis to perform.

Appears in: - AnalysisRequest

Validation: - Enum: [Investigation RootCause WorkflowSelection]

Value Description
Investigation
RootCause
WorkflowSelection

ApprovalAlternative

ApprovalAlternative describes an alternative approach with pros/cons

Appears in: - RemediationApprovalRequestSpec

Field Type Description
approach string Alternative approach description
prosCons string Pros and cons analysis

ApprovalContext

ApprovalContext contains rich context for approval notifications

Appears in: - AIAnalysisStatus

Field Type Description
reason string Reason why approval is required
confidenceScore float ConfidenceScore from AI analysis (0.0-1.0)
confidenceLevel string ConfidenceLevel: "low" | "medium" | "high"
investigationSummary string InvestigationSummary from HolmesGPT analysis
evidenceCollected string array EvidenceCollected that led to this conclusion
recommendedActions RecommendedAction array RecommendedActions with rationale
alternativesConsidered AlternativeApproach array AlternativesConsidered with pros/cons
whyApprovalRequired string WhyApprovalRequired explains the need for human review
policyEvaluation PolicyEvaluation PolicyEvaluation contains Rego policy evaluation details

ApprovalDecision

Underlying type: string

ApprovalDecision represents the operator's decision on an approval request

Appears in: - RemediationApprovalRequestStatus

Validation: - Enum: ["" Approved Rejected Expired]

Value Description
"" ApprovalDecisionPending (empty string) indicates no decision has been made yet
Approved ApprovalDecisionApproved indicates the operator approved the remediation
Rejected ApprovalDecisionRejected indicates the operator rejected the remediation
Expired ApprovalDecisionExpired indicates the approval request timed out

ApprovalPolicyEvaluation

ApprovalPolicyEvaluation contains Rego policy evaluation results

Appears in: - RemediationApprovalRequestSpec

Field Type Description
policyName string Policy name that was evaluated
matchedRules string array Rules that matched and triggered approval requirement
decision string Policy decision (PascalCase per K8s enum convention, values from PolicyDecision type)

ApprovalRecommendedAction

ApprovalRecommendedAction describes a recommended action with rationale

Appears in: - RemediationApprovalRequestSpec

Field Type Description
action string Action description
rationale string Rationale for this action

BlockClearanceDetails

BlockClearanceDetails tracks the clearing of PreviousExecutionFailed blocks. Required for SOC2 CC7.3 (Immutability), CC7.4 (Completeness), CC8.1 (Attribution). Preserves the audit trail when operators clear execution blocks after investigation.

Appears in: - WorkflowExecutionStatus

Field Type Description
clearedAt Time ClearedAt is the timestamp when the block was cleared
clearedBy string ClearedBy is the Kubernetes user who cleared the block
Extracted from request context (if available) or annotation value
Format: username@domain or service-account:namespace:name
Example: "admin@kubernaut.ai" or "service-account:kubernaut-system:operator"
clearReason string ClearReason is the operator-provided reason for clearing
Required for audit trail accountability
Example: "manual investigation complete, cluster state verified"
clearMethod string ClearMethod indicates how the block was cleared
Annotation: Via kubernaut.ai/clear-execution-block annotation
APIEndpoint: Via dedicated clearing API endpoint (future)
StatusField: Via direct status field update (future)
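
The clearedBy field documents two identity formats, `username@domain` and `service-account:namespace:name`. A small sketch of how a consumer might tell them apart is shown below. The function name and the returned dictionary keys are hypothetical.

```python
from typing import Dict


def parse_cleared_by(value: str) -> Dict[str, str]:
    """Classify a clearedBy identity string into user or service account.

    Hypothetical helper based only on the two documented formats.
    """
    if value.startswith("service-account:"):
        parts = value.split(":")
        # Expect exactly: "service-account", namespace, name (all non-empty).
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"malformed service account identity: {value!r}")
        return {"kind": "service-account", "namespace": parts[1], "name": parts[2]}
    username, sep, domain = value.partition("@")
    if not sep or not username or not domain:
        raise ValueError(f"unrecognized clearedBy format: {value!r}")
    return {"kind": "user", "username": username, "domain": domain}
```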

BlockReason

Underlying type: string

BlockReason represents the reason why a RemediationRequest is blocked (non-terminal).

Appears in: - RemediationRequestStatus

Validation: - Enum: [ConsecutiveFailures DuplicateInProgress ResourceBusy RecentlyRemediated ExponentialBackoff UnmanagedResource IneffectiveChain]

Value Description
ConsecutiveFailures BlockReasonConsecutiveFailures indicates remediation failed 3+ times consecutively.
This is a temporary block with a 1-hour cooldown period.
DuplicateInProgress BlockReasonDuplicateInProgress indicates another RR with the same fingerprint is active.
This prevents Gateway RR flood by keeping the duplicate in non-terminal Blocked state.
ResourceBusy BlockReasonResourceBusy indicates another WorkflowExecution is running on the same target.
This prevents concurrent modifications to the same Kubernetes resource.
RecentlyRemediated BlockReasonRecentlyRemediated indicates the same workflow+target was executed recently.
This enforces a cooldown period (default 5 minutes) to prevent redundant executions.
ExponentialBackoff BlockReasonExponentialBackoff indicates pre-execution failures require a backoff period.
This implements graduated retry for transient infrastructure failures.
UnmanagedResource BlockReasonUnmanagedResource indicates the target resource is not managed by Kubernaut.
The resource or namespace does not have the kubernaut.ai/managed=true label.
RO will retry with exponential backoff (5s → 10s → ... → 5min) until RR times out.
IneffectiveChain BlockReasonIneffectiveChain indicates consecutive remediations for the same target
have been ineffective (resource keeps reverting or health doesn't improve).
Escalates to human review via NotificationRequest.
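
The UnmanagedResource entry above mentions a retry schedule of 5s → 10s → ... → 5min. The sketch below assumes the delay doubles on each attempt and is capped at 5 minutes. The doubling factor is an assumption, since the intermediate steps are elided in the description, and jitter is not modeled.

```python
from datetime import timedelta

BASE_DELAY = timedelta(seconds=5)   # first documented delay
MAX_DELAY = timedelta(minutes=5)    # documented cap


def backoff_delay(attempt: int) -> timedelta:
    """Delay before retry number `attempt` (0-based), doubling up to the cap.

    Assumes a factor of 2 between steps; only 5s, 10s and 5min are documented.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Clamp the exponent so very large attempt counts stay cheap to compute.
    delay = BASE_DELAY * (2 ** min(attempt, 16))
    return min(delay, MAX_DELAY)
```

With these assumptions the schedule is 5s, 10s, 20s, 40s, 80s, 160s, and then 5 minutes for every later attempt.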

DedupContext

DedupContext captures deduplication context.

Appears in: - NotificationContext

Field Type Description
duplicateCount string DuplicateCount is the number of duplicate signals.

DeduplicationStatus

DeduplicationStatus tracks signal occurrence for deduplication. OWNER: Gateway Service (exclusive write access)

Appears in: - RemediationRequestStatus

Field Type Description
firstSeenAt Time FirstSeenAt is when this signal fingerprint was first observed
lastSeenAt Time LastSeenAt is when this signal fingerprint was last observed
occurrenceCount integer OccurrenceCount tracks how many times this signal has been seen

DeliveryAttempt

DeliveryAttempt records a single delivery attempt to a channel

Appears in: - NotificationRequestStatus

Field Type Description
channel DeliveryChannelName Channel name
attempt integer Attempt number (1-based)
timestamp Time Timestamp of this attempt
status DeliveryAttemptStatus Status of this attempt (success, failed, timeout, invalid)
error string Error message if failed
durationSeconds float Duration of delivery attempt in seconds

DeliveryAttemptStatus

Underlying type: string

Appears in: - DeliveryAttempt

Validation: - Enum: [success failed timeout invalid]

Value Description
success
failed
timeout
invalid

DeliveryChannelName

Underlying type: string

Appears in: - DeliveryAttempt

EAComponents

EAComponents tracks the completion state and scores of each assessment component. The EM updates these fields as each component check completes. This enables restart recovery: if EM restarts mid-assessment, it can skip already-completed components by checking these flags.

Appears in: - EffectivenessAssessmentStatus

Field Type Description
healthAssessed boolean HealthAssessed indicates whether the health check has been completed.
healthScore float HealthScore is the health check score (0.0-1.0), nil if not yet assessed.
hashComputed boolean HashComputed indicates whether the spec hash comparison has been completed.
postRemediationSpecHash string PostRemediationSpecHash is the hash of the target resource spec after remediation.
currentSpecHash string CurrentSpecHash is the most recent hash of the target resource spec,
re-computed on each reconcile after HashComputed is true.
If it differs from PostRemediationSpecHash, spec drift was detected.
alertAssessed boolean AlertAssessed indicates whether the alert resolution check has been completed.
alertScore float AlertScore is the alert resolution score (0.0 or 1.0), nil if not yet assessed.
metricsAssessed boolean MetricsAssessed indicates whether the metric comparison has been completed.
metricsScore float MetricsScore is the metric comparison score (0.0-1.0), nil if not yet assessed.
alertDecayRetries integer AlertDecayRetries tracks the number of times the EM re-checked a firing alert
during decay monitoring. Incremented each reconcile where isAlertDecay returns true.
A non-zero value means the EM confirmed the resource was healthy but the alert
persisted, indicating Prometheus lookback window decay.
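
Two behaviors described above lend themselves to a short illustration: skipping already-completed components after a restart, and detecting spec drift by comparing hashes. The sketch below uses a hypothetical Python mirror of EAComponents with only the fields it needs; it is not the EM's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EAComponents:
    """Hypothetical, reduced mirror of the EAComponents status fields."""
    health_assessed: bool = False
    hash_computed: bool = False
    alert_assessed: bool = False
    metrics_assessed: bool = False
    post_remediation_spec_hash: Optional[str] = None
    current_spec_hash: Optional[str] = None


def pending_components(c: EAComponents) -> List[str]:
    """Components still to run, so a restarted EM can skip completed ones."""
    flags = {
        "health": c.health_assessed,
        "hash": c.hash_computed,
        "alert": c.alert_assessed,
        "metrics": c.metrics_assessed,
    }
    return [name for name, done in flags.items() if not done]


def spec_drift_detected(c: EAComponents) -> bool:
    """True when the current hash differs from the post-remediation hash."""
    if not c.hash_computed:
        return False
    if c.current_spec_hash is None or c.post_remediation_spec_hash is None:
        return False  # nothing to compare yet
    return c.current_spec_hash != c.post_remediation_spec_hash
```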

EAConfig

EAConfig contains assessment configuration set by the RO at creation time. StabilizationWindow controls how long the EM waits after remediation before starting assessment checks. HashComputeDelay and AlertCheckDelay are optional Duration-based delays that the RO computes based on target type and signal mode. All other assessment parameters (PrometheusEnabled, AlertManagerEnabled, ValidityWindow) are EM-internal configuration read from effectivenessmonitor.Config. The EM emits individual component audit events to DataStorage; the overall effectiveness score is computed by DataStorage on demand, not by the EM.

Appears in: - EffectivenessAssessmentSpec

Field Type Description
stabilizationWindow Duration StabilizationWindow is the duration to wait after remediation before assessment.
Set by the Remediation Orchestrator. The EM uses this to delay assessment
until the system stabilizes post-remediation.
hashComputeDelay Duration HashComputeDelay is the duration to defer post-remediation spec hash computation
after EA creation. Set by the RO for async-managed targets (GitOps, operator
CRDs) where spec changes propagate after the WorkflowExecution completes.
The EM computes the deferral deadline as: creation + HashComputeDelay.
Nil means compute immediately (sync workflows, backward compatible).
alertCheckDelay Duration AlertCheckDelay is an additional duration to defer alert resolution checks
beyond the StabilizationWindow. Set by the RO for proactive (predictive) alerts
where the underlying Prometheus alert (e.g. predict_linear) requires extra time
to resolve after remediation.
The EM computes AlertManagerCheckAfter as:
creation + StabilizationWindow + AlertCheckDelay
Nil means no additional delay (AlertManagerCheckAfter = PrometheusCheckAfter).

EffectivenessAssessment

EffectivenessAssessment is the Schema for the effectivenessassessments API. It is created by the Remediation Orchestrator and watched by the Effectiveness Monitor.

Field Type Description
apiVersion string kubernaut.ai/v1alpha1
kind string EffectivenessAssessment
metadata ObjectMeta Refer to the Kubernetes API documentation for fields of metadata.
spec EffectivenessAssessmentSpec
status EffectivenessAssessmentStatus

EffectivenessAssessmentSpec

EffectivenessAssessmentSpec defines the desired state of an EffectivenessAssessment.

The spec is set by the Remediation Orchestrator at creation time and is immutable. Immutability is enforced by CEL validation (self == oldSelf) to prevent tampering.
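
The CEL rule `self == oldSelf` rejects any update whose new spec differs from the stored one. The sketch below is a rough Python analogue of that comparison. The function name is hypothetical, and in a real cluster the rule is evaluated by the Kubernetes API server, not by client code.

```python
from typing import Any, Mapping


def spec_update_allowed(old_spec: Mapping[str, Any], new_spec: Mapping[str, Any]) -> bool:
    """Analogue of the CEL transition rule `self == oldSelf`.

    Returns True only when the proposed spec is deeply equal to the old one.
    """
    return new_spec == old_spec
```

For example, resubmitting an identical spec passes, while changing `correlationID` or any nested `config` value fails the check.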

Appears in: - EffectivenessAssessment

Field Type Description
correlationID string CorrelationID is the name of the parent RemediationRequest.
Used as the correlation ID for audit events.
remediationRequestPhase string RemediationRequestPhase is the RemediationRequest's OverallPhase at the time
the EA was created. Captured as an immutable spec field so the EM can branch
assessment logic based on the RR outcome (Verifying, Completed, Failed, TimedOut).
Verifying: happy path, where the WFE succeeded and the EA was created while the RR awaits assessment.
Previously stored as the mutable label kubernaut.ai/rr-phase; moved to spec
for immutability and security.
signalTarget TargetResource SignalTarget is the resource that triggered the alert.
Source: RR.Spec.TargetResource (from Gateway alert extraction).
Used by: health assessment, alert resolution, metrics queries.
remediationTarget TargetResource RemediationTarget is the resource the workflow modified.
Source: AA.Status.RootCauseAnalysis.RemediationTarget (from HAPI RCA resolution).
Used by: spec hash computation, drift detection.
config EAConfig Config contains the assessment configuration parameters.
remediationCreatedAt Time RemediationCreatedAt is the creation timestamp of the parent RemediationRequest.
Set by the RO at EA creation time from rr.CreationTimestamp.
Used by the audit manager to compute resolution_time_seconds in the
assessment.completed event (CompletedAt - RemediationCreatedAt).
signalName string SignalName is the original alert/signal name from the parent RemediationRequest.
Set by the RO at EA creation time from rr.Spec.SignalName.
Used by the audit manager to populate the signal_name field in assessment.completed
events (OBS-1: distinct from CorrelationID which is the RR name).
preRemediationSpecHash string PreRemediationSpecHash is the canonical spec hash of the target resource BEFORE
remediation was applied. Copied from rr.Status.PreRemediationSpecHash by the RO
at EA creation time. The EM uses this to compare pre vs post-remediation state
for spec drift detection, eliminating the need to query DataStorage audit events.

EffectivenessAssessmentStatus

EffectivenessAssessmentStatus defines the observed state of an EffectivenessAssessment.

Appears in: - EffectivenessAssessment

Field Type Description
phase string Phase is the current lifecycle phase of the assessment.
validityDeadline Time ValidityDeadline is the absolute time after which the assessment expires.
Computed by the EM controller on first reconciliation as:
EA.creationTimestamp + validityWindow (from EM config).
This follows Kubernetes spec/status convention: the RO sets desired state
(StabilizationWindow in spec), and the EM computes observed/derived state
(ValidityDeadline in status). This prevents misconfiguration where
StabilizationWindow > ValidityDeadline.
prometheusCheckAfter Time PrometheusCheckAfter is the earliest time to query Prometheus for metrics.
Computed by the EM controller on first reconciliation as:
EA.creationTimestamp + StabilizationWindow (from EA spec).
Stored in status to avoid recomputation on every reconcile and for
operator observability of the assessment timeline.
alertManagerCheckAfter Time AlertManagerCheckAfter is the earliest time to check AlertManager for alert resolution.
Computed by the EM controller on first reconciliation as:
EA.creationTimestamp + StabilizationWindow + AlertCheckDelay (if set).
When AlertCheckDelay is nil, equals PrometheusCheckAfter.
Stored in status to avoid recomputation on every reconcile and for
operator observability of the assessment timeline.
components EAComponents Components tracks the completion state of each assessment component.
assessmentReason string AssessmentReason describes why the assessment completed with this outcome.
completedAt Time CompletedAt is the timestamp when the assessment finished.
message string Message provides human-readable details about the current state.
conditions Condition array Conditions represent the latest available observations of the EA's state.
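
The three derived timestamps above are simple sums over the EA creation time. The sketch below reproduces the documented formulas. The `Timeline` class and the function signature are hypothetical, and `validity_window` stands in for the value read from EM configuration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Timeline:
    validity_deadline: datetime
    prometheus_check_after: datetime
    alert_manager_check_after: datetime


def compute_timeline(
    created: datetime,
    stabilization_window: timedelta,
    validity_window: timedelta,
    alert_check_delay: Optional[timedelta] = None,
) -> Timeline:
    """Derive the status timestamps from the EA creation time."""
    # prometheusCheckAfter = creationTimestamp + StabilizationWindow
    prometheus_check_after = created + stabilization_window
    # alertManagerCheckAfter adds AlertCheckDelay when set; otherwise it
    # equals prometheusCheckAfter.
    extra = alert_check_delay if alert_check_delay is not None else timedelta(0)
    return Timeline(
        validity_deadline=created + validity_window,
        prometheus_check_after=prometheus_check_after,
        alert_manager_check_after=prometheus_check_after + extra,
    )
```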

EnrichmentConfig

EnrichmentConfig specifies per-signal enrichment settings. V2.0 PLACEHOLDER: These fields are currently NOT read by the controller. All signals use the global enrichment config from the controller's YAML configuration (enrichment.cacheTtl, enrichment.timeout). Per-signal overrides will be implemented in V2.0.

Appears in: - SignalProcessingSpec

Field Type Description
enableClusterState boolean Enable cluster state enrichment
enableMetrics boolean Enable metrics enrichment
enableHistorical boolean Enable historical enrichment
timeout Duration Timeout for enrichment operations

Environment

Underlying type: string

Environment represents a canonical deployment environment. 4 canonical environments + Unknown fallback.

Appears in: - EnvironmentClassification

Validation: - Enum: [Production Staging Development Test Unknown]

Value Description
Production
Staging
Development
Test
Unknown

EnvironmentClassification

EnvironmentClassification records the classified environment, the source of the classification, and when it was performed.

V2.0: Removed signal-labels source (security vulnerability)

Appears in: - SignalProcessingStatus

Field Type Description
environment Environment
source string Source of classification: namespace-labels, rego-inference, default
classifiedAt Time When classification was performed

ExecutionConfig

ExecutionConfig contains minimal execution settings. ServiceAccountName moved to Spec.ServiceAccountName (engine-agnostic).

Appears in: - WorkflowExecutionSpec

Field Type Description
timeout Duration Timeout for the entire workflow (Tekton PipelineRun timeout)
Default: use global timeout from RemediationRequest or 30m

ExecutionContext

ExecutionContext captures execution and retry data.

Appears in: - NotificationContext

Field Type Description
retryCount string RetryCount is the number of retries attempted.
maxRetries string MaxRetries is the maximum number of retries allowed.
lastExitCode string LastExitCode is the last exit code from the workflow execution.
previousExecution string PreviousExecution is the name of the previous WorkflowExecution.
timeoutPhase string TimeoutPhase is the phase that timed out.
phaseTimeout string PhaseTimeout is the duration string for the phase timeout.

ExecutionStatusSummary

ExecutionStatusSummary captures key execution resource status fields. Lightweight summary for both Tekton PipelineRun and K8s Job backends.

Appears in: - WorkflowExecutionStatus

Field Type Description
status ConditionStatus Status of the execution resource (Unknown, True, False)
reason string Reason from the execution resource (e.g., "Succeeded", "Failed", "Running")
message string Message from the execution resource
completedTasks integer CompletedTasks count
totalTasks integer TotalTasks count (from pipeline spec)

FailureDetails

FailureDetails contains structured failure classification information

Appears in: - WorkflowExecutionStatus

Field Type Description
failedTaskIndex integer FailedTaskIndex is 0-indexed position of failed task in pipeline
failedTaskName string FailedTaskName is the name of the failed Tekton Task
failedStepName string FailedStepName is the name of the failed step within the task (if available)
Tekton tasks can have multiple steps; this identifies the specific step
reason string Reason is a Kubernetes-style reason code
Used for deterministic failure classification by RO
message string Message is human-readable error message (for logging/UI/notifications)
exitCode integer ExitCode from container (if applicable)
Useful for script-based tasks that return specific exit codes
failedAt Time FailedAt is the timestamp when the failure occurred
executionTimeBeforeFailure Duration ExecutionTimeBeforeFailure is how long the workflow ran before failing
naturalLanguageSummary string NaturalLanguageSummary is a human/LLM-readable failure description
Generated by WE controller from structured data above
Used by:
- RO: Included in failure notifications
- Notification: Included in user-facing failure alerts
wasExecutionFailure boolean WasExecutionFailure indicates whether the failure occurred during workflow execution
true = workflow RAN and failed (non-idempotent actions may have occurred)
false = workflow failed BEFORE execution (validation, image pull, quota, etc.)
CRITICAL: Execution failures (true) block ALL future retries for this target
Pre-execution failures (false) get exponential backoff
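
The wasExecutionFailure flag drives two different retry outcomes. A minimal sketch of that branch follows. The enum and function names are hypothetical, and the actual decision is made by the RO.

```python
from enum import Enum


class RetryDecision(Enum):
    BLOCK = "block"      # no further retries for this target
    BACKOFF = "backoff"  # retry later with exponential backoff


def retry_decision(was_execution_failure: bool) -> RetryDecision:
    """Map wasExecutionFailure to the documented retry behavior."""
    if was_execution_failure:
        # The workflow ran, so non-idempotent actions may have occurred.
        return RetryDecision.BLOCK
    # Failure happened before execution (validation, image pull, quota, etc.).
    return RetryDecision.BACKOFF
```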

FailurePhase

Underlying type: string

FailurePhase represents the orchestration phase where a failure occurred. PascalCase for CRD phase values.

Appears in: - RemediationRequestStatus

Validation: - Enum: [Configuration SignalProcessing AIAnalysis Approval WorkflowExecution Blocked]

Value Description
Configuration
SignalProcessing
AIAnalysis
Approval
WorkflowExecution
Blocked

InvestigationSession

InvestigationSession tracks the async HAPI session lifecycle. It covers AA controller session tracking and session regeneration on 404 (HAPI restart).

Appears in: - AIAnalysisStatus

Field Type Description
id string Session ID returned by HAPI on submit (cleared on session loss)
generation integer Generation counter tracking session regenerations (0 = first session, incremented on 404)
lastPolled Time LastPolled timestamp of the last poll attempt
createdAt Time CreatedAt timestamp when the current session was created
pollCount integer PollCount tracks the number of poll attempts for observability
Constant 15s poll interval (configurable 1s–5m)
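
The field descriptions above imply a simple state update on each poll: count the attempt and, on a 404, clear the session ID and bump the generation. The sketch below illustrates that reading. The class is a hypothetical mirror of the status fields, and keeping pollCount across regenerations is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InvestigationSession:
    """Hypothetical, reduced mirror of the InvestigationSession fields."""
    id: Optional[str] = None
    generation: int = 0
    poll_count: int = 0


def record_poll(session: InvestigationSession, http_status: int) -> InvestigationSession:
    """Update session state after one poll attempt against HAPI."""
    session.poll_count += 1
    if http_status == 404:
        # Session lost (for example after a HAPI restart): clear the ID and
        # increment the generation so the next reconcile can resubmit.
        session.id = None
        session.generation += 1
    return session
```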

LineageContext

LineageContext tracks parent resource references for audit correlation.

Appears in: - NotificationContext

Field Type Description
remediationRequest string RemediationRequest is the name of the parent RemediationRequest.
aiAnalysis string AIAnalysis is the name of the parent AIAnalysis.

NotificationContext

NotificationContext provides structured context for a notification, replacing the former unstructured Metadata map[string]string.

Appears in: - NotificationRequestSpec

Field Type Description
lineage LineageContext Lineage tracks parent resource references for audit correlation.
workflow WorkflowContext Workflow captures selected workflow details (approval/completion notifications).
analysis AnalysisContext Analysis captures AI analysis results (approval/completion notifications).
review ReviewContext Review captures manual review context (manual-review notifications).
execution ExecutionContext Execution captures execution and retry context (manual-review WE source, timeout notifications).
dedup DedupContext Dedup captures deduplication context (bulk duplicate notifications).
target TargetContext Target captures target resource context (timeout notifications).
verification VerificationContext Verification captures EA verification results (completion notifications, #318).
Enables routing rules to match on verification outcome (e.g., inconclusive -> escalation).

NotificationPhase

Underlying type: string

Appears in: - NotificationRequestStatus

Validation: - Enum: [Pending Sending Retrying Sent PartiallySent Failed]

Value Description
Pending
Sending
Retrying
Sent
PartiallySent
Failed

NotificationPriority

Underlying type: string

Appears in: - NotificationRequestSpec

Validation: - Enum: [critical high medium low]

Value Description
critical
high
medium
low

NotificationRequest

NotificationRequest is the Schema for the notificationrequests API

Field Type Description
apiVersion string kubernaut.ai/v1alpha1
kind string NotificationRequest
metadata ObjectMeta Refer to the Kubernetes API documentation for fields of metadata.
spec NotificationRequestSpec
status NotificationRequestStatus

NotificationRequestSpec

NotificationRequestSpec defines the desired state of NotificationRequest

Spec Immutability ALL spec fields are immutable after CRD creation. Users cannot update notification content once created. To change a notification, delete and recreate the CRD.

Rationale: Notifications are immutable events, not mutable resources. This prevents race conditions, simplifies controller logic, and provides perfect audit trail.

Cancellation: Delete the NotificationRequest CRD to cancel delivery.

Appears in: - NotificationRequest

Field Type Description
remediationRequestRef ObjectReference Reference to parent RemediationRequest (if applicable)
Used for audit correlation and lineage tracking
Optional: NotificationRequest can be standalone (e.g., system-generated alerts)
type NotificationType Type of notification (escalation, simple, status-update, approval, manual-review, completion)
priority NotificationPriority Priority of notification (critical, high, medium, low)
subject string Subject line for notification
body string Notification body content
severity string Severity from the originating signal (used for routing)
Promoted from mutable label to immutable spec field
phase string Phase that triggered this notification (for phase-timeout notifications)
Promoted from mutable label to immutable spec field
reviewSource ReviewSourceType ReviewSource indicates what triggered manual review (for manual-review notifications)
Promoted from mutable label to immutable spec field
context NotificationContext Context provides typed, structured notification context replacing the
former unstructured Metadata map. Each sub-struct is optional (nil means
not applicable for this notification type).
extensions object (keys:string, values:string) Extensions holds arbitrary key-value pairs for routing and custom data
that don't fit the typed Context schema (e.g., test routing overrides,
vendor-specific tags). Routing rules can match on these keys.
actionLinks ActionLink array Action links to external services
retryPolicy RetryPolicy Retry policy for delivery
retentionDays integer Retention period in days after completion
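
As a rough illustration of the spec above, the sketch below builds a NotificationRequest manifest as a plain dictionary and checks `type` and `priority` against the enums listed in the NotificationType and NotificationPriority sections. The helper name `build_notification_request`, the example arguments, and the choice to populate only four spec fields are assumptions for illustration, not part of the Kubernaut API.

```python
# Enum values copied from the NotificationType and NotificationPriority sections.
NOTIFICATION_TYPES = {
    "escalation", "simple", "status-update",
    "approval", "manual-review", "completion",
}
NOTIFICATION_PRIORITIES = {"critical", "high", "medium", "low"}


def build_notification_request(name, namespace, type_, priority, subject, body):
    """Return a NotificationRequest manifest as a dict (hypothetical helper)."""
    if type_ not in NOTIFICATION_TYPES:
        raise ValueError(f"unknown notification type: {type_!r}")
    if priority not in NOTIFICATION_PRIORITIES:
        raise ValueError(f"unknown notification priority: {priority!r}")
    return {
        "apiVersion": "kubernaut.ai/v1alpha1",
        "kind": "NotificationRequest",
        "metadata": {"name": name, "namespace": namespace},
        # The spec is immutable after creation, so everything must be set here.
        "spec": {
            "type": type_,
            "priority": priority,
            "subject": subject,
            "body": body,
        },
    }
```

Because the spec cannot be edited afterwards, a caller that wants different content would delete the object and create a new one rather than patch it.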

NotificationRequestStatus

NotificationRequestStatus defines the observed state of NotificationRequest

Appears in: - NotificationRequest

Field Type Description
phase NotificationPhase Phase of notification lifecycle (Pending, Sending, Retrying, Sent, PartiallySent, Failed)
conditions Condition array Conditions represent the latest available observations of the notification's state
deliveryAttempts DeliveryAttempt array List of all delivery attempts across all channels
totalAttempts integer Total number of delivery attempts across all channels
successfulDeliveries integer Number of successful deliveries
failedDeliveries integer Number of failed deliveries
queuedAt Time Time when notification was queued for processing
processingStartedAt Time Time when processing started
completionTime Time Time when all deliveries completed (success or failure)
observedGeneration integer Observed generation from spec
reason NotificationStatusReason Reason for current phase
message string Human-readable message about current state

NotificationStatusReason

Underlying type: string

Appears in: - NotificationRequestStatus

Value Description
AllDeliveriesSucceeded
PartialDeliverySuccess
AllDeliveriesFailed
NoChannelsResolved
PartialFailureRetrying
MaxRetriesExhausted

NotificationType

Underlying type: string

Appears in: - NotificationRequestSpec

Validation: - Enum: [escalation simple status-update approval manual-review completion]

Value Description
escalation
simple
status-update
approval NotificationTypeApproval is used for approval request notifications
Added Dec 2025 per RO team request for explicit approval workflow support
manual-review NotificationTypeManualReview is used for manual intervention required notifications
Added Dec 2025 for ExhaustedRetries/PreviousExecutionFailed scenarios requiring operator action
Distinct from 'escalation' to enable spec-field-based routing rules
completion NotificationTypeCompletion is used for successful remediation completion notifications
Created when WorkflowExecution completes successfully and RR transitions to Completed phase
Enables operators to track successful autonomous remediations

ObjectRef

ObjectRef is a lightweight reference to another object in the same namespace

Appears in: - RemediationApprovalRequestSpec

Field Type Description
name string Name of the referenced object

ObjectReference

ObjectReference contains enough information to let you locate the referenced object.

Appears in: - SignalProcessingSpec

Field Type Description
apiVersion string API version of the referent
kind string Kind of the referent
name string Name of the referent
namespace string Namespace of the referent
uid string UID of the referent

PolicyDecision

Underlying type: string

PolicyDecision represents the Rego policy evaluation outcome.

Appears in: - PolicyEvaluation

Validation: - Enum: [Approved ManualReviewRequired Denied DegradedMode]

Value Description
Approved
ManualReviewRequired
Denied
DegradedMode

PolicyEvaluation

PolicyEvaluation contains Rego policy evaluation results

Appears in: - ApprovalContext

Field Type Description
policyName string Policy name that was evaluated
matchedRules string array Rules that matched
decision PolicyDecision Decision from policy evaluation

PostRCAContext

PostRCAContext holds data computed by HAPI after the RCA phase. DetectedLabels are computed at runtime by HAPI's LabelDetector and returned in the HAPI response for storage in the AIAnalysis status. This data is used by Rego policies for approval gating (e.g., stateful workload detection) and is immutable once set.

Appears in: - AIAnalysisStatus

Field Type Description
detectedLabels DetectedLabels DetectedLabels contains cluster characteristics computed by HAPI's
LabelDetector during get_namespaced_resource_context or get_cluster_resource_context tool invocations.
setAt Time SetAt records when the PostRCAContext was populated.
Used as the immutability guard: once SetAt is non-nil, the entire
PostRCAContext becomes immutable via CEL validation.
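
The set-once behaviour described for `setAt` can be sketched as a comparison between the old and new values of the object. The real guard is a CEL validation rule on the CRD; the Python function below only mirrors the stated logic, and the name `validate_post_rca_update` and the dictionary representation are assumptions.

```python
def validate_post_rca_update(old, new):
    """Return True if replacing `old` with `new` respects the set-once rule.

    `old` and `new` are dicts shaped like PostRCAContext, or None when unset.
    """
    if old is None or old.get("setAt") is None:
        # Not yet populated: the first write is allowed.
        return True
    # Already populated: the whole object must stay exactly as it was.
    return new == old
```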

Priority

Underlying type: string

Priority represents an operational priority level.

Appears in: - PriorityAssignment

Validation: - Enum: [P0 P1 P2 P3]

Value Description
P0
P1
P2
P3

PriorityAssignment

PriorityAssignment records the priority assigned to a signal and how it was determined.

Appears in: - SignalProcessingStatus

Field Type Description
priority Priority
source string Source of assignment: rego-policy, severity-fallback, default
policyName string Which Rego rule matched (if applicable)
assignedAt Time When assignment was performed

RecommendedAction

RecommendedAction describes a remediation action with rationale

Appears in: - ApprovalContext

Field Type Description
workflowId string WorkflowId is the catalog workflow identifier for this recommendation
rationale string Rationale explaining why this action is recommended

RecommendedWorkflowSummary

RecommendedWorkflowSummary contains a summary of the recommended workflow

Appears in: - RemediationApprovalRequestSpec

Field Type Description
workflowId string Workflow identifier from catalog
version string Workflow version
executionBundle string Execution bundle OCI reference (digest-pinned)
rationale string Rationale for selecting this workflow

RemediationApprovalRequest

RemediationApprovalRequest is the Schema for the remediationapprovalrequests API.

RemediationApprovalRequest CRD Architecture - Follows Kubernetes CertificateSigningRequest pattern (immutable spec, mutable status) - Owned by RemediationRequest - AIAnalysis controller uses field index on spec.aiAnalysisRef.name for efficient lookup - Timeout expiration handled by dedicated controller

Lifecycle: 1. RO creates when AIAnalysis.status.approvalRequired=true 2. Operator approves/rejects via status.conditions update 3. Dedicated controller detects decision or timeout 4. AIAnalysis controller watches and transitions phase accordingly

Field Type Description
apiVersion string kubernaut.ai/v1alpha1
kind string RemediationApprovalRequest
metadata ObjectMeta Refer to the Kubernetes API documentation for fields of metadata.
spec RemediationApprovalRequestSpec
status RemediationApprovalRequestStatus

RemediationApprovalRequestSpec

RemediationApprovalRequestSpec defines the desired state of RemediationApprovalRequest.

Spec Immutability ALL spec fields are immutable after CRD creation (follows CertificateSigningRequest pattern). This provides a complete audit trail and prevents race conditions.

Appears in: - RemediationApprovalRequest

Field Type Description
remediationRequestRef ObjectReference Reference to parent RemediationRequest CRD (owner)
RemediationRequest owns this CRD via ownerReferences
aiAnalysisRef ObjectRef Reference to the AIAnalysis that requires approval
Used by AIAnalysis controller for efficient field-indexed lookup
confidence float Confidence score from AI analysis (0.0-1.0)
Typically 0.6-0.79 triggers approval (below auto-approve threshold)
confidenceLevel string Confidence level derived from score
reason string Reason why approval is required
recommendedWorkflow RecommendedWorkflowSummary Recommended workflow from AI analysis
investigationSummary string Investigation summary from HolmesGPT
evidenceCollected string array Evidence collected during investigation
recommendedActions ApprovalRecommendedAction array Recommended actions with rationale
alternativesConsidered ApprovalAlternative array Alternative approaches considered
whyApprovalRequired string Detailed explanation of why approval is required
policyEvaluation ApprovalPolicyEvaluation Policy evaluation results if Rego policy triggered approval
requiredBy Time Deadline for approval decision (approval expires after this time)
Calculated by RO using hierarchy: per-request → policy → namespace → default (15m)
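
One way to read the `requiredBy` hierarchy is "the most specific configured timeout wins, falling back to 15 minutes". The sketch below assumes exactly that precedence; the function and parameter names are hypothetical and the actual RO implementation may differ.

```python
from datetime import datetime, timedelta

# Default taken from the field description above.
DEFAULT_APPROVAL_TIMEOUT = timedelta(minutes=15)


def resolve_approval_timeout(per_request=None, policy=None, namespace=None):
    """Pick the first configured timeout in per-request, policy, namespace order."""
    for candidate in (per_request, policy, namespace):
        if candidate is not None:
            return candidate
    return DEFAULT_APPROVAL_TIMEOUT


def compute_required_by(created_at, **overrides):
    """Deadline for the approval decision, relative to the creation time."""
    return created_at + resolve_approval_timeout(**overrides)
```

For example, with a 5-minute policy timeout and a 30-minute namespace timeout and no per-request value, this sketch would use the policy value.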

RemediationApprovalRequestStatus

RemediationApprovalRequestStatus defines the observed state of RemediationApprovalRequest.

Appears in: - RemediationApprovalRequest

Field Type Description
decision ApprovalDecision Decision made by operator or system (timeout)
Empty string indicates pending decision
decidedBy string Who made the decision (username or "system" for timeout)
decidedAt Time When the decision was made
decisionMessage string Optional message from the decision maker
conditions Condition array Conditions represent the latest available observations
Standard condition types:
- "Approved" - Decision is Approved
- "Rejected" - Decision is Rejected
- "Expired" - Decision timed out
createdAt Time Time when the approval request was created
timeRemaining string Time remaining until expiration (human-readable, e.g., "5m30s")
Updated by controller periodically
expired boolean True if the approval request has expired
observedGeneration integer ObservedGeneration is the most recent generation observed
reason string Reason for current state (machine-readable)
message string Human-readable message about current state
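
The `timeRemaining` and `expired` fields can both be derived from `spec.requiredBy` and the current time. The following sketch shows one possible derivation that produces strings in the "5m30s" style quoted above; the function names and the exact formatting rules are assumptions, not the controller's code.

```python
from datetime import datetime


def is_expired(required_by, now):
    """True once the approval deadline has been reached."""
    return now >= required_by


def format_time_remaining(required_by, now):
    """Render the time left before the deadline, e.g. '5m30s'."""
    seconds = int((required_by - now).total_seconds())
    if seconds <= 0:
        return "0s"
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if hours or minutes:
        parts.append(f"{minutes}m")
    parts.append(f"{secs}s")
    return "".join(parts)
```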

RemediationPhase

Underlying type: string

RemediationPhase represents the orchestration phase of a RemediationRequest. These constants are exported for external consumers (e.g., Gateway) to enable type-safe cross-service integration.

Capitalized phase values per Kubernetes API conventions.

Appears in: - RemediationRequestStatus

Validation: - Enum: [Pending Processing Analyzing AwaitingApproval Executing Verifying Blocked Completed Failed TimedOut Skipped Cancelled]

Value Description
Pending PhasePending is the initial state when RemediationRequest is created.
Processing PhaseProcessing indicates SignalProcessing is enriching the signal.
Analyzing PhaseAnalyzing indicates AIAnalysis is determining remediation workflow.
AwaitingApproval PhaseAwaitingApproval indicates human approval is required.
Executing PhaseExecuting indicates WorkflowExecution is running remediation.
Verifying PhaseVerifying indicates remediation succeeded and EffectivenessAssessment is running.
Non-terminal: Gateway deduplicates signals while EA assesses remediation effectiveness.
RO transitions to Completed when EA reaches a terminal state or VerificationDeadline expires.
Blocked PhaseBlocked indicates remediation cannot proceed due to external blocking condition.
This is a NON-terminal phase (Gateway deduplicates, prevents RR flood).
V1.0: Unified blocking for 6 scenarios:
- ConsecutiveFailures: After cooldown → Failed
- ResourceBusy: When resource available → Proceeds to execute
- RecentlyRemediated: After cooldown → Proceeds to execute
- ExponentialBackoff: After backoff window → Retries execution
- DuplicateInProgress: When original completes → Inherits outcome
- UnmanagedResource: Retries until scope label added or RR times out
Completed PhaseCompleted is the terminal success state.
Failed PhaseFailed is the terminal failure state.
TimedOut PhaseTimedOut is the terminal timeout state.
Skipped PhaseSkipped is the terminal state when remediation was not needed.
Cancelled PhaseCancelled is the terminal state when remediation was manually cancelled.
Gateway treats this as terminal (allows new RR creation for retry)
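
A consumer such as a dashboard or a deduplication check often only needs to know whether a phase is terminal. The sketch below encodes the classification stated in the table: Completed, Failed, TimedOut, Skipped, and Cancelled are terminal; Verifying and Blocked are explicitly non-terminal. Treating the remaining in-flight phases as non-terminal is an assumption drawn from their descriptions, and the helper name is hypothetical.

```python
# Phase values copied from the RemediationPhase enum.
ALL_PHASES = (
    "Pending", "Processing", "Analyzing", "AwaitingApproval", "Executing",
    "Verifying", "Blocked", "Completed", "Failed", "TimedOut", "Skipped",
    "Cancelled",
)

# Phases the table describes as terminal states.
TERMINAL_PHASES = frozenset(
    {"Completed", "Failed", "TimedOut", "Skipped", "Cancelled"}
)


def is_terminal(phase):
    """Return True if the given RemediationPhase value is a terminal state."""
    if phase not in ALL_PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    return phase in TERMINAL_PHASES
```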

RemediationRequest

RemediationRequest is the Schema for the remediationrequests API.

Field Type Description
apiVersion string kubernaut.ai/v1alpha1
kind string RemediationRequest
metadata ObjectMeta Refer to the Kubernetes API documentation for fields of metadata.
spec RemediationRequestSpec
status RemediationRequestStatus

RemediationRequestSpec

RemediationRequestSpec defines the desired state of RemediationRequest.

Spec Immutability RemediationRequest represents an immutable event (signal received, remediation required). Once created (by Gateway or external source), spec cannot be modified to ensure: - Audit trail integrity (remediation matches original signal) - No signal metadata tampering during remediation lifecycle - Consistent signal data across all child CRDs (SignalProcessing, AIAnalysis, WorkflowExecution)

Cancellation: Delete the RemediationRequest CRD (Kubernetes-native pattern). Status updates: Controllers update .status fields (not affected by spec immutability).

Note: Individual field immutability (e.g., signalFingerprint) is redundant with full spec immutability, but retained for explicit documentation of critical fields.

Appears in: - RemediationRequest

Field Type Description
signalFingerprint string Core Signal Identification
Unique fingerprint for deduplication (SHA256 of alert/event key fields)
This field is immutable and used for querying all occurrences of the same problem
signalName string Human-readable signal name (e.g., "HighMemoryUsage", "CrashLoopBackOff")
severity string Signal Classification
Severity level (external value from signal provider)
Examples: "Sev1", "P0", "critical", "HIGH", "warning"
SignalProcessing will normalize via Rego policy
signalType string Signal type: "alert" (generic signal type; adapter-specific values are deprecated)
Used for signal-aware remediation strategies
signalSource string Adapter that ingested the signal (e.g., "prometheus-adapter", "k8s-event-adapter")
targetType string Target system type: "kubernetes", "aws", "azure", "gcp", "datadog"
Indicates which infrastructure system the signal targets
targetResource ResourceIdentifier TargetResource identifies the Kubernetes resource that triggered this signal.
Populated by Gateway from NormalizedSignal.Resource - REQUIRED.
Used by SignalProcessing for context enrichment and RO for workflow routing.
For Kubernetes signals, this contains Kind, Name, Namespace of the affected resource.
firingTime Time Temporal Data
When the signal first started firing (from upstream source)
receivedTime Time When Gateway received the signal
signalLabels object (keys:string, values:string) Signal labels and annotations extracted from provider-specific data
These are populated by Gateway Service after parsing providerData
signalAnnotations object (keys:string, values:string)
providerData string Provider-specific fields in raw JSON format
Gateway adapter populates this based on signal source
Controllers parse this based on targetType/signalType
For Kubernetes (targetType="kubernetes"):
{"namespace": "...", "resource": {"kind": "...", "name": "..."}, "alertmanagerURL": "...", ...}
For AWS (targetType="aws"):
{"region": "...", "accountId": "...", "instanceId": "...", "resourceType": "...", ...}
For Datadog (targetType="datadog"):
{"monitorId": 123, "host": "...", "tags": [...], "metricQuery": "...", ...}
originalPayload string Complete original webhook payload for debugging and audit
stored as string to avoid base64 encoding in CEL validation
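
To make the idea of `signalFingerprint` concrete, here is a minimal sketch of a SHA256 fingerprint over a signal's key fields. Which fields Gateway actually hashes, and how it joins them, is not specified in this reference; the field selection, the `|` separator, and the function name below are all assumptions.

```python
import hashlib


def signal_fingerprint(signal_name, namespace, kind, name):
    """Hash a hypothetical set of key fields into a stable hex fingerprint."""
    # Joining with a separator keeps ("ab", "c") distinct from ("a", "bc").
    key = "|".join([signal_name, namespace, kind, name])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()
```

Two signals with the same key fields yield the same fingerprint, which is what allows all occurrences of one problem to be queried together.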

RemediationRequestStatus

RemediationRequestStatus defines the observed state of RemediationRequest.

Appears in: - RemediationRequest

Field Type Description
deduplication DeduplicationStatus Deduplication tracks signal occurrence for this remediation.
OWNER: Gateway Service (exclusive write access)
observedGeneration integer ObservedGeneration is the most recent generation observed by the controller.
Used to prevent duplicate reconciliations and ensure idempotency.
Standard pattern for all Kubernetes controllers.
overallPhase RemediationPhase Phase tracking for orchestration.
Uses typed RemediationPhase constants for type safety and cross-service integration.
Capitalized phase values per Kubernetes API conventions.
message string Human-readable message describing current status
startTime Time Timestamps
completedAt Time
processingStartTime Time ProcessingStartTime is when SignalProcessing phase started.
Used for per-phase timeout detection (default: 5 minutes).
analyzingStartTime Time AnalyzingStartTime is when AIAnalysis phase started.
Used for per-phase timeout detection (default: 10 minutes).
executingStartTime Time ExecutingStartTime is when WorkflowExecution phase started.
Used for per-phase timeout detection (default: 30 minutes).
verificationDeadline Time VerificationDeadline is the deadline for the Verifying phase.
Computed by RO as EA.Status.ValidityDeadline + 30s buffer.
If exceeded, RR transitions to Completed with Outcome "VerificationTimedOut".
signalProcessingRef ObjectReference References to downstream CRDs
remediationProcessingRef ObjectReference
aiAnalysisRef ObjectReference
workflowExecutionRef ObjectReference
notificationRequestRefs ObjectReference array NotificationRequestRefs tracks all notification CRDs created for this remediation.
Provides audit trail for compliance and instant visibility for debugging.
effectivenessAssessmentRef ObjectReference EffectivenessAssessmentRef tracks the EffectivenessAssessment CRD created for this remediation.
Set by the RO after creating the EA CRD on terminal phase transitions.
preRemediationSpecHash string PreRemediationSpecHash is the canonical spec hash of the target resource captured
by the RO BEFORE launching the remediation workflow. This enables the EM to compare
pre vs post-remediation state without querying DataStorage audit events.
Set once by the RO during the transition to WorkflowExecution phase; immutable after.
approvalNotificationSent boolean Approval notification tracking
Prevents duplicate notifications when AIAnalysis requires approval
skipReason SkipReason SkipReason indicates why this remediation was skipped.
Only set when OverallPhase = Skipped or Failed.
skipMessage string SkipMessage provides human-readable details about why remediation was skipped
Examples:
- "Same workflow executed recently. Cooldown: 3m15s remaining"
- "Another workflow is running on target: wfe-abc123"
- "Backoff active. Next allowed: 2025-12-15T10:30:00Z"
Only set when OverallPhase = "Skipped" or "Failed"
blockingWorkflowExecution string BlockingWorkflowExecution references the WorkflowExecution causing the block
Set for block reasons: ResourceBusy, RecentlyRemediated, ExponentialBackoff
Nil for: ConsecutiveFailures, DuplicateInProgress
Enables operators to investigate the blocking WFE for troubleshooting
duplicateOf string DuplicateOf references the parent RemediationRequest that this is a duplicate of
V1.0: Set when OverallPhase = "Blocked" with BlockReason = "DuplicateInProgress"
Old behavior: Set when OverallPhase = "Skipped" due to resource lock deduplication
duplicateCount integer DuplicateCount tracks the number of duplicate remediations that were skipped
because this RR's workflow was already executing (resource lock)
Only populated on parent RRs that have duplicates
duplicateRefs string array DuplicateRefs lists the names of RemediationRequests that were skipped
because they targeted the same resource as this RR
Only populated on parent RRs that have duplicates
blockReason BlockReason BlockReason indicates why this remediation is blocked (non-terminal)
Valid values:
- "ConsecutiveFailures": Max consecutive failures reached, in cooldown
- "ResourceBusy": Another workflow is using the target resource
- "RecentlyRemediated": Target recently remediated, cooldown active
- "ExponentialBackoff": Pre-execution failures, backoff window active
- "DuplicateInProgress": Duplicate of an active remediation
Only set when OverallPhase = "Blocked"
blockMessage string BlockMessage provides human-readable details about why remediation is blocked
Examples:
- "Another workflow is running on target deployment/my-app: wfe-abc123"
- "Recently remediated. Cooldown: 3m15s remaining"
- "Backoff active. Next retry: 2025-12-15T10:30:00Z"
- "Duplicate of active remediation rr-original-abc123"
- "3 consecutive failures. Cooldown expires: 2025-12-15T11:00:00Z"
Only set when OverallPhase = "Blocked"
blockedUntil Time BlockedUntil indicates when blocking expires (time-based blocks)
Set for: ConsecutiveFailures, RecentlyRemediated, ExponentialBackoff
Nil for: ResourceBusy, DuplicateInProgress (event-based, cleared when condition resolves)
After this time passes, RR will retry or transition to Failed (for ConsecutiveFailures)
nextAllowedExecution Time NextAllowedExecution indicates when this RR can be retried after exponential backoff.
Set when RR fails due to pre-execution failures (infrastructure, validation, etc.).
Implements progressive delay: 1m, 2m, 4m, 8m, capped at 10m.
Formula: min(Base × 2^(failures-1), Max)
Nil means no exponential backoff is active.
consecutiveFailureCount integer ConsecutiveFailureCount tracks how many times this fingerprint has failed consecutively.
Updated by RO when RR transitions to Failed phase.
Reset to 0 when RR completes successfully.
failurePhase FailurePhase FailurePhase indicates which orchestration phase failed.
Only set when OverallPhase = Failed.
failureReason string FailureReason provides a human-readable reason for the failure
Only set when OverallPhase = "Failed"
requiresManualReview boolean RequiresManualReview indicates that this remediation cannot proceed automatically
and requires operator intervention. Set when:
- WE skip reason is "ExhaustedRetries" (5+ consecutive pre-execution failures)
- WE skip reason is "PreviousExecutionFailed" (execution failure, cluster state unknown)
- AIAnalysis WorkflowResolutionFailed with LowConfidence or WorkflowNotFound
outcome string Outcome indicates the remediation result when completed.
Values:
- "Remediated": Workflow executed successfully
- "NoActionRequired": AIAnalysis determined no action needed (problem self-resolved)
- "ManualReviewRequired": Requires operator intervention
- "VerificationTimedOut": EA assessment did not complete within deadline
timeoutPhase RemediationPhase TimeoutPhase indicates which orchestration phase timed out.
Only set when OverallPhase = TimedOut.
timeoutTime Time TimeoutTime records when the timeout occurred
Only set when OverallPhase = "TimedOut"
retentionExpiryTime Time RetentionExpiryTime indicates when this CRD should be cleaned up (24 hours after completion)
notificationStatus string NotificationStatus tracks the delivery status of notification(s) for this remediation.
Values: "Pending", "InProgress", "Sent", "Failed", "Cancelled"
Status Mapping from NotificationRequest.Status.Phase:
- NotificationRequest Pending → "Pending"
- NotificationRequest Sending → "InProgress"
- NotificationRequest Sent → "Sent"
- NotificationRequest Failed → "Failed"
- NotificationRequest deleted by user → "Cancelled"
For bulk notifications, this reflects the status of the consolidated notification.
conditions Condition array Conditions represent observations of RemediationRequest state.
Standard condition types:
- "NotificationDelivered": True if notification sent successfully, False if cancelled/failed
- Reason "DeliverySucceeded": Notification sent
- Reason "UserCancelled": User deleted NotificationRequest before delivery
- Reason "DeliveryFailed": NotificationRequest failed to deliver
Conditions follow Kubernetes API conventions (KEP-1623).
timeoutConfig TimeoutConfig TimeoutConfig provides operational timeout overrides for this remediation.
OWNER: Remediation Orchestrator (sets defaults on first reconcile)
MUTABLE BY: Operators (can adjust mid-remediation via kubectl edit)
lastModifiedBy string LastModifiedBy tracks the last operator who modified this RR's status.
Populated by RemediationRequest mutating webhook.
lastModifiedAt Time LastModifiedAt tracks when the last status modification occurred.
Populated by RemediationRequest mutating webhook.
currentProcessingRef ObjectReference CurrentProcessingRef references the current SignalProcessing CRD
selectedWorkflowRef WorkflowReference SelectedWorkflowRef captures the workflow selected by AI for this remediation.
Populated from workflowexecution.selection.completed audit event.
executionRef ObjectReference ExecutionRef references the WorkflowExecution CRD for this remediation.
Populated from workflowexecution.execution.started audit event.
remediationTarget ResourceIdentifier RemediationTarget identifies the Kubernetes resource the LLM determined should be
remediated. Populated from AIAnalysis.Status.RootCauseAnalysis.AffectedResource.
May differ from Spec.TargetResource (e.g., Deployment vs Pod).
targetDisplay string TargetDisplay is the Kubernetes-idiomatic Kind/Name of the RCA target
(e.g., "Deployment/web-frontend"). Populated when RemediationTarget is set.
confidence string Confidence is the AI analysis confidence score as a display string
(e.g., "0.97"). Populated from AIAnalysis.SelectedWorkflow.Confidence.
workflowDisplayName string WorkflowDisplayName is the human-readable workflow identifier
(e.g., "GitRevertCommit:git-revert-v2"). Populated from AIAnalysis.SelectedWorkflow.
signalTargetDisplay string SignalTargetDisplay is the Kubernetes-idiomatic Kind/Name of the signal target
(e.g., "Pod/web-frontend-cdbdbc4f8-6kn6j"). Populated from Spec.TargetResource.
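
The backoff formula given for `nextAllowedExecution`, min(Base × 2^(failures-1), Max), can be checked with a few lines of code. The sketch below assumes Base = 1 minute and Max = 10 minutes, which are the values implied by the "1m, 2m, 4m, 8m, capped at 10m" progression; the function name is hypothetical.

```python
from datetime import timedelta

BASE_SECONDS = 60   # Base = 1m, implied by the documented progression
MAX_SECONDS = 600   # Max = 10m cap


def backoff_delay(failures, base_seconds=BASE_SECONDS, max_seconds=MAX_SECONDS):
    """Delay before the next retry: min(Base * 2^(failures-1), Max)."""
    if failures < 1:
        raise ValueError("failures must be at least 1")
    # Integer arithmetic avoids overflow in timedelta for large failure counts.
    return timedelta(seconds=min(base_seconds * 2 ** (failures - 1), max_seconds))
```

With these assumed constants, the first four failures give delays of 1, 2, 4, and 8 minutes, and every later failure is held at the 10-minute cap.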

RemediationTarget

RemediationTarget identifies the Kubernetes resource identified by the LLM as the actual target for remediation. This may differ from the signal's source resource (e.g., the signal comes from a Pod, but the Deployment should be patched).

Appears in: - RootCauseAnalysis

Field Type Description
kind string Kind is the Kubernetes resource kind (e.g., "Deployment", "StatefulSet", "DaemonSet")
name string Name is the resource name
namespace string Namespace is the resource namespace. Empty for cluster-scoped resources (e.g., Node, PersistentVolume).

RemediationWorkflow

RemediationWorkflow is the Schema for the remediationworkflows API. Kubernetes-native workflow schema definition.

Field Type Description
apiVersion string kubernaut.ai/v1alpha1
kind string RemediationWorkflow
metadata ObjectMeta Refer to the Kubernetes API documentation for fields of metadata.
spec RemediationWorkflowSpec
status RemediationWorkflowStatus

RemediationWorkflowDependencies

RemediationWorkflowDependencies declares infrastructure resources

Appears in: - RemediationWorkflowSpec

Field Type Description
secrets RemediationWorkflowResourceDependency array
configMaps RemediationWorkflowResourceDependency array

RemediationWorkflowDescription

RemediationWorkflowDescription provides structured information about a workflow

Appears in: - RemediationWorkflowSpec

Field Type Description
what string What describes what this workflow concretely does
whenToUse string WhenToUse describes conditions under which this workflow is appropriate
whenNotToUse string WhenNotToUse describes specific exclusion conditions
preconditions string Preconditions describes conditions that must be verified through investigation

RemediationWorkflowExecution

RemediationWorkflowExecution contains execution engine configuration

Appears in: - RemediationWorkflowSpec

Field Type Description
engine string Engine is the execution engine type
bundle string Bundle is the execution bundle or container image reference
bundleDigest string BundleDigest is the digest of the execution bundle
engineConfig JSON EngineConfig holds engine-specific configuration
serviceAccountName string ServiceAccountName is the pre-existing ServiceAccount for the execution
resource (Job, PipelineRun, or Ansible TokenRequest).
Operators pre-create SAs with appropriate RBAC in the
execution namespace. If absent, K8s assigns the namespace's default SA
(Job/Tekton) or the Ansible executor uses the controller's in-cluster
credentials (#500 fallback).

RemediationWorkflowLabels

RemediationWorkflowLabels contains mandatory matching/filtering criteria

Appears in: - RemediationWorkflowSpec

Field Type Description
severity string array Severity is the severity level(s)
environment string array Environment is the target environment(s)
component string Component is the Kubernetes resource type
priority string Priority is the business priority level

RemediationWorkflowMaintainer

RemediationWorkflowMaintainer contains maintainer contact information

Appears in: - RemediationWorkflowSpec

Field Type Description
name string
email string

RemediationWorkflowParameter

RemediationWorkflowParameter defines a workflow input parameter

Appears in: - RemediationWorkflowSpec

Field Type Description
name string
type string
required boolean
description string
enum string array
pattern string
minimum float
maximum float
default JSON
dependsOn string array

RemediationWorkflowResourceDependency

RemediationWorkflowResourceDependency identifies a Kubernetes resource by name

Appears in: - RemediationWorkflowDependencies

Field Type Description
name string

RemediationWorkflowSpec

RemediationWorkflowSpec defines the desired state of RemediationWorkflow. Maps to the spec content of a workflow-schema.yaml file. Workflow name is derived from the CRD's metadata.name (not duplicated in spec).

Appears in: - RemediationWorkflow

Field Type Description
version string Version is the semantic version (e.g., "1.0.0")
description RemediationWorkflowDescription Description is a structured description for LLM and operator consumption
actionType string ActionType is the action type from the taxonomy (PascalCase).
labels RemediationWorkflowLabels Labels contains mandatory matching/filtering criteria for discovery
customLabels object (keys:string, values:string) CustomLabels contains operator-defined key-value labels for additional filtering
detectedLabels JSON DetectedLabels contains author-declared infrastructure requirements
execution RemediationWorkflowExecution Execution contains execution engine configuration
dependencies RemediationWorkflowDependencies Dependencies declares infrastructure resources required by the workflow
maintainers RemediationWorkflowMaintainer array Maintainers is optional maintainer information
parameters RemediationWorkflowParameter array Parameters defines the workflow input parameters
rollbackParameters RemediationWorkflowParameter array RollbackParameters defines parameters needed for rollback

RemediationWorkflowStatus

RemediationWorkflowStatus defines the observed state of RemediationWorkflow

Appears in: - RemediationWorkflow

Field Type Description
workflowId string WorkflowID is the UUID assigned by Data Storage upon registration
catalogStatus CatalogStatus CatalogStatus reflects the DS catalog lifecycle state.
registeredBy string RegisteredBy is the identity of the registrant
registeredAt Time RegisteredAt is the timestamp of initial registration
previouslyExisted boolean PreviouslyExisted indicates if this workflow was re-registered after deletion

ResourceIdentifier

ResourceIdentifier identifies the target resource for remediation.

Appears in: - SignalData

Field Type Description
kind string Resource kind (e.g., "Pod", "Deployment", "StatefulSet")
name string Resource name
namespace string Resource namespace. Empty for cluster-scoped resources (e.g., Node, PersistentVolume).

RetryPolicy

RetryPolicy defines retry behavior for notification delivery

Appears in: - NotificationRequestSpec

Field Type Description
maxAttempts integer Maximum number of delivery attempts
initialBackoffSeconds integer Initial backoff duration in seconds
backoffMultiplier integer Backoff multiplier (exponential backoff)
maxBackoffSeconds integer Maximum backoff duration in seconds
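
To make the interaction of the four fields concrete, here is a minimal sketch of an exponential backoff schedule. It assumes the delay before each retry is the initial backoff multiplied by the multiplier once per previous retry, capped at the maximum; the `backoff_schedule` function and the sample values are hypothetical and not taken from the controller.

```python
def backoff_schedule(
    max_attempts: int,
    initial_backoff_seconds: int,
    backoff_multiplier: int,
    max_backoff_seconds: int,
) -> list[int]:
    """Return the wait (in seconds) before each retry (hypothetical helper)."""
    delays: list[int] = []
    delay = initial_backoff_seconds
    # One wait separates consecutive attempts, so there are max_attempts - 1 waits.
    for _ in range(max(max_attempts - 1, 0)):
        delays.append(min(delay, max_backoff_seconds))
        delay *= backoff_multiplier
    return delays


# Sample values chosen for illustration only.
print(backoff_schedule(5, 30, 2, 120))
```

With these sample values the waits grow as 30, 60, 120 and then stay at the 120-second cap.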

ReviewContext

ReviewContext captures manual review details.

Appears in: - NotificationContext

Field Type Description
reason string Reason is the high-level failure reason (e.g., "WorkflowResolutionFailed").
subReason string SubReason provides granular detail (e.g., "WorkflowNotFound").
humanReviewReason string HumanReviewReason from HAPI when needs_human_review=true.
rootCauseAnalysis string RootCauseAnalysis from AIAnalysis if available.

ReviewSourceType

Underlying type: string

Appears in: - NotificationRequestSpec

Validation: - Enum: [AIAnalysis WorkflowExecution]

Value Description
AIAnalysis
WorkflowExecution

RootCauseAnalysis

RootCauseAnalysis contains detailed RCA results

Appears in: - AIAnalysisStatus

Field Type Description
summary string Brief summary of root cause
severity string Severity determined by RCA
Aligned with HAPI/workflow catalog (critical, high, medium, low, unknown)
signalType string Signal type determined by RCA (may differ from input)
contributingFactors string array Contributing factors
remediationTarget RemediationTarget RemediationTarget identifies the actual resource the LLM determined should be remediated.
The LLM may identify a higher-level resource (e.g., Deployment) rather than
the Pod that generated the signal. The WFE creator should prefer this over the RR's
TargetResource when available to ensure the correct resource is patched.

SelectedWorkflow

SelectedWorkflow contains the AI-selected workflow for execution. It is the output format that RO uses to create a WorkflowExecution.

Appears in: - AIAnalysisStatus

Field Type Description
workflowId string Workflow identifier (catalog lookup key)
actionType string Action type from taxonomy (e.g., ScaleReplicas, RestartPod).
Propagated from HAPI three-step discovery protocol to RO audit events.
version string Workflow version
executionBundle string Execution bundle OCI reference (digest-pinned) - resolved by HolmesGPT-API
executionBundleDigest string Execution bundle digest for audit trail
confidence float Confidence score (0.0-1.0)
parameters object (keys:string, values:string) Workflow parameters (UPPER_SNAKE_CASE keys)
rationale string Rationale explaining why this workflow was selected
executionEngine string ExecutionEngine specifies the backend engine for workflow execution.
Populated from HolmesGPT-API workflow recommendation.
When empty, defaults to "tekton" for backwards compatibility.
engineConfig JSON EngineConfig holds engine-specific configuration.
For ansible: {"playbookPath": "...", "jobTemplateName": "...", "inventoryName": "..."}.
serviceAccountName string ServiceAccountName is the pre-existing ServiceAccount for the execution
resource (Job, PipelineRun, or Ansible TokenRequest).
Operators pre-create SAs with appropriate RBAC in the
execution namespace. If absent, K8s assigns the namespace's default SA
(Job/Tekton) or the Ansible executor uses the controller's in-cluster
credentials (#500 fallback).
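
The field rules above (confidence between 0.0 and 1.0, UPPER_SNAKE_CASE parameter keys, and the "tekton" default for an empty executionEngine) can be sketched as a small consumer-side check. The `check_selected_workflow` helper and the exact regular expression used for UPPER_SNAKE_CASE are assumptions for illustration, not part of the Kubernaut API.

```python
import re

# Assumed definition of UPPER_SNAKE_CASE: uppercase words joined by underscores.
UPPER_SNAKE_CASE = re.compile(r"[A-Z][A-Z0-9]*(_[A-Z0-9]+)*")


def check_selected_workflow(wf: dict) -> tuple[str, list[str]]:
    """Return (effective engine, problems) for a SelectedWorkflow-like dict."""
    problems: list[str] = []

    # An empty or missing executionEngine defaults to "tekton".
    engine = wf.get("executionEngine") or "tekton"

    confidence = wf.get("confidence")
    if confidence is None or not 0.0 <= confidence <= 1.0:
        problems.append(f"confidence {confidence!r} is outside 0.0-1.0")

    for key in wf.get("parameters", {}):
        if not UPPER_SNAKE_CASE.fullmatch(key):
            problems.append(f"parameter key {key!r} is not UPPER_SNAKE_CASE")

    return engine, problems
```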

SignalContextInput

SignalContextInput contains enriched signal context from SignalProcessing. Structured types replace the map[string]string anti-pattern.

Appears in: - AnalysisRequest

Field Type Description
fingerprint string Signal fingerprint for correlation
severity string Signal severity: critical, high, medium, low, unknown (normalized by SignalProcessing Rego)
signalName string Signal name (e.g., OOMKilled, CrashLoopBackOff)
Normalized by SignalProcessing: proactive names mapped to base names
signalMode string SignalMode indicates whether this is a reactive or proactive signal.
Proactive Signal Mode Prompt Strategy
Copied from SignalProcessing status by RemediationOrchestrator.
Used by HAPI to switch investigation prompt (RCA vs. predict & prevent).
environment string Environment classification
Examples: "production", "staging", "development", "qa-eu", "canary"
businessPriority string Business priority
Best practice examples: P0 (critical), P1 (high), P2 (normal), P3 (low)
targetResource TargetResource Target resource identification
enrichmentResults EnrichmentResults Complete enrichment results from SignalProcessing

SignalData

SignalData contains all signal information copied from RemediationRequest. This makes SignalProcessing self-contained for processing.

Appears in: - SignalProcessingSpec

Field Type Description
fingerprint string Unique fingerprint for deduplication (SHA256 of signal key fields)
name string Human-readable signal name (e.g., "HighMemoryUsage", "CrashLoopBackOff")
severity string Severity level (external/raw value from monitoring system)
No enum restriction - allows external severity schemes (Sev1-4, P0-P4, etc.)
Normalized severity is stored in Status.Severity
type string Signal type: "alert" (generic signal type; adapter-specific values like "prometheus-alert" or "kubernetes-event" are deprecated)
source string Adapter that ingested the signal
targetType string Target system type.
V2.0 PLACEHOLDER: Currently only "kubernetes" is supported by the enricher.
Non-kubernetes values are accepted by validation but enrichment will run in degraded mode.
targetResource ResourceIdentifier Target resource identification
labels object (keys:string, values:string) Signal labels extracted from provider-specific data
annotations object (keys:string, values:string) Signal annotations extracted from provider-specific data
firingTime Time When the signal first started firing
receivedTime Time When Gateway received the signal
providerData string Provider-specific fields in raw JSON format

SignalProcessing

SignalProcessing is the Schema for the signalprocessings API.

Field Type Description
apiVersion string kubernaut.ai/v1alpha1
kind string SignalProcessing
metadata ObjectMeta Refer to the Kubernetes API documentation for fields of metadata.
spec SignalProcessingSpec
status SignalProcessingStatus

SignalProcessingPhase

Underlying type: string

SignalProcessingPhase represents the current phase of SignalProcessing reconciliation. Phases form a state machine, and phase values are capitalized per Kubernetes API conventions.

Appears in: - SignalProcessingStatus

Validation: - Enum: [Pending Enriching Classifying Categorizing Completed Failed]

Value Description
Pending PhasePending is the initial state when SignalProcessing is created.
Enriching PhaseEnriching is when K8s context enrichment is in progress.
Classifying PhaseClassifying is when environment/priority classification is in progress.
Categorizing PhaseCategorizing is when business categorization is in progress.
Completed PhaseCompleted is the terminal success state.
Failed PhaseFailed is the terminal error state.
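
The phase values can be read as a simple state machine. The sketch below assumes that phases advance strictly in the order listed above and that any non-terminal phase may move to Failed; this reference does not spell out the allowed transitions, so both rules, along with the `is_valid_transition` helper, are assumptions.

```python
# Assumed happy-path ordering, taken from the order of the enum values.
PHASE_ORDER = ["Pending", "Enriching", "Classifying", "Categorizing", "Completed"]
TERMINAL_PHASES = {"Completed", "Failed"}


def is_valid_transition(current: str, nxt: str) -> bool:
    """Check one phase transition under the assumed rules (hypothetical helper)."""
    # Terminal and unknown phases never transition.
    if current in TERMINAL_PHASES or current not in PHASE_ORDER:
        return False
    # Any non-terminal phase may fail.
    if nxt == "Failed":
        return True
    if nxt not in PHASE_ORDER:
        return False
    # Otherwise only the next phase in order is allowed.
    return PHASE_ORDER.index(nxt) == PHASE_ORDER.index(current) + 1
```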

SignalProcessingSpec

SignalProcessingSpec defines the desired state of SignalProcessing.

Spec Immutability SignalProcessing represents an immutable event (signal enrichment). Once created by RemediationOrchestrator, spec cannot be modified to ensure: - Audit trail integrity (processed signal matches original signal) - No signal data tampering during enrichment - Consistent context passed to AIAnalysis

To reprocess a signal, delete and recreate the SignalProcessing CRD.

Appears in: - SignalProcessing

Field Type Description
remediationRequestRef ObjectReference Reference to parent RemediationRequest
signal SignalData Signal data (copied from RemediationRequest for processing)
enrichmentConfig EnrichmentConfig Configuration for processing

SignalProcessingStatus

SignalProcessingStatus defines the observed state of SignalProcessing.

Appears in: - SignalProcessing

Field Type Description
observedGeneration integer ObservedGeneration is the most recent generation observed by the controller.
Used to prevent duplicate reconciliations and ensure idempotency.
Standard pattern for all Kubernetes controllers.
phase SignalProcessingPhase Phase: Pending, Enriching, Classifying, Categorizing, Completed, Failed
startTime Time Processing timestamps
completionTime Time
kubernetesContext KubernetesContext Enrichment results
environmentClassification EnvironmentClassification Categorization results
priorityAssignment PriorityAssignment
businessClassification BusinessClassification
severity string Severity determination
Normalized severity determined by Rego policy: "critical", "high", "medium", "low", or "unknown"
Aligned with HAPI/workflow catalog severity levels for consistency across platform
Enables downstream services (AIAnalysis, RemediationOrchestrator, Notification)
to interpret alert urgency without understanding external severity schemes.
policyHash string PolicyHash is the SHA256 hash of the Rego policy used for severity determination
Provides audit trail and policy version tracking for compliance requirements
Expected format: 64-character hexadecimal string (SHA256 hash)
signalMode string SignalMode indicates whether this is a reactive or proactive signal.
Proactive Signal Mode Classification
Proactive Signal Mode Classification and Prompt Strategy
Set during the Classifying phase alongside severity, environment, and priority.
All signals MUST be classified — "reactive" is the default for unmapped types.
signalName string SignalName is the normalized signal name after proactive-to-base mapping.
Signal Name Normalization
For proactive signals (e.g., "PredictedOOMKill"), this is the base name (e.g., "OOMKilled").
For reactive signals, this matches Spec.Signal.Name unchanged.
This is the AUTHORITATIVE signal name for all downstream consumers (RO, AA, HAPI).
sourceSignalName string SourceSignalName preserves the pre-normalization signal name for audit trail.
Audit trail preservation (SOC2 CC7.4)
Only populated for proactive signals (e.g., "PredictedOOMKill").
Empty for reactive signals.
conditions Condition array Conditions for detailed status
error string Error information
consecutiveFailures integer ConsecutiveFailures tracks the number of consecutive transient failures.
Used with shared backoff for exponential retry delays.
Reset to 0 on successful phase transition.
lastFailureTime Time LastFailureTime records when the last failure occurred.
Used to determine if enough time has passed for retry.
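
The relationship between signalMode, signalName, and sourceSignalName can be sketched as follows. Only the PredictedOOMKill to OOMKilled pair comes from this reference; the lookup-table approach and the `normalize_signal` helper are assumptions for illustration.

```python
# Proactive-to-base mapping. Only this pair appears in the reference;
# a real mapping would contain more entries.
PROACTIVE_TO_BASE = {"PredictedOOMKill": "OOMKilled"}


def normalize_signal(spec_signal_name: str) -> dict:
    """Derive the three status fields from Spec.Signal.Name (hypothetical helper)."""
    base = PROACTIVE_TO_BASE.get(spec_signal_name)
    if base is not None:
        return {
            "signalMode": "proactive",
            "signalName": base,                    # authoritative, normalized name
            "sourceSignalName": spec_signal_name,  # preserved for the audit trail
        }
    # Unmapped types default to reactive and keep their name unchanged.
    return {
        "signalMode": "reactive",
        "signalName": spec_signal_name,
        "sourceSignalName": "",
    }
```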

SkipReason

Underlying type: string

SkipReason represents the reason why a RemediationRequest was skipped.

Appears in: - RemediationRequestStatus

Validation: - Enum: [RecentlyRemediated ResourceBusy ExhaustedRetries PreviousExecutionFailed]

Value Description
RecentlyRemediated
ResourceBusy
ExhaustedRetries
PreviousExecutionFailed

TargetContext

TargetContext captures target resource context.

Appears in: - NotificationContext

Field Type Description
targetResource string TargetResource in "Kind/Name" format.

TargetResource

TargetResource identifies a Kubernetes resource by kind, name, and namespace.

Appears in: - EffectivenessAssessmentSpec

Field Type Description
kind string Kind is the Kubernetes resource kind (e.g., "Deployment", "StatefulSet").
name string Name is the resource name.
namespace string Namespace is the resource namespace.
Empty for cluster-scoped resources (e.g., Node, PersistentVolume).

TimeoutConfig

TimeoutConfig provides fine-grained timeout configuration for remediations. Supports both global workflow timeout and per-phase timeouts for granular control.

Appears in: - RemediationRequestStatus

Field Type Description
global Duration Global timeout for entire remediation workflow.
Overrides controller-level default (1 hour).
processing Duration Processing phase timeout (SignalProcessing enrichment).
Overrides controller-level default (5 minutes).
analyzing Duration Analyzing phase timeout (AIAnalysis investigation).
Overrides controller-level default (10 minutes).
executing Duration Executing phase timeout (WorkflowExecution remediation).
Overrides controller-level default (30 minutes).
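
A minimal sketch of how per-request overrides might be merged with the controller-level defaults listed above. The default durations come from the table; the `effective_timeouts` helper and the rule that an unset or zero field falls back to the default are assumptions.

```python
from datetime import timedelta
from typing import Optional

# Controller-level defaults as documented in the table above.
DEFAULT_TIMEOUTS = {
    "global": timedelta(hours=1),
    "processing": timedelta(minutes=5),
    "analyzing": timedelta(minutes=10),
    "executing": timedelta(minutes=30),
}


def effective_timeouts(timeout_config: Optional[dict] = None) -> dict:
    """Merge overrides with defaults (hypothetical helper)."""
    overrides = timeout_config or {}
    # A missing, None, or zero-length override falls back to the default.
    return {
        phase: overrides.get(phase) or default
        for phase, default in DEFAULT_TIMEOUTS.items()
    }
```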

ValidationAttempt

ValidationAttempt contains details of a single HAPI validation attempt. HAPI retries up to 3 times with LLM self-correction. Each attempt feeds validation errors back to the LLM for correction.

Appears in: - AIAnalysisStatus

Field Type Description
attempt integer Attempt number (1, 2, or 3)
workflowId string WorkflowID that the LLM tried in this attempt
isValid boolean Whether validation passed (always false for failed attempts in history)
errors string array Validation errors encountered
timestamp Time When this attempt occurred
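
The retry-with-feedback behavior described above can be sketched as a loop that records one history entry per failed attempt. The `run_validation` function and its `propose` and `validate` callbacks are hypothetical stand-ins for the LLM call and the HAPI validator; only the three-attempt limit and the field names come from this reference.

```python
from datetime import datetime, timezone
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # attempt numbers are 1, 2, or 3


def run_validation(
    propose: Callable[[list[str]], str],
    validate: Callable[[str], list[str]],
) -> tuple[Optional[str], list[dict]]:
    """Return (accepted workflow ID or None, history of failed attempts)."""
    history: list[dict] = []
    feedback: list[str] = []
    for attempt in range(1, MAX_ATTEMPTS + 1):
        # The previous attempt's errors are fed back to the proposer.
        workflow_id = propose(feedback)
        errors = validate(workflow_id)
        if not errors:
            return workflow_id, history
        history.append({
            "attempt": attempt,
            "workflowId": workflow_id,
            "isValid": False,  # only failed attempts are recorded
            "errors": errors,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
        feedback = errors
    return None, history
```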

VerificationContext

VerificationContext captures EA verification results for completion notifications. Enables programmatic routing (e.g., inconclusive outcomes -> escalation channel).

Appears in: - NotificationContext

Field Type Description
assessed boolean Assessed indicates whether verification was performed at all.
outcome string Outcome is the high-level result: "passed", "completed", "partial", "inconclusive", "unavailable".
"completed" indicates all components were assessed but some scores < 1.0.
reason string Reason maps to EffectivenessAssessment.Status.AssessmentReason.
summary string Summary is the operator-facing human-readable message.
degraded boolean Degraded indicates that the EA was unable to reliably compare pre- and
post-remediation state because hash capture failed.
Routing rules can match on this to escalate degraded notifications.
degradedReason string DegradedReason describes why the EA is degraded (e.g., RBAC Forbidden for
the target CRD). Empty when Degraded is false.
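
As a sketch of the programmatic routing these fields enable, the function below sends inconclusive or degraded results to an escalation channel. The channel names and the `pick_channel` helper are hypothetical; actual routing rules are configured elsewhere and are not described in this reference.

```python
def pick_channel(verification: dict) -> str:
    """Choose a notification channel from a VerificationContext-like dict."""
    # Nothing to route on if verification was never performed.
    if not verification.get("assessed", False):
        return "default"
    # Degraded assessments could not reliably compare pre/post state.
    if verification.get("degraded", False):
        return "escalation"
    # Inconclusive outcomes are escalated, as in the example above.
    if verification.get("outcome") == "inconclusive":
        return "escalation"
    return "default"
```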

WorkflowContext

WorkflowContext captures selected workflow details.

Appears in: - NotificationContext

Field Type Description
selectedWorkflow string SelectedWorkflow is the ID of the workflow selected by AI.
confidence string Confidence is the AI confidence score (as string, e.g. "0.95").
workflowId string WorkflowID is the ID of the executed workflow.
executionEngine string ExecutionEngine is the engine used to execute the workflow.

WorkflowExecution

WorkflowExecution is the Schema for the workflowexecutions API

Field Type Description
apiVersion string kubernaut.ai/v1alpha1
kind string WorkflowExecution
metadata ObjectMeta Refer to the Kubernetes API documentation for fields of metadata.
spec WorkflowExecutionSpec
status WorkflowExecutionStatus

WorkflowExecutionSpec

WorkflowExecutionSpec defines the desired state of WorkflowExecution

Spec Immutability WorkflowExecution represents an immutable event (workflow execution attempt). Once created by RemediationOrchestrator, spec cannot be modified to ensure: - Audit trail integrity (executed spec matches approved spec) - No parameter tampering after HAPI validation - No target resource changes after routing decisions

To change execution parameters, delete and recreate the WorkflowExecution.

Appears in: - WorkflowExecution

Field Type Description
remediationRequestRef ObjectReference RemediationRequestRef references the parent RemediationRequest CRD
workflowRef WorkflowRef WorkflowRef contains the workflow catalog reference
Resolved from AIAnalysis.Status.SelectedWorkflow by RemediationOrchestrator
targetResource string TargetResource identifies the K8s resource being remediated
Used for resource locking - prevents parallel workflows on same target
Format: "namespace/kind/name" for namespaced resources
"kind/name" for cluster-scoped resources
Example: "payment/deployment/payment-api", "node/worker-node-1"
parameters object (keys:string, values:string) Parameters from LLM selection
Keys are UPPER_SNAKE_CASE for Tekton PipelineRun params
confidence float Confidence score from LLM (for audit trail)
rationale string Rationale from LLM (for audit trail)
serviceAccountName string ServiceAccountName is the pre-existing ServiceAccount for the execution
resource (Job, PipelineRun, or Ansible TokenRequest).
Operators pre-create SAs with appropriate RBAC in the execution namespace.
If absent, K8s assigns the namespace's default SA (Job/Tekton) or the
Ansible executor falls back to the controller's in-cluster credentials.
executionConfig ExecutionConfig ExecutionConfig contains minimal execution settings
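
The targetResource string format can be illustrated with a pair of helpers that build and parse the lock key. Lowercasing the kind follows the documented examples ("payment/deployment/payment-api", "node/worker-node-1") but is otherwise an assumption, as are the `format_target` and `parse_target` names.

```python
def format_target(kind: str, name: str, namespace: str = "") -> str:
    """Build "namespace/kind/name", or "kind/name" for cluster-scoped resources."""
    parts = [kind.lower(), name]  # lowercase kind, matching the documented examples
    if namespace:
        parts.insert(0, namespace)
    return "/".join(parts)


def parse_target(target: str) -> dict:
    """Split a targetResource string back into its components."""
    parts = target.split("/")
    if len(parts) == 3:
        namespace, kind, name = parts
    elif len(parts) == 2:
        namespace = ""  # cluster-scoped resource
        kind, name = parts
    else:
        raise ValueError(f"unexpected targetResource format: {target!r}")
    return {"namespace": namespace, "kind": kind, "name": name}
```

Because the string doubles as a resource lock key, two workflows that resolve to the same formatted value would target the same resource.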

WorkflowExecutionStatus

WorkflowExecutionStatus defines the observed state

Appears in: - WorkflowExecution

Field Type Description
observedGeneration integer ObservedGeneration is the most recent generation observed by the controller.
Used to prevent duplicate reconciliations and ensure idempotency.
Standard pattern for all Kubernetes controllers.
phase string Phase tracks current execution stage
V1.0: Skipped phase removed - RO makes routing decisions before WFE creation
startTime Time StartTime when execution started
completionTime Time CompletionTime when execution completed (success or failure)
duration Duration Duration of the execution
executionRef LocalObjectReference ExecutionRef references the created execution resource (PipelineRun or Job)
executionStatus ExecutionStatusSummary ExecutionStatus mirrors key execution resource status fields
failureReason string FailureReason explains why execution failed (if applicable)
DEPRECATED: Use FailureDetails for structured failure information
failureDetails FailureDetails FailureDetails contains structured failure information
Populated when Phase=Failed
blockClearance BlockClearanceDetails BlockClearance tracks the clearing of PreviousExecutionFailed blocks
When set, allows new executions despite previous execution failure
Preserves audit trail of WHO cleared the block and WHY
ephemeralCredentialIDs integer array EphemeralCredentialIDs stores AWX credential IDs created by the ansible
executor for cleanup after execution. Written via the status
subresource to avoid violating spec immutability.
executionEngine string ExecutionEngine is the backend engine resolved from the DS workflow catalog
at runtime by the WE controller. Set once during Pending phase via
WorkflowQuerier.GetWorkflowExecutionEngine; immutable thereafter.
Values: "tekton", "job", "ansible".
conditions Condition array Conditions provide detailed status information

WorkflowRef

WorkflowRef contains catalog-resolved workflow reference

Appears in: - WorkflowExecutionSpec

Field Type Description
workflowId string WorkflowID is the catalog lookup key
version string Version of the workflow
executionBundle string ExecutionBundle resolved from workflow catalog (Data Storage API)
OCI bundle reference for Tekton PipelineRun
executionBundleDigest string ExecutionBundleDigest for audit trail and reproducibility
engineConfig JSON EngineConfig holds engine-specific configuration.
For ansible: {"playbookPath": "...", "jobTemplateName": "...", "inventoryName": "..."}
For tekton/job: nil.

WorkflowReference

WorkflowReference captures workflow catalog information for audit trail. Used in RemediationRequestStatus.SelectedWorkflowRef.

Appears in: - RemediationRequestStatus

Field Type Description
workflowId string WorkflowID is the catalog lookup key
version string Version of the workflow
executionBundle string ExecutionBundle resolved from workflow catalog
OCI bundle reference for Tekton PipelineRun
executionBundleDigest string ExecutionBundleDigest for audit trail and reproducibility