SOC2 Alignment¶
Kubernaut's audit pipeline is designed to align with SOC2 Type II trust service criteria for automated remediation systems. This page maps Kubernaut's audit capabilities to specific SOC2 controls and explains how each control is addressed.
Not a certification claim
This page documents how Kubernaut's audit architecture aims to support SOC2 compliance. It is not a certification claim. Formal SOC2 Type II compliance requires an independent auditor's assessment of the operating organization's controls, processes, and evidence -- which is outside the scope of Kubernaut itself. Items with version annotations (e.g., "planned for v1.2") indicate capabilities not yet available in the current release.
Scope
This mapping covers Kubernaut's internal audit capabilities -- the audit events it generates, how they are stored, and what integrity guarantees it provides. Organizational controls (access policies, HR processes, vendor management) are outside Kubernaut's scope and must be addressed by the operating organization's SOC2 program.
Authoritative Sources¶
The control mapping on this page is derived from the following internal design documents in the kubernaut repository:
| Document | Scope |
|---|---|
| BR-AUDIT-006 | Remediation approval audit trail |
| BR-AUDIT-021-030 | Workflow selection audit trail |
| BR-GATEWAY-058 | Audit event emission requirements |
| BR-WE-013 | Audit-tracked block clearing |
| ADR-034 | Unified audit table design (event sourcing, hash chain, partitioning) |
| ADR-038 | Async buffered audit ingestion (fire-and-forget, DLQ) |
| DD-AUDIT-001 | Audit responsibility pattern |
| DD-AUDIT-002 | Shared audit library design |
| DD-AUDIT-003 | Per-service audit trace requirements (P0/P1 classification) |
| DD-WEBHOOK-003 | Webhook complete audit pattern (operator attribution) |
Control Mapping¶
CC6.8 -- Non-Repudiation¶
Requirement: Actions cannot be denied by the actor after the fact.
How Kubernaut addresses this:
- Every human action (approval, rejection, block clearance, workflow registration, action type management) is captured by the Auth Webhook using
req.UserInfo.Usernamefrom Kubernetes admission control. The operator identity comes from Kubernetes authentication -- it cannot be spoofed or omitted (DD-WEBHOOK-003). - Audit events are immutable and append-only. Once written to the
audit_eventsPostgreSQL table, events cannot be modified or deleted (deletion is deferred to retention expiry, not yet implemented) (ADR-034). - Events form a SHA256 hash chain (
event_hash = SHA256(previous_event_hash + canonical_event_json)), enabling after-the-fact detection of any modification or deletion (ADR-034).
Evidence: Operator Attribution, Hash Chain
CC7.2 -- Monitoring Activities¶
Requirement: The organization monitors system components and detects anomalies.
How Kubernaut addresses this:
- 9 services emit structured audit events covering the full remediation lifecycle, from signal ingestion through effectiveness assessment. Every phase transition, routing decision, approval, execution, and notification is recorded (DD-AUDIT-003).
- The workflow discovery audit trail records every step of the LLM's 3-step selection protocol: which action types were listed, which workflows were scored, which workflow was retrieved, and whether the selection passed validation (BR-AUDIT-021-030).
- All events share a
correlation_id(the RemediationRequest name), enabling complete timeline reconstruction for any remediation. - ~90 Prometheus metric families (46 Kubernaut-specific plus controller-runtime built-ins across 6 controllers) provide real-time operational monitoring of throughput, latency, errors, audit health, and LLM cost.
Evidence: What Gets Audited, Emitting Services, Metrics
CC7.3 -- Immutability and Retention¶
Requirement: Audit records are protected from unauthorized modification and retained for an appropriate period.
How Kubernaut addresses this:
- Audit events are stored in PostgreSQL with a default retention of 2,555 days (7 years), recorded per event in the
retention_daysfield. - The
audit_eventstable is append-only by design (event sourcing pattern, ADR-034). There are no UPDATE or DELETE operations exposed through the DataStorage API. - The hash chain provides cryptographic tamper detection. Any modification to a stored event breaks the chain and is detectable by walking the hash sequence (ADR-034).
- The table is partitioned by month for efficient storage and querying at scale.
Retention enforcement
The retention period is recorded per event, but automatic deletion of expired events is not yet implemented. Events currently accumulate indefinitely -- retention is effectively unlimited. Retention enforcement is tracked in kubernaut#485 (v1.3). Customers are responsible for configuring retention periods based on their jurisdictional and regulatory requirements (e.g., GDPR, HIPAA, SOC2, SEC).
Evidence: Retention, Event Structure
CC7.4 -- Completeness¶
Requirement: Audit records capture all relevant events without gaps.
How Kubernaut addresses this:
- Audit emission is fire-and-forget with retry (ADR-038). If DataStorage is temporarily unavailable, events are retried (3 attempts with quadratic backoff). If retries are exhausted, the batch is sent to a Valkey Dead Letter Queue (DLQ) for later reprocessing.
- Business logic never blocks on audit writes. This means a transient audit failure cannot prevent remediation from proceeding, and the DLQ ensures events are not silently lost.
- Every service uses the same shared audit library (
pkg/audit/) (DD-AUDIT-002), ensuring consistent event structure, batching, and retry behavior across all services. - The
correlation_idlinks all events for a single remediation, making it possible to detect gaps by querying for a remediation and checking whether all expected event types are present.
Evidence: Buffered Audit Store, Dead Letter Queue
CC8.1 -- User Attribution¶
Requirement: Actions are attributable to a specific individual.
How Kubernaut addresses this:
-
The Auth Webhook captures operator identity for every human-initiated action through Kubernetes admission control (DD-WEBHOOK-003):
Action Event Type Approve/reject remediation webhook.remediationapprovalrequest.decidedClear execution block workflowexecution.block.clearedModify timeout webhook.remediationrequest.timeout_modifiedCancel notification webhook.notification.cancelledRegister/delete workflow remediationworkflow.admitted.create,remediationworkflow.admitted.deleteCreate/update/delete action type actiontype.admitted.create,actiontype.admitted.update,actiontype.admitted.delete -
System-initiated actions (controller reconciliation, LLM analysis) are attributed to the service name (
actor_type: service,actor_id: ai-analysis) (DD-AUDIT-001). - Each event records
actor_type(service or human),actor_id, andevent_timestamp.
Evidence: Operator Attribution, Operator Attribution (architecture)
CC4.2 -- Change Tracking¶
Requirement: Changes to system components are tracked.
How Kubernaut addresses this:
- Every CRD phase transition is recorded as an audit event with the previous and current phase. Each emitting service follows the responsibility pattern defined in DD-AUDIT-001, where the controller that owns the phase transition is responsible for emitting the event.
- Workflow catalog changes (create, update, disable, re-enable) are audited by both the Auth Webhook (operator identity per DD-WEBHOOK-003) and DataStorage (catalog state change). This dual-event pattern ensures both the "who" and the "what" are captured independently.
- Action type taxonomy changes follow the same dual-event pattern.
- Block clearance events record the original block reason, the clearing operator, and the justification (minimum 10-character reason required, BR-WE-013).
- Approval decisions record the operator, decision (approve/reject), and rationale in an immutable event (BR-AUDIT-006).
Evidence: Emitting Services, Operator Attribution, What Gets Audited
AU-2 -- Auditable Events¶
Requirement: The organization determines which events are auditable.
How Kubernaut addresses this:
Kubernaut defines auditable events per service in DD-AUDIT-003. Seven services have mandatory (P0) audit requirements; two have recommended (P1). Event types are hierarchically named (service.category.action) and documented in the audit event catalog.
AU-9 -- Audit Protection¶
Requirement: Audit records are protected from unauthorized access, modification, and deletion.
How Kubernaut addresses this:
- The DataStorage REST API does not expose UPDATE or DELETE endpoints for audit events.
- PostgreSQL access is restricted to the DataStorage service's database credentials.
- The hash chain provides a cryptographic integrity guarantee that is independent of database access controls (ADR-034).
- Legal hold support (
legal_hold,legal_hold_reason,legal_hold_set_at,legal_hold_set_by) is available at the event level to prevent retention-based deletion of specific events.
CLI verification tools (v1.1)
Client-side hash chain verification and digital signature verification CLI tools are planned for v1.2. Server-side integrity verification is available through the DataStorage API.
AC-2 -- Account Management¶
Requirement: Actions are linked to authenticated user accounts.
How Kubernaut addresses this:
- All human actions flow through Kubernetes admission webhooks, which extract the authenticated identity from
req.UserInfo.Username(DD-WEBHOOK-003). - All signal ingestion requests are authenticated via Kubernetes TokenReview + SubjectAccessReview (BR-GATEWAY-058). The Gateway rejects unauthenticated signals.
- Audit events record the authenticated identity, linking every action to a specific Kubernetes ServiceAccount or user.
Evidence: Signal Source Authentication, Operator Attribution
Reconstruction and Evidence Production¶
For audits and compliance reviews, Kubernaut can reconstruct a complete RemediationRequest from audit data -- even after the Kubernetes CRD has been deleted:
This produces the full remediation record: what signal triggered it, what the LLM investigated, what workflow was selected, who approved it, what the execution did, and whether it was effective. See Data Lifecycle: Reconstruction for details.
Summary¶
| SOC2 Control | Kubernaut Capability | Key Sources |
|---|---|---|
| CC6.8 Non-Repudiation | Immutable events, SHA256 hash chain, webhook-captured operator identity | ADR-034, DD-WEBHOOK-003 |
| CC7.2 Monitoring | 9 services, full lifecycle audit, workflow discovery trail, Prometheus metrics | DD-AUDIT-003, BR-AUDIT-021-030 |
| CC7.3 Immutability/Retention | Append-only table, 7-year retention, hash chain integrity | ADR-034 |
| CC7.4 Completeness | Fire-and-forget with retry, Valkey DLQ, shared audit library | ADR-038, DD-AUDIT-002 |
| CC8.1 Attribution | Auth Webhook captures identity for all human actions | DD-WEBHOOK-003 |
| CC4.2 Change Tracking | Phase transitions, catalog changes, block clearance, approval decisions | DD-AUDIT-001, BR-WE-013, BR-AUDIT-006 |
| AU-2 Auditable Events | Defined per service (7 mandatory, 2 recommended) | DD-AUDIT-003 |
| AU-9 Audit Protection | No UPDATE/DELETE API, hash chain, legal hold support | ADR-034 |
| AC-2 Account Management | Kubernetes TokenReview + admission webhook identity | DD-WEBHOOK-003, BR-GATEWAY-058 |
Next Steps¶
- Audit & Observability -- What gets audited and how to query it
- Architecture: Audit Pipeline -- Technical deep-dive into the audit system
- Data Lifecycle -- CRD retention and reconstruction from audit data
- Configuration Reference -- Signal source authentication setup