
Sentinel Engine

Sentinel is the core governance brain of TheWARDN. It is the evaluation engine that examines every AI action against a combination of hardcoded principles, configurable policies, tier classifications, and contextual signals to produce a binding verdict.

Sentinel cannot be disabled. It cannot be bypassed. If Sentinel fails to evaluate an action, that action is BLOCKED -- never silently approved. This fail-closed guarantee is the foundation of the entire governance architecture.

The 21 Sentinel Governance Principles (SGP)

The SGP are hardcoded rules that exist above all configurable policies. No CHAM policy, tenant configuration, or administrative action can override an SGP. They represent the absolute boundaries of what any AI agent is permitted to do under TheWARDN governance.

The principles are organized into three books:

Book I: Operational Governance (SGP 1-7)

| SGP | Principle | Effect |
| --- | --- | --- |
| SGP-1 | Verdict Before Action | No action may execute before receiving a governance verdict. Silence is never consent. |
| SGP-2 | Audit Completeness | Every governed action must produce a complete, hash-chained audit record. No exceptions. |
| SGP-3 | Fail-Closed Evaluation | If the evaluation pipeline encounters any error, the verdict is BLOCKED. |
| SGP-4 | Deterministic Verdicts | Same inputs, same policies, same state must produce the same verdict. |
| SGP-5 | Non-Retroactive Approval | An action that was BLOCKED cannot be retroactively changed to CLEARED. A new action must be submitted. |
| SGP-6 | Escrow Integrity | Escrow records cannot be modified after creation. Release or kill only. |
| SGP-7 | Hash Chain Continuity | The audit hash chain must be continuous. A gap in the chain is a governance failure. |
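
SGP-2 and SGP-7 together describe a hash-chained audit log. The following is a minimal sketch of that idea, not TheWARDN's actual record format: the SHA-256 choice, the `GENESIS` placeholder, and the `seal_record` / `chain_is_continuous` names are all assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


def seal_record(prev_hash: str, payload: dict) -> dict:
    """Return an audit record whose hash covers the payload and the previous hash."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
    return {"prev_hash": prev_hash, "payload": payload, "hash": digest}


def chain_is_continuous(records: list) -> bool:
    """Check that every record links to its predecessor and its hash recomputes."""
    prev = GENESIS
    for rec in records:
        if rec["prev_hash"] != prev:
            return False  # gap or reordering in the chain
        if seal_record(prev, rec["payload"])["hash"] != rec["hash"]:
            return False  # payload was altered after sealing
        prev = rec["hash"]
    return True


# Build a three-record chain (hypothetical payloads).
chain = []
prev = GENESIS
for verdict in ("CLEARED", "HELD", "BLOCKED"):
    rec = seal_record(prev, {"action": "restart_service", "verdict": verdict})
    chain.append(rec)
    prev = rec["hash"]
```

Under these assumptions, removing a record from the middle or editing a sealed payload makes `chain_is_continuous` return `False`, which is the kind of gap SGP-7 treats as a governance failure.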

Book II: Human Protection (SGP 8-14)

| SGP | Principle | Effect |
| --- | --- | --- |
| SGP-8 | Human Override Authority | A human can always override a HELD verdict (release or kill). Agents cannot override escrow. |
| SGP-9 | Escalation Path | Every BLOCKED action must include a clear reason and path for human review. |
| SGP-10 | Data Sovereignty | Actions that move, copy, or expose personal data require explicit governance with elevated confidence thresholds. |
| SGP-11 | Reversibility Preference | Irreversible actions require higher tier classification than reversible ones. |
| SGP-12 | Proportional Response | The severity of the governance response must be proportional to the risk of the action. |
| SGP-13 | Transparency of Reasoning | Every verdict must include human-readable reasoning. Black-box verdicts are prohibited. |
| SGP-14 | No Silent Failure | If an agent fails to receive a verdict, it must not proceed. Timeout = BLOCKED. |
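
On the agent side, SGP-14 means a missing verdict must be handled exactly like a BLOCKED one. A hedged sketch of that behaviour follows; `await_verdict` and its `request_verdict` callable are hypothetical names, and the thread-pool timeout is only one possible way to enforce the deadline.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

KNOWN_VERDICTS = {"CLEARED", "HELD", "BLOCKED"}


def await_verdict(request_verdict, timeout_s: float) -> str:
    """Call request_verdict(); treat timeouts, errors, and junk replies as BLOCKED."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(request_verdict)
    try:
        verdict = future.result(timeout=timeout_s)
    except FutureTimeout:
        return "BLOCKED"  # no verdict in time: the agent must not proceed
    except Exception:
        return "BLOCKED"  # the request itself failed
    finally:
        pool.shutdown(wait=False)  # do not wait for a slow call to finish
    return verdict if verdict in KNOWN_VERDICTS else "BLOCKED"
```

The wrapper never returns anything other than a known verdict string, so a caller that only proceeds on `"CLEARED"` cannot be tricked by silence.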

Book III: Identity & Integrity (SGP 15-21)

| SGP | Principle | Effect |
| --- | --- | --- |
| SGP-15 | Identity Verification | Every action must be traceable to a registered, verified agent identity. |
| SGP-16 | Immutable Identity | An agent's core identity (ID, registration timestamp, creator) cannot be modified after creation. |
| SGP-17 | Prohibited Action Classes | Certain action classes (data destruction in production, credential exfiltration, governance circumvention) are permanently prohibited regardless of tier or policy. |
| SGP-18 | Governance Self-Protection | No agent action may modify, disable, or degrade the governance system itself. |
| SGP-19 | Confidence Honesty | Agents must not inflate confidence scores. Detected inflation triggers immediate BLOCKED + investigation. |
| SGP-20 | Cross-Agent Isolation | One agent's governance context must not leak to or influence another agent's evaluation. |
| SGP-21 | X_ABSOLUTE | Permanent prohibition on any reasoning that attempts to remove, weaken, or circumvent governance. Semantic scan, immediate termination, autonomy reset to L0. |

WARNING

SGP-21 (X_ABSOLUTE) is the most critical principle. Any action whose reasoning contains patterns that suggest governance removal -- even if the action itself is benign -- triggers an immediate BLOCKED verdict, a Tier X violation, and an autonomy reset. This principle exists to prevent the class of attacks where an AI agent reasons its way out of its own guardrails.

Supremacy Ordering

When SGP principles conflict, the ordering is:

Book II (Human Protection) > Book III (Identity) > Book I (Operational)

Human safety always takes precedence over identity integrity, which always takes precedence over operational governance rules.
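
The ordering can be expressed as a small lookup. This is an illustrative sketch only: `governing_principle` is a hypothetical helper, and breaking ties within a book by the lower SGP number is an assumption the text above does not state.

```python
# Lower rank wins: Book II > Book III > Book I.
BOOK_PRECEDENCE = {"II": 0, "III": 1, "I": 2}


def book_of(sgp: int) -> str:
    """Map an SGP number (1-21) to its book."""
    if 1 <= sgp <= 7:
        return "I"
    if 8 <= sgp <= 14:
        return "II"
    if 15 <= sgp <= 21:
        return "III"
    raise ValueError(f"unknown SGP number: {sgp}")


def governing_principle(conflicting: list) -> int:
    """Pick the principle that prevails among conflicting SGP numbers."""
    # Assumption: within the same book, the lower-numbered principle prevails.
    return min(conflicting, key=lambda n: (BOOK_PRECEDENCE[book_of(n)], n))
```

For example, if SGP-4 (Book I), SGP-16 (Book III), and SGP-11 (Book II) conflict, the sketch selects SGP-11.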

Evaluation Process

When an action enters Sentinel for evaluation, the following checks execute in order:

1. SGP Scan

The action is scanned against all 21 SGP principles. This is a hard-fail check -- if any SGP is violated, the action is immediately BLOCKED with Tier X and the evaluation terminates. No further checks run.

Action --> SGP-1 check --> SGP-2 check --> ... --> SGP-21 check
              |                |                      |
           (pass)           (pass)                 (pass)
              |                |                      |
           (fail)           (fail)                 (fail)
              |                |                      |
              v                v                      v
         BLOCKED X         BLOCKED X              BLOCKED X
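
The scan can be sketched as a loop that stops at the first violation. The check functions below are hypothetical stand-ins (the real SGP checks are not documented here), and treating a crashing check as a violation is an assumption drawn from SGP-3.

```python
from typing import Callable, Dict, Optional

Check = Callable[[dict], bool]  # returns True when the action passes


def sgp_scan(action: dict, checks: Dict[str, Check]) -> Optional[str]:
    """Return the ID of the first violated principle, or None if all pass."""
    for sgp_id, passes in checks.items():
        try:
            ok = passes(action)
        except Exception:
            ok = False  # a check that errors counts as a failure (fail-closed)
        if not ok:
            return sgp_id  # hard fail: stop scanning, caller emits BLOCKED / Tier X
    return None


# Two illustrative checks; the real scan would cover all 21 principles.
checks = {
    "SGP-15": lambda a: bool(a.get("agent_id")),
    "SGP-17": lambda a: a.get("action_type") != "credential_exfiltration",
}
```

Because dictionaries preserve insertion order, the checks run in the order they are registered, which keeps the result deterministic for identical inputs.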

2. Tier Resolution

If the action passes the SGP scan, Sentinel resolves its tier:

  1. Check for agent-level tier override -- if the agent has a forced tier, use it.
  2. Check for action-type tier mapping -- look up the configured tier for this action type.
  3. Apply environment escalation -- production environments may escalate the tier (e.g., A in staging becomes B in production).
  4. Apply CHAM policy overrides -- policies like environment_restriction can force tier escalation.

The resolved tier determines the verdict path:

| Resolved Tier | Verdict |
| --- | --- |
| A (Autonomous) | CLEARED -- proceed to confidence check |
| B (Supervised) | HELD -- will be placed in escrow after confidence check |
| C (Controlled) | BLOCKED -- policy-level block |
| X (Prohibited) | BLOCKED -- SGP-level block (already caught in step 1) |
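
A rough sketch of the four resolution steps and the tier-to-verdict mapping follows. The function signature and parameter names are hypothetical, and two behaviours are assumptions: a forced tier replaces the action-type lookup but still passes through the later escalations, and an unknown action type resolves to Tier C so that it ends up BLOCKED.

```python
ESCALATE = {"A": "B", "B": "C", "C": "C"}
VERDICT_FOR_TIER = {"A": "CLEARED", "B": "HELD", "C": "BLOCKED", "X": "BLOCKED"}


def resolve_tier(action, tier_map, forced_tier=None,
                 production_escalates=True, policy_min_tier=None):
    """Resolve a tier for an action that has already passed the SGP scan."""
    # Steps 1-2: agent-level override first, then the action-type mapping.
    if forced_tier is not None:
        tier = forced_tier
    elif action.get("action_type") in tier_map:
        tier = tier_map[action["action_type"]]
    else:
        return "C"  # assumption: unknown action types are blocked
    # Step 3: environment escalation (e.g. A in staging becomes B in production).
    if production_escalates and action.get("environment") == "production":
        tier = ESCALATE[tier]
    # Step 4: a CHAM policy may force a minimum tier; it never lowers the tier.
    if policy_min_tier is not None:
        tier = max(tier, policy_min_tier, key="ABC".index)
    return tier


tier_map = {"restart_service": "A", "rotate_credentials": "B"}  # hypothetical mapping
```

With this mapping, `restart_service` resolves to A in staging and B in production, and `VERDICT_FOR_TIER` turns the result into the verdict path shown in the table.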

3. Confidence Evaluation

For actions that resolve to Tier A or B, Sentinel evaluates the agent's reported confidence scores against configured thresholds.

Confidence is measured across three dimensions:

| Dimension | What It Measures |
| --- | --- |
| incident | How confident the agent is in its understanding of the situation. |
| fix | How confident the agent is that its proposed action will resolve the issue. |
| containment | How confident the agent is that the action will not cause collateral damage. |

Each dimension is compared against the configured floor (from CHAM confidence_floor policies or the agent's own confidence_floor setting):

```python
# Pseudocode for confidence evaluation
for dimension in ["incident", "fix", "containment"]:
    if action.confidence[dimension] < policy.confidence_floor[dimension]:
        # Confidence too low -- escalate tier
        if resolved_tier == "A":
            resolved_tier = "B"  # Escalate to supervised
        elif resolved_tier == "B":
            resolved_tier = "C"  # Escalate to blocked
```

TIP

Confidence evaluation can escalate a tier upward (A to B, B to C) but never downward. An action classified as Tier B by its action type can never be demoted to Tier A by high confidence alone.
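
A self-contained version of the same logic might look like the sketch below. `apply_confidence` is a hypothetical name, plain dictionaries stand in for the action and policy objects, and missing scores default to 0.0 as described in the fail-closed table. Like the pseudocode, it escalates once per failing dimension, so two low scores take a Tier A action to Tier C.

```python
DIMENSIONS = ("incident", "fix", "containment")
ESCALATE = {"A": "B", "B": "C"}  # Tier C has nowhere further to go


def apply_confidence(tier: str, confidence: dict, floors: dict) -> str:
    """Escalate the tier one step for each dimension below its floor."""
    for dim in DIMENSIONS:
        score = confidence.get(dim, 0.0)  # missing data is treated as 0.0
        if score < floors[dim]:
            tier = ESCALATE.get(tier, tier)
    return tier  # never lower than the tier passed in


floors = {"incident": 0.8, "fix": 0.8, "containment": 0.9}  # hypothetical floors
```

High scores leave the tier unchanged: there is no branch in the function that moves a tier downward.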

4. Change Window Check

For actions targeting specific environments, Sentinel checks whether the current time falls within an approved change window:

  • If a time_window CHAM policy exists for the action's environment and the current time is outside the window, the tier is escalated.
  • If no time_window policy exists, this check is skipped.
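
The check can be sketched as follows. The shape of a time_window policy is not specified above, so the `(start, end)` tuple of `datetime.time` values, the midnight wrap-around handling, and the one-step escalation are all assumptions.

```python
from datetime import datetime, time

ESCALATE = {"A": "B", "B": "C"}


def in_window(now: datetime, start: time, end: time) -> bool:
    """True if now falls in [start, end), allowing windows that cross midnight."""
    t = now.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end  # e.g. 22:00-02:00


def apply_change_window(tier: str, now: datetime, window=None) -> str:
    """Escalate the tier when a window is configured and now is outside it."""
    if window is None:
        return tier  # no time_window policy: the check is skipped
    start, end = window
    return tier if in_window(now, start, end) else ESCALATE.get(tier, tier)


night_window = (time(22, 0), time(2, 0))  # hypothetical approved change window
```

An action evaluated at 23:30 keeps its tier under `night_window`, while the same action at 03:00 is escalated by one step.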

5. CHAM Policy Evaluation

All loaded CHAM policies that were not already evaluated in the earlier boundary checks (steps 2-4) are evaluated here. This includes:

  • require_reasoning -- if the action has no reasoning field and this policy is active, BLOCKED.
  • compliance_pack -- industry-specific rule sets that add further checks.
  • action_type_block -- explicit blocks on specific action types.
  • Any custom policy logic.

Every policy that fires (whether it changes the verdict or not) is recorded in the policies_fired array of the response.
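
A minimal sketch of this step, covering two of the policy types listed above. The policy dictionaries (`id`, `type`, `action_types`) are a hypothetical shape, and so is the definition of "fires" used here: a policy fires when it applies to the action, whether or not it blocks. Unknown policy types are treated as blocking, in keeping with the fail-closed design.

```python
def evaluate_policies(action: dict, policies: list) -> dict:
    """Evaluate policies in order; record every one that fires."""
    fired, rule_violated = [], None
    for policy in policies:
        if policy["type"] == "require_reasoning":
            applies = True  # applies to every action while active
            blocks = not action.get("reasoning")
        elif policy["type"] == "action_type_block":
            applies = action.get("action_type") in policy["action_types"]
            blocks = applies
        else:
            applies, blocks = True, True  # unknown policy type: fail closed
        if applies:
            fired.append(policy["id"])
        if blocks and rule_violated is None:
            rule_violated = policy["id"]  # first blocking policy is reported
    return {"policies_fired": fired, "rule_violated": rule_violated}


policies = [
    {"id": "pol-reasoning", "type": "require_reasoning"},
    {"id": "pol-no-drop", "type": "action_type_block", "action_types": ["drop_table"]},
]
```

An action with a reasoning field and an allowed action type yields `policies_fired == ["pol-reasoning"]` and no violated rule: the policy fired but did not change the verdict.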

6. Verdict Assembly

Sentinel assembles the final verdict:

```json
{
  "verdict": "CLEARED | HELD | BLOCKED",
  "tier": "A | B | C | X",
  "reasoning": "Human-readable explanation of why this verdict was reached",
  "policies_fired": ["policy_id_1", "policy_id_2"],
  "rule_violated": "SGP-17 | policy_id | null",
  "confidence": {
    "incident": 0.92,
    "fix": 0.87,
    "containment": 0.95
  }
}
```
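
One way to model this structure in code is sketched below. The `Verdict` dataclass and `assemble_verdict` helper are hypothetical; the sketch assumes the verdict string is derived from the final tier and rejects empty reasoning on the basis of SGP-13.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

VERDICT_FOR_TIER = {"A": "CLEARED", "B": "HELD", "C": "BLOCKED", "X": "BLOCKED"}


@dataclass(frozen=True)
class Verdict:
    verdict: str
    tier: str
    reasoning: str
    policies_fired: list
    rule_violated: Optional[str]
    confidence: dict


def assemble_verdict(tier, reasoning, policies_fired, rule_violated, confidence):
    """Build the final verdict; refuse to emit one without reasoning (SGP-13)."""
    if not reasoning or not reasoning.strip():
        raise ValueError("a verdict must include human-readable reasoning")
    return Verdict(VERDICT_FOR_TIER[tier], tier, reasoning,
                   list(policies_fired), rule_violated, dict(confidence))


v = assemble_verdict(
    "B", "Production restart requires human approval.",
    ["policy_id_1"], None,
    {"incident": 0.92, "fix": 0.87, "containment": 0.95},
)
payload = json.dumps(asdict(v))  # JSON in the same shape as the example above
```

A caller would be expected to turn the `ValueError` (or a `KeyError` for an unknown tier) into a BLOCKED response rather than let it escape.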

Fail-Closed Guarantee

The fail-closed property is enforced at every level:

| Failure Scenario | Result |
| --- | --- |
| Sentinel process crashes | BLOCKED |
| Database unreachable | BLOCKED |
| Redis unavailable | BLOCKED |
| CHAM policy load fails | BLOCKED |
| Confidence data missing | BLOCKED (treated as 0.0 confidence) |
| Unknown action type | BLOCKED |
| Unregistered agent | BLOCKED |
| Hash chain verification fails | BLOCKED + governance alert |
| Any unhandled exception | BLOCKED |

WARNING

There is no code path through Sentinel that produces a CLEARED verdict by default. Every CLEARED verdict is the result of explicitly passing every check. This is by design -- the system is deny-by-default.
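
The deny-by-default pattern can be illustrated with a wrapper around an evaluation function. This is a sketch of the general technique, not Sentinel's internals: `fail_closed` and the shape of the returned dictionary are assumptions.

```python
import functools

KNOWN_VERDICTS = {"CLEARED", "HELD", "BLOCKED"}


def _blocked(reason: str) -> dict:
    return {"verdict": "BLOCKED", "reasoning": reason}


def fail_closed(evaluate):
    """Wrap an evaluator so that any error or malformed result becomes BLOCKED."""
    @functools.wraps(evaluate)
    def wrapper(action):
        try:
            result = evaluate(action)
        except Exception as exc:  # any unhandled exception
            return _blocked(f"Evaluation failed: {type(exc).__name__}")
        if not isinstance(result, dict) or result.get("verdict") not in KNOWN_VERDICTS:
            return _blocked("Evaluation returned no valid verdict")
        return result
    return wrapper


@fail_closed
def evaluate(action: dict) -> dict:
    # Hypothetical evaluator: raises KeyError when the action type is missing.
    if action["action_type"] == "restart_service":
        return {"verdict": "CLEARED", "reasoning": "All checks passed."}
    return None  # falls through without a verdict
```

The only way to obtain CLEARED from the wrapped function is for the evaluator to return it explicitly; both the exception path and the fall-through path end in BLOCKED.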

WHO_I_AM Identity Concept

Sentinel carries a cryptographic identity concept called WHO_I_AM. This is an immutable payload that is injected into every evaluation context and verified before each evaluation cycle. WHO_I_AM contains:

  • The Founding Letter (the architect's intent and purpose)
  • The 21 SGP directives
  • The specialist identity of the Sentinel instance

If WHO_I_AM verification fails -- if the payload has been tampered with or is missing -- Sentinel refuses to evaluate and all actions are BLOCKED. This prevents a class of attacks where the governance engine itself is compromised.
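
The verification mechanism is not specified here. As one possible illustration, the sketch below fingerprints a payload with SHA-256 over canonical JSON and compares it with a known-good digest in constant time; the payload fields and function names are hypothetical.

```python
import hashlib
import hmac
import json


def fingerprint(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def who_i_am_is_intact(payload, expected_digest: str) -> bool:
    """False if the payload is missing or does not match the expected digest."""
    if not payload:
        return False  # a missing payload fails verification
    return hmac.compare_digest(fingerprint(payload), expected_digest)


# Hypothetical payload with the three components listed above.
who_i_am = {
    "founding_letter": "...",
    "sgp_directives": [f"SGP-{n}" for n in range(1, 22)],
    "specialist_identity": "sentinel-primary",
}
trusted_digest = fingerprint(who_i_am)  # would be stored out of band
```

In this sketch, an evaluation cycle would call `who_i_am_is_intact` first and return BLOCKED for every action whenever it reports `False`.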

Performance

Sentinel is designed for low-latency governance:

| Metric | Target |
| --- | --- |
| SGP scan | < 5 ms |
| Tier resolution | < 2 ms |
| Confidence evaluation | < 2 ms |
| CHAM policy evaluation | < 10 ms |
| Full pipeline (no escrow) | < 50 ms |

These targets apply to in-memory evaluation with policies pre-loaded. Database-bound operations (audit sealing, escrow creation) add latency but run asynchronously where possible.
