L1: Prompt Governance

L1 monitors and governs the prompt layer -- the entry point where human instructions meet AI models. This is your first line of defense against prompt injection, data leakage, and policy violations before they reach the model.

Capabilities

Prompt Injection Detection

L1 scans inbound prompts for injection patterns -- attempts to override system instructions, extract hidden prompts, or manipulate model behavior through crafted input.

Detected: Prompt injection attempt
Pattern:  OVERRIDE_SYSTEM_INSTRUCTION
Severity: HIGH
Verdict:  BLOCKED
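A minimal sketch of how a pattern-based scan could produce a verdict like the one above. The patterns and the verdict shape here are illustrative assumptions; L1's actual detection rules combine more than simple regexes.

```python
import re

# Hypothetical patterns -- L1's real rule set is not shown in these docs.
INJECTION_PATTERNS = {
    "OVERRIDE_SYSTEM_INSTRUCTION": re.compile(
        r"ignore (all )?previous instructions|you are now", re.IGNORECASE
    ),
}

def scan_prompt(prompt: str) -> dict:
    """Return a verdict for the first injection pattern that matches."""
    for name, pattern in INJECTION_PATTERNS.items():
        if pattern.search(prompt):
            return {"pattern": name, "severity": "HIGH", "verdict": "BLOCKED"}
    return {"verdict": "APPROVED"}
```

Pure regex matching is easy to evade; production detection typically layers semantic (model-based) classifiers on top of patterns like these.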

PII Exposure Detection

Scans prompts for personally identifiable information (PII) before they are sent to any AI model. Catches social security numbers, credit card numbers, email addresses, phone numbers, and other sensitive data patterns.

Data Leaves Your Control at the Prompt

Once PII enters a prompt, it may be logged by the model provider, used in training, or stored in ways you cannot control. L1 catches it before it leaves.
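The scan described above can be sketched as a set of category regexes applied before the prompt is forwarded. These patterns are simplified assumptions; real PII detection also uses checksum validation (e.g. Luhn for card numbers) and ML-based entity recognition.

```python
import re

# Illustrative patterns only -- not L1's actual detection rules.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def find_pii(prompt: str) -> list[str]:
    """Return the PII categories detected in a prompt."""
    return [name for name, pat in PII_PATTERNS.items() if pat.search(prompt)]
```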

Jailbreak Attempt Detection

Identifies known and emerging jailbreak patterns -- attempts to bypass model safety guardrails through role-playing, encoding tricks, or adversarial prompt construction.

Prompt Template Compliance

Enforces that prompts conform to approved templates. This ensures consistency across teams and prevents ad-hoc prompts that bypass organizational standards.

```json
{
  "template_id": "customer-support-v2",
  "required_fields": ["context", "question"],
  "forbidden_patterns": ["ignore previous instructions"],
  "max_length": 4096
}
```
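A compliance check against a template like the one above might look as follows. This is a sketch under stated assumptions: the field semantics are taken from the example config, and `max_length` is checked in characters here as a simplification (the console measures it in tokens).

```python
def check_template(prompt_fields: dict, template: dict) -> list[str]:
    """Return a list of violations of the template (empty means compliant)."""
    violations = []
    # Every required field must be present in the submitted prompt.
    for field in template["required_fields"]:
        if field not in prompt_fields:
            violations.append(f"missing required field: {field}")
    text = " ".join(str(v) for v in prompt_fields.values())
    # Forbidden patterns are matched case-insensitively as substrings.
    for pattern in template["forbidden_patterns"]:
        if pattern.lower() in text.lower():
            violations.append(f"forbidden pattern: {pattern}")
    # Simplified length check (characters, not tokens).
    if len(text) > template["max_length"]:
        violations.append("prompt exceeds max_length")
    return violations
```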

Prompt Logging and Audit

Every prompt is logged with metadata -- timestamp, user, model target, template used, and governance verdict. These records feed into the unified audit trail for compliance reporting.
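A single audit record carrying the metadata listed above might look like this. The field names are illustrative assumptions, not L1's actual log schema:

```json
{
  "timestamp": "2025-06-01T12:34:56Z",
  "user_id": "usr_abc123",
  "model_target": "gpt-4o",
  "template_id": "customer-support-v2",
  "verdict": "APPROVED"
}
```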

Console Features

The L1 console provides three workflow tabs:

Monitor

View real-time prompt activity across your organization. The dashboard shows:

  • Total prompts processed (24h / 7d / 30d)
  • Prompts flagged by category (injection, PII, jailbreak, template violation)
  • Top users by prompt volume
  • Trend charts for flagged activity

Configure

Set up and manage prompt governance rules:

| Rule Type | Description | Example |
| --- | --- | --- |
| Injection Patterns | Regex or semantic patterns to detect | "ignore all previous", "you are now" |
| PII Categories | Which PII types to scan for | SSN, credit card, email, phone |
| Template Requirements | Enforce approved prompt templates | Require customer-support-v2 for support agents |
| Length Limits | Maximum prompt length by role or use case | 4096 tokens for standard, 8192 for analysts |
| Blocklists | Specific phrases or topics to block | Competitor names, restricted topics |
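A blocklist rule configured in this tab might be expressed like the following. The structure and field names are assumptions for illustration; the console defines the actual rule schema:

```json
{
  "rule_type": "blocklist",
  "phrases": ["restricted topic", "competitor name"],
  "applies_to": ["support-agents"],
  "verdict": "HELD"
}
```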

Act

Enforcement controls and incident response:

  • Enable or disable enforcement per rule
  • Set verdict behavior: APPROVED, HELD (pending review), or BLOCKED
  • Configure alert destinations (email, Slack, webhook)
  • Review and release held prompts

Operating Modes

| Mode | Behavior |
| --- | --- |
| Monitor | All prompts are logged and scanned. Violations are recorded but not enforced. Prompts pass through to the model. |
| Configure | Rules are active and violations are flagged. Prompts that trigger rules are held for review before being sent to the model. |
| Act | Full enforcement. Prompts that violate rules are blocked immediately. Users receive a governance denial message. |

Start with Monitor

Deploy L1 in Monitor mode first to establish a baseline of prompt activity. Use the data to tune rules before enabling enforcement, reducing false positives and user friction.

Integration

L1 integrates into the /govern pipeline. When a prompt is submitted for governance, L1 evaluates it and contributes its verdict to the overall governance decision:

```
POST /govern
{
  "action": "prompt.submit",
  "payload": {
    "prompt": "...",
    "user_id": "usr_abc123",
    "model_target": "gpt-4o",
    "template_id": "customer-support-v2"
  }
}
```

L1 returns its layer-specific verdict, which is combined with other active layers to produce the final governance decision.
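One plausible way to combine per-layer verdicts is "most restrictive wins". The aggregation logic sketched below is an assumption for illustration, not the documented behavior of the /govern pipeline:

```python
# Ordering from least to most restrictive; verdict names come from the
# Act tab (APPROVED, HELD, BLOCKED).
SEVERITY = {"APPROVED": 0, "HELD": 1, "BLOCKED": 2}

def combine_verdicts(layer_verdicts: list[str]) -> str:
    """Combine per-layer verdicts into a final governance decision,
    taking the most restrictive verdict across all active layers."""
    return max(layer_verdicts, key=SEVERITY.__getitem__)
```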