L1: Prompt Governance
L1 monitors and governs the prompt layer -- the entry point where human instructions meet AI models. This is your first line of defense against prompt injection, data leakage, and policy violations before they reach the model.
Capabilities
Prompt Injection Detection
L1 scans inbound prompts for injection patterns -- attempts to override system instructions, extract hidden prompts, or manipulate model behavior through crafted input.
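Conceptually, this scan is a pattern match over the raw prompt. A minimal sketch, assuming a small hard-coded rule set (the pattern names, regexes, and `scan_prompt` helper below are illustrative, not L1's actual detection engine):

```python
import re

# Illustrative injection patterns -- a real deployment would use a
# maintained, versioned rule set, not a hard-coded list like this.
INJECTION_PATTERNS = {
    "OVERRIDE_SYSTEM_INSTRUCTION": re.compile(
        r"ignore (all )?previous instructions", re.IGNORECASE),
    "ROLE_HIJACK": re.compile(r"\byou are now\b", re.IGNORECASE),
    "PROMPT_EXFILTRATION": re.compile(
        r"(reveal|print|repeat) (your )?(system|hidden) prompt", re.IGNORECASE),
}

def scan_prompt(prompt: str) -> dict:
    """Return a verdict dict for the first matching injection pattern."""
    for name, pattern in INJECTION_PATTERNS.items():
        if pattern.search(prompt):
            return {"pattern": name, "severity": "HIGH", "verdict": "BLOCKED"}
    return {"pattern": None, "severity": None, "verdict": "APPROVED"}
```

A flagged prompt produces a verdict like the sample shown next.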
```
Detected: Prompt injection attempt
Pattern: OVERRIDE_SYSTEM_INSTRUCTION
Severity: HIGH
Verdict: BLOCKED
```

PII Exposure Detection
Scans prompts for personally identifiable information (PII) before they are sent to any AI model. Catches social security numbers, credit card numbers, email addresses, phone numbers, and other sensitive data patterns.
Data Leaves Your Control at the Prompt
Once PII enters a prompt, it may be logged by the model provider, used in training, or stored in ways you cannot control. L1 catches it before it leaves.
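The PII categories listed above map naturally to pattern matching as well. A minimal sketch, assuming regex-only detection (production scanners typically add checksum validation, e.g. Luhn for card numbers; the `find_pii` helper is illustrative, not L1's API):

```python
import re

# Illustrative PII patterns for the categories named in this doc.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def find_pii(prompt: str) -> list[str]:
    """Return the PII categories detected in a prompt."""
    return [name for name, pat in PII_PATTERNS.items() if pat.search(prompt)]
```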
Jailbreak Attempt Detection
Identifies known and emerging jailbreak patterns -- attempts to bypass model safety guardrails through role-playing, encoding tricks, or adversarial prompt construction.
Prompt Template Compliance
Enforces that prompts conform to approved templates. This ensures consistency across teams and prevents ad-hoc prompts that bypass organizational standards.
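A sketch of how such a check might work against a template definition with the fields shown in the example below (the `check_template` helper is hypothetical, not L1's API):

```python
def check_template(prompt: str, fields: dict[str, str], template: dict) -> list[str]:
    """Validate a prompt and its fields against a template definition.

    `template` uses the same keys as the customer-support-v2 example:
    required_fields, forbidden_patterns, max_length. Returns a list of
    violations; an empty list means the prompt is compliant.
    """
    violations = []
    for field in template.get("required_fields", []):
        if field not in fields:
            violations.append(f"missing required field: {field}")
    lowered = prompt.lower()
    for pattern in template.get("forbidden_patterns", []):
        if pattern.lower() in lowered:
            violations.append(f"forbidden pattern: {pattern}")
    if len(prompt) > template.get("max_length", float("inf")):
        violations.append("prompt exceeds max_length")
    return violations
```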
```json
{
  "template_id": "customer-support-v2",
  "required_fields": ["context", "question"],
  "forbidden_patterns": ["ignore previous instructions"],
  "max_length": 4096
}
```

Prompt Logging and Audit
Every prompt is logged with metadata -- timestamp, user, model target, template used, and governance verdict. These records feed into the unified audit trail for compliance reporting.
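A logged record might look like the following (the field names are illustrative; the metadata listed above -- timestamp, user, model target, template, and verdict -- is what the record carries):

```json
{
  "timestamp": "2025-01-15T09:32:10Z",
  "user_id": "usr_abc123",
  "model_target": "gpt-4o",
  "template_id": "customer-support-v2",
  "verdict": "APPROVED",
  "flags": []
}
```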
Console Features
The L1 console provides three workflow tabs:
Monitor
View real-time prompt activity across your organization. The dashboard shows:
- Total prompts processed (24h / 7d / 30d)
- Prompts flagged by category (injection, PII, jailbreak, template violation)
- Top users by prompt volume
- Trend charts for flagged activity
Configure
Set up and manage prompt governance rules:
| Rule Type | Description | Example |
|---|---|---|
| Injection Patterns | Regex or semantic patterns to detect | "ignore all previous", "you are now" |
| PII Categories | Which PII types to scan for | SSN, credit card, email, phone |
| Template Requirements | Enforce approved prompt templates | Require customer-support-v2 for support agents |
| Length Limits | Maximum prompt length by role or use case | 4096 tokens for standard, 8192 for analysts |
| Blocklists | Specific phrases or topics to block | Competitor names, restricted topics |
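Taken together, a rule set built from this table might serialize to something like the following (the JSON shape and key names are hypothetical, not L1's actual configuration schema):

```json
{
  "rules": [
    {"type": "injection_pattern", "pattern": "ignore all previous", "match": "semantic"},
    {"type": "pii_category", "categories": ["ssn", "credit_card", "email", "phone"]},
    {"type": "template_requirement", "role": "support_agent", "template_id": "customer-support-v2"},
    {"type": "length_limit", "role": "standard", "max_tokens": 4096},
    {"type": "blocklist", "phrases": ["restricted topic"]}
  ]
}
```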
Act
Enforcement controls and incident response:
- Enable or disable enforcement per rule
- Set verdict behavior: APPROVED, HELD (pending review), or BLOCKED
- Configure alert destinations (email, Slack, webhook)
- Review and release held prompts
Operating Modes
| Mode | Behavior |
|---|---|
| Monitor | All prompts are logged and scanned. Violations are recorded but not enforced. Prompts pass through to the model. |
| Configure | Rules are active and violations are flagged. Prompts that trigger rules are held for review before being sent to the model. |
| Act | Full enforcement. Prompts that violate rules are blocked immediately. Users receive a governance denial message. |
Start with Monitor
Deploy L1 in Monitor mode first to establish a baseline of prompt activity. Use the data to tune rules before enabling enforcement, reducing false positives and user friction.
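The three modes differ only in what happens once a violation is found. A compact sketch of that dispatch (mode and verdict names follow the tables above; the function itself is illustrative, not L1's API):

```python
def apply_mode(mode: str, violations: list[str]) -> str:
    """Map an operating mode and scan result to a governance verdict.

    Monitor logs but never blocks; Configure holds flagged prompts for
    review; Act blocks them outright.
    """
    if not violations:
        return "APPROVED"
    if mode == "Monitor":
        return "APPROVED"   # violation recorded, prompt passes through
    if mode == "Configure":
        return "HELD"       # held for review before reaching the model
    if mode == "Act":
        return "BLOCKED"    # user receives a governance denial message
    raise ValueError(f"unknown mode: {mode}")
```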
Integration
L1 integrates into the /govern pipeline. When a prompt is submitted for governance, L1 evaluates it and contributes its verdict to the overall governance decision:
```
POST /govern
{
  "action": "prompt.submit",
  "payload": {
    "prompt": "...",
    "user_id": "usr_abc123",
    "model_target": "gpt-4o",
    "template_id": "customer-support-v2"
  }
}
```

L1 returns its layer-specific verdict, which is combined with the verdicts of the other active layers to produce the final governance decision.
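From a caller's perspective, submitting a prompt for governance is an ordinary HTTP POST. A minimal sketch of building that request with the Python standard library (the endpoint URL and `build_govern_request` helper are illustrative; the body matches the example above):

```python
import json
import urllib.request

GOVERN_URL = "https://example.internal/govern"  # placeholder endpoint

def build_govern_request(prompt: str, user_id: str,
                         model_target: str, template_id: str) -> urllib.request.Request:
    """Construct (but do not send) a POST /govern request for a prompt."""
    body = {
        "action": "prompt.submit",
        "payload": {
            "prompt": prompt,
            "user_id": user_id,
            "model_target": model_target,
            "template_id": template_id,
        },
    }
    return urllib.request.Request(
        GOVERN_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` (or any HTTP client) returns the combined governance decision.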
Related Layers
- L2: Reasoning Governance -- validates the reasoning chain after the model responds
- L4: Content Verification -- scans the output content for compliance
- L7: Shadow AI Detection -- catches prompts sent to unapproved tools that bypass L1 entirely