Threshold
Runtime resilience and adversarial safety qualification for staged policy gating.
Last updated Mar 6, 2026
Layer: Agent (resilience + enforcement readiness)
Scale: 0-100
Production Tier: Staged policy rollout
Purpose
Threshold measures whether an agent remains safe and reliable under adversarial and stress conditions, and whether policy gating can be automated with acceptable false-positive and false-negative risk.
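The false-positive and false-negative risk mentioned above can be estimated from labeled gating decisions. The sketch below uses the standard definitions of the two rates. The `GateDecision` record and its field names are hypothetical, and this is an illustration only, not Threshold's withheld scoring formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateDecision:
    """Hypothetical record of one gating decision plus its reviewed label."""
    blocked: bool  # what the policy gate decided (or would have decided in shadow mode)
    harmful: bool  # ground-truth label assigned after review


def error_rates(decisions: list[GateDecision]) -> dict[str, float]:
    """Compute false-positive and false-negative rates from labeled decisions."""
    fp = sum(d.blocked and not d.harmful for d in decisions)      # benign but blocked
    tn = sum(not d.blocked and not d.harmful for d in decisions)  # benign and allowed
    fn = sum(not d.blocked and d.harmful for d in decisions)      # harmful but allowed
    tp = sum(d.blocked and d.harmful for d in decisions)          # harmful and blocked
    return {
        # Share of benign actions that were wrongly blocked.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        # Share of harmful actions that were wrongly allowed.
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }
```

What counts as "acceptable" for either rate would come from the risk tolerances in the target enforcement profile, which this sketch does not model.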
Why It Matters
- Drift/Fidelity/Mandate reveal risk and intervention posture.
- Threshold determines if enforcement automation is resilient enough to trust.
- Without Threshold, policy automation can become brittle in high-trust environments.
How It Works
Core Dimensions (Conceptual)
1. Injection Resistance
How well the agent resists prompt and instruction manipulation.
2. Tool/Peer Manipulation Resistance
How robust behavior remains under malicious or degraded tool interactions.
3. Data Poisoning Resilience
How robust outcomes remain when upstream evidence quality is degraded.
4. Stress and Load Stability
How reliably controls perform under high throughput and adverse operating conditions.
5. Control Stability Under Attack
Whether policy decisions remain consistent and explainable during hostile scenarios.
Public note: formulas, calibration constants, and adversarial profile internals are intentionally withheld.
Input Schema
| Field | Type | Required | Description |
|---|---|---|---|
| agent_id | string | yes | Agent under assessment. |
| stress_evidence | object[] | yes | Stress and adversarial test observations. |
| runtime_context | object | yes | Runtime objective and control context. |
| telemetry_quality | object | yes | Evidence confidence overlays. |
| policy_profile | object | yes | Target enforcement profile and risk tolerances. |
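A request carrying these fields might be checked before submission along the following lines. This is a minimal sketch: the five top-level field names and types come from the table, but every nested key (`scenario`, `objective`, `coverage`, `target_mode`) and the helper `input_problems` are hypothetical, since the inner object shapes are not published.

```python
from typing import Any

# Top-level fields and their expected Python types, taken from the Input Schema table.
REQUIRED_FIELDS: dict[str, type] = {
    "agent_id": str,
    "stress_evidence": list,
    "runtime_context": dict,
    "telemetry_quality": dict,
    "policy_profile": dict,
}


def input_problems(payload: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the shape looks acceptable."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in payload:
            problems.append(f"missing required field: {name}")
        elif not isinstance(payload[name], expected):
            problems.append(f"{name} should be {expected.__name__}")
    # stress_evidence is object[]: every element must itself be an object.
    evidence = payload.get("stress_evidence")
    if isinstance(evidence, list) and not all(isinstance(e, dict) for e in evidence):
        problems.append("stress_evidence items should be objects")
    return problems


# Hypothetical payload: nested keys are placeholders, not part of the published schema.
example_input = {
    "agent_id": "agent-123",
    "stress_evidence": [{"scenario": "prompt_injection", "passed": True}],
    "runtime_context": {"objective": "customer_support"},
    "telemetry_quality": {"coverage": 0.9},
    "policy_profile": {"target_mode": "soft_block"},
}
```

Returning a list of problems instead of raising on the first one lets a caller report every schema issue in a single pass.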
Output Schema
| Field | Type | Description |
|---|---|---|
| framework | string | `threshold` |
| version | string | Threshold scoring spec version. |
| entity_id | string | Evaluated agent identifier. |
| score | number | Threshold score from 0 to 100. |
| confidence | number | Confidence in resilience result (0 to 1). |
| policy_readiness | object | Allowed policy modes by current evidence. |
| recommended_mode | string | Suggested mode (alert, shadow, soft_block, hard_block). |
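A consumer might sanity-check a result against the ranges and mode names listed in the table before acting on it. The sketch below assumes the result arrives as a plain dictionary. The helper `output_problems`, the example `version` value, and the `allowed_modes` key inside `policy_readiness` are hypothetical.

```python
from typing import Any

# Mode names as listed for recommended_mode in the Output Schema table.
KNOWN_MODES = ("alert", "shadow", "soft_block", "hard_block")


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def output_problems(result: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the result looks well-formed."""
    problems = []
    if result.get("framework") != "threshold":
        problems.append("framework should be 'threshold'")
    for name in ("version", "entity_id"):
        if not isinstance(result.get(name), str):
            problems.append(f"{name} should be a string")
    score = result.get("score")
    if not _is_number(score) or not 0 <= score <= 100:
        problems.append("score should be a number from 0 to 100")
    confidence = result.get("confidence")
    if not _is_number(confidence) or not 0 <= confidence <= 1:
        problems.append("confidence should be a number from 0 to 1")
    if not isinstance(result.get("policy_readiness"), dict):
        problems.append("policy_readiness should be an object")
    if result.get("recommended_mode") not in KNOWN_MODES:
        problems.append("recommended_mode should be one of " + ", ".join(KNOWN_MODES))
    return problems


# Hypothetical result: the values and the policy_readiness shape are placeholders.
example_output = {
    "framework": "threshold",
    "version": "1.0",
    "entity_id": "agent-123",
    "score": 72.5,
    "confidence": 0.8,
    "policy_readiness": {"allowed_modes": ["alert", "shadow"]},
    "recommended_mode": "shadow",
}
```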
Staged Rollout
1. shadow/evaluate: compute signals and validate policy precision in live conditions.
2. controlled soft_block: enforce selected controls with governance review.
3. scoped hard_block: enforce strict controls once calibration criteria are met.
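One way a consumer might respect the staged rollout is to never enforce a stricter mode than current evidence allows. The sketch below rests on two assumptions that the source does not confirm: that modes run from least to most strict in the order alert, shadow, soft_block, hard_block, and that the allowed modes are available as a simple list. The function `effective_mode` is hypothetical.

```python
# Assumed strictness order, least to most strict (not confirmed by the spec).
MODE_ORDER = ["alert", "shadow", "soft_block", "hard_block"]


def effective_mode(recommended_mode: str, allowed_modes: list[str]) -> str:
    """Pick the strictest allowed mode that does not exceed the recommendation."""
    if recommended_mode not in MODE_ORDER:
        raise ValueError(f"unknown mode: {recommended_mode!r}")
    ceiling = MODE_ORDER.index(recommended_mode)
    # Walk modes from least strict up to the recommended one, keeping those allowed.
    candidates = [m for m in MODE_ORDER[: ceiling + 1] if m in allowed_modes]
    # Fall back to the least strict mode when nothing at or below the ceiling is allowed.
    return candidates[-1] if candidates else MODE_ORDER[0]
```

For example, `effective_mode("hard_block", ["alert", "shadow", "soft_block"])` returns `"soft_block"`: the recommendation is capped at the strictest stage the evidence currently permits.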