Three Safety Layers
CriticalityMonitor
Compiled Rust. 15 systemic risk indicators run every 60 seconds. When criticality exceeds the threshold, it triggers the KillChain and deploys the avalanche reserve.
KillChain
Compiled Rust + Python callbacks. 6-phase automated incident response: detect, contain, preserve, diagnose, remediate, learn. Genome blacklisting prevents failed strategies from re-evolving.
Circuit Breaker
Python daemon thread. No LLM. Reads engine metrics every 2 seconds. Emergency stop with no override path.
CriticalityMonitor (Rust)
15 indicators computed from price series, volume data, and feed spreads:

| Indicator | What It Detects |
|---|---|
| Correlation convergence | All assets moving together (diversification collapse) |
| Contagion score | Stress spreading across markets |
| Hurst exponent | Long-range dependence, trending vs mean-reverting |
| Two-scale volatility | Microstructure noise vs true volatility |
| Market entropy | Information content of price moves |
| Transfer entropy | Causal information flow between markets |
| VPIN | Volume-synchronized probability of informed trading |
| Liquidity score | Available depth across venues |
| Spread widening | Bid-ask deterioration |
| Amihud ratio | Price impact per unit volume |
| BOCPD | Bayesian online changepoint detection |
| CUSUM | Cumulative sum control chart for regime breaks |

When overall_score exceeds avalanche_deploy_threshold (default 0.30), the Hive deploys the avalanche reserve (15% of capital held back for exactly this scenario) and triggers the KillChain.
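
The threshold rule above can be sketched in a few lines. This is a minimal Python illustration, not the Rust implementation: `CriticalityConfig`, `evaluate_criticality`, and `avalanche_reserve_fraction` are hypothetical names, and how the indicators are combined into overall_score is not specified here, so the score is taken as an input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriticalityConfig:
    avalanche_deploy_threshold: float = 0.30  # default stated in the text
    avalanche_reserve_fraction: float = 0.15  # 15% of capital held back (name is hypothetical)


def evaluate_criticality(
    overall_score: float,
    total_capital: float,
    config: CriticalityConfig = CriticalityConfig(),
) -> dict:
    """Decide whether to deploy the reserve and trigger the KillChain.

    Assumption: "exceeds" means strictly greater than the threshold.
    """
    breached = overall_score > config.avalanche_deploy_threshold
    reserve = total_capital * config.avalanche_reserve_fraction if breached else 0.0
    return {
        "deploy_reserve": breached,
        "trigger_killchain": breached,
        "reserve_amount": reserve,
    }
```

Both actions hang off the same comparison, so the reserve cannot be deployed without the KillChain also being triggered.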
KillChain (Rust + Python)
Six phases execute in sequence with Python callbacks at each step. Once triggered, all 6 phases run. The Hive can’t suppress an incident or skip phases.

| Phase | Action |
|---|---|
| Detect | Identify the incident type and severity |
| Contain | Pause affected agents, halt new orders |
| Preserve | Snapshot full state for post-mortem |
| Diagnose | Identify root cause (which agent, which strategy) |
| Remediate | Kill responsible agents, blacklist their genomes |
| Learn | Record the incident for evolutionary pressure (bad genomes can’t re-evolve) |
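
A rough Python sketch of the "all six phases, in order, no skipping" behaviour described above. The names `Phase`, `run_kill_chain`, and `make_blacklist_callback` are hypothetical, and the choice to record a failing callback and keep going is an assumption made so that no phase can be skipped; the real KillChain runs in Rust and may handle callback errors differently.

```python
from enum import Enum
from typing import Callable, Dict, List, Set


class Phase(Enum):
    DETECT = "detect"
    CONTAIN = "contain"
    PRESERVE = "preserve"
    DIAGNOSE = "diagnose"
    REMEDIATE = "remediate"
    LEARN = "learn"


PhaseCallback = Callable[[Phase, dict], None]


def run_kill_chain(incident: dict, callbacks: Dict[Phase, PhaseCallback]) -> List[Phase]:
    """Run every phase in definition order; there is no early exit."""
    completed: List[Phase] = []
    for phase in Phase:  # Enum iteration preserves definition order
        callback = callbacks.get(phase)
        if callback is not None:
            try:
                callback(phase, incident)
            except Exception as exc:
                # Assumption: a failing callback is recorded, never allowed to halt the chain.
                incident.setdefault("callback_errors", []).append((phase.value, repr(exc)))
        completed.append(phase)
    return completed


def make_blacklist_callback(blacklist: Set[str]) -> PhaseCallback:
    """Build a Remediate callback that blacklists the responsible genomes."""
    def remediate(phase: Phase, incident: dict) -> None:
        blacklist.update(incident.get("responsible_genomes", []))
    return remediate
```

Registering `make_blacklist_callback(blacklist)` under `Phase.REMEDIATE` gives later evolution steps a set to consult before re-admitting a genome.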
Circuit Breaker (Python Thread)
A daemon thread with no LLM involvement. Reads numbers from the engines, compares to hard limits, acts. No reasoning, no prompts, no exceptions.
Why No LLM
An LLM might rationalize a losing position. Misinterpret drawdown as noise. Convince itself to override a limit. The circuit breaker doesn’t reason. It compares two numbers. If portfolio_drawdown > max_drawdown, everything stops.
How It Runs
A Python threading.Thread with daemon=True. Every 2 seconds:
- Reads portfolio metrics from every engine.
- Checks each metric against its limit.
- If any check fails, triggers an emergency stop.
- Publishes the result to the event bus.
| Check | What It Reads | Limit | Action |
|---|---|---|---|
| Portfolio drawdown | Peak-to-trough P&L across all engines | max_drawdown_pct (default 20%) | Emergency stop all |
| Daily P&L | Sum of realized + unrealized P&L today | Configurable | Pause all agents |
| Per-agent capital | Agent notional vs allocated capital | 1x allocation | Pause that agent |
| Stuck detection | Time since last event from an agent | 5 minutes | Flag as stuck |
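
The loop and the checks in the table might look roughly like the sketch below. It is illustrative only: the metric dictionary layout, the `Limits` field names other than max_drawdown_pct, the `max_daily_loss` value (the source only says "configurable"), and the action strings are all assumptions.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple

Breach = Tuple[str, float, float, str]  # (check, value, limit, action)


@dataclass(frozen=True)
class Limits:
    max_drawdown_pct: float = 20.0      # default from the table
    max_daily_loss: float = 10_000.0    # hypothetical value; "configurable" in the source
    stuck_after_seconds: float = 300.0  # 5 minutes


def check_metrics(metrics: dict, limits: Limits) -> List[Breach]:
    """Compare each metric to its hard limit. No reasoning, only comparisons."""
    breaches: List[Breach] = []
    drawdown = metrics["portfolio_drawdown_pct"]
    if drawdown > limits.max_drawdown_pct:
        breaches.append(("portfolio_drawdown", drawdown, limits.max_drawdown_pct, "emergency_stop_all"))
    daily_pnl = metrics["daily_pnl"]
    if daily_pnl < -limits.max_daily_loss:
        breaches.append(("daily_pnl", daily_pnl, -limits.max_daily_loss, "pause_all_agents"))
    for agent_id, agent in metrics["agents"].items():
        if agent["notional"] > agent["allocated_capital"]:  # 1x allocation
            breaches.append(("per_agent_capital", agent["notional"],
                             agent["allocated_capital"], f"pause_agent:{agent_id}"))
        idle = agent["seconds_since_last_event"]
        if idle > limits.stuck_after_seconds:
            breaches.append(("stuck_detection", idle, limits.stuck_after_seconds, f"flag_stuck:{agent_id}"))
    return breaches


class CircuitBreaker(threading.Thread):
    """Daemon thread: read metrics, check limits, publish the result, sleep."""

    def __init__(
        self,
        read_metrics: Callable[[], dict],
        publish: Callable[[List[Breach]], None],
        limits: Limits = Limits(),
        interval_seconds: float = 2.0,
    ) -> None:
        super().__init__(daemon=True)
        self._read_metrics = read_metrics
        self._publish = publish
        self._limits = limits
        self._interval_seconds = interval_seconds

    def run(self) -> None:
        while True:  # deliberately no stop flag; the thread dies with the process
            self._publish(check_metrics(self._read_metrics(), self._limits))
            time.sleep(self._interval_seconds)
```

Keeping `check_metrics` a pure function makes the limits testable without starting the thread.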
Emergency Stop
When triggered:
- Cancels every open order across every exchange.
- Pauses all agents.
- Publishes an event with the failing metric, value, and limit.
- Only the human operator can restart.
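
The four steps above could be sketched as follows. This is a hedged illustration: `EmergencyStop`, the `cancel_all_orders` and `pause` methods, the event fields, and the `operator_confirmed` flag are hypothetical stand-ins, not the project's actual interfaces, and a real operator check would need proper authentication.

```python
from typing import Callable, Iterable, Protocol


class Exchange(Protocol):
    def cancel_all_orders(self) -> None: ...


class Agent(Protocol):
    def pause(self) -> None: ...


class EmergencyStop:
    def __init__(
        self,
        exchanges: Iterable[Exchange],
        agents: Iterable[Agent],
        publish: Callable[[dict], None],
    ) -> None:
        self._exchanges = list(exchanges)
        self._agents = list(agents)
        self._publish = publish
        self.halted = False

    def trigger(self, metric: str, value: float, limit: float) -> None:
        self.halted = True  # latch first so nothing restarts mid-stop
        for exchange in self._exchanges:
            exchange.cancel_all_orders()  # every open order, every exchange
        for agent in self._agents:
            agent.pause()
        self._publish({"type": "emergency_stop", "metric": metric, "value": value, "limit": limit})

    def restart(self, operator_confirmed: bool) -> None:
        """Clear the latch only on explicit human confirmation (assumed mechanism)."""
        if not operator_confirmed:
            raise PermissionError("only the human operator can restart")
        self.halted = False
```

The latch stays set until `restart` is called with confirmation, so no automated component can clear it by itself in this sketch.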
Where They Sit in the Stack
| Layer | Enforcer | What | Override |
|---|---|---|---|
| 1 | Rust Engine | 8-point risk pipeline: kill switch, bounds, limits, cap, drawdown, rate, dedup | No |
| 2 | SwarmCoordinator (Rust) | Capital allocation per agent, fitness-weighted rebalancing | No |
| 3 | KillChain (Rust) | 6-phase incident response with genome blacklisting | No |
| 4 | Circuit Breaker (Python) | Portfolio drawdown, daily P&L, stuck detection | No |
| 5 | AutonomyController (Rust) | Progressive trust gating on all Hive decisions | Can’t override 1-4 |