The swarm has three independent safety mechanisms. None of them can be overridden by any LLM or strategic layer. If any one triggers, everything stops.

Three Safety Layers

CriticalityMonitor

Compiled Rust. 15 systemic risk indicators run every 60 seconds. When criticality exceeds the threshold, it triggers the KillChain and deploys the avalanche reserve.

KillChain

Compiled Rust + Python callbacks. 6-phase automated incident response: detect, contain, preserve, diagnose, remediate, learn. Genome blacklisting prevents failed strategies from re-evolving.

Circuit Breaker

Python daemon thread. No LLM. Reads engine metrics every 2 seconds. Emergency stop with no override path.

CriticalityMonitor (Rust)

15 indicators are computed from price series, volume data, and feed spreads, including:

| Indicator | What It Detects |
| --- | --- |
| Correlation convergence | All assets moving together (diversification collapse) |
| Contagion score | Stress spreading across markets |
| Hurst exponent | Long-range dependence, trending vs mean-reverting |
| Two-scale volatility | Microstructure noise vs true volatility |
| Market entropy | Information content of price moves |
| Transfer entropy | Causal information flow between markets |
| VPIN | Volume-synchronized probability of informed trading |
| Liquidity score | Available depth across venues |
| Spread widening | Bid-ask deterioration |
| Amihud ratio | Price impact per unit volume |
| BOCPD | Bayesian online changepoint detection |
| CUSUM | Cumulative sum control chart for regime breaks |
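To make one row of the table concrete: the Amihud ratio is conventionally the average absolute return per unit of traded volume. The sketch below uses that textbook definition in plain Python. The function name `amihud_ratio` and its handling of zero-volume intervals are assumptions made for illustration; it is not the compiled Rust implementation.

```python
def amihud_ratio(prices, volumes):
    """Mean of |simple return| / volume over consecutive observations.

    Hypothetical helper for illustration only. Assumes prices[i] and
    volumes[i] describe the same interval, and skips intervals with zero
    volume or a zero starting price.
    """
    ratios = []
    for i in range(1, len(prices)):
        prev, curr, vol = prices[i - 1], prices[i], volumes[i]
        if prev == 0 or vol == 0:
            continue
        ratios.append(abs(curr / prev - 1.0) / vol)
    return sum(ratios) / len(ratios) if ratios else 0.0


# Same price path, different depth: thinner volume means more impact per unit.
thin = amihud_ratio([0.50, 0.55, 0.50], [100, 100, 100])
deep = amihud_ratio([0.50, 0.55, 0.50], [10_000, 10_000, 10_000])
```

A rising value on an unchanged price path suggests that each unit of volume is moving the price more, which is the liquidity deterioration the indicator is meant to surface.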
# The CriticalityMonitor runs automatically in the Hive oversight loop
report = hive.criticality.compute_report(
    price_series={"BTC-YES": [0.65, 0.66, 0.64, ...]},
    volume_series={"BTC-YES": [1000, 1200, 800, ...]},
    feed_spreads={"BTC-YES": [0.01, 0.02, 0.01, ...]},
)

print(f"Overall criticality: {report.overall_score}")
print(f"Avalanche triggered: {report.overall_score > config.avalanche_deploy_threshold}")
When overall_score exceeds avalanche_deploy_threshold (default 0.30), the Hive deploys the avalanche reserve (15% of capital held back for exactly this scenario) and triggers the KillChain.
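The deploy rule reduces to one comparison and one multiplication. The sketch below spells that out with the defaults quoted above (threshold 0.30, 15% reserve). `AvalancheDecision` and `avalanche_decision` are hypothetical names for illustration, not part of the Hive API.

```python
from dataclasses import dataclass

AVALANCHE_DEPLOY_THRESHOLD = 0.30  # default stated in the text
AVALANCHE_RESERVE_FRACTION = 0.15  # share of capital held back


@dataclass(frozen=True)
class AvalancheDecision:
    deploy_reserve: bool
    trigger_kill_chain: bool
    reserve_amount: float


def avalanche_decision(overall_score, total_capital,
                       threshold=AVALANCHE_DEPLOY_THRESHOLD,
                       reserve_fraction=AVALANCHE_RESERVE_FRACTION):
    """Hypothetical helper: strict '>' comparison, as described in the text."""
    breached = overall_score > threshold
    reserve = total_capital * reserve_fraction if breached else 0.0
    # Both actions fire together: reserve deployment and the KillChain trigger.
    return AvalancheDecision(breached, breached, reserve)


calm = avalanche_decision(0.30, 100_000.0)      # exactly at the threshold
stressed = avalanche_decision(0.42, 100_000.0)  # above the threshold
```

Under this reading, a score equal to the threshold does nothing, and any score above it both releases the reserve and starts the KillChain.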

KillChain (Rust + Python)

Six phases execute in sequence with Python callbacks at each step. Once triggered, all 6 phases run. The Hive can’t suppress an incident or skip phases.
from horizon._horizon import Severity

# Register response handlers. The handler functions are your own code;
# each one receives the incident object for its phase.
hive.kill_chain.on_contain(pause_affected_agents)
hive.kill_chain.on_preserve(snapshot_full_state)
hive.kill_chain.on_diagnose(identify_root_cause)
hive.kill_chain.on_remediate(kill_and_blacklist)
hive.kill_chain.on_learn(record_for_evolution)

# Manual trigger (or automatic via CriticalityMonitor)
hive.kill_chain.trigger(
    Severity.Sev1Critical,
    "portfolio_drawdown_15pct",
    ["agent-001", "agent-002"],
    pnl_impact=-15000.0,
)
| Phase | Action |
| --- | --- |
| Detect | Identify the incident type and severity |
| Contain | Pause affected agents, halt new orders |
| Preserve | Snapshot full state for post-mortem |
| Diagnose | Identify root cause (which agent, which strategy) |
| Remediate | Kill responsible agents, blacklist their genomes |
| Learn | Record the incident for evolutionary pressure (bad genomes can’t re-evolve) |
Genome blacklisting is permanent. A strategy genome that caused a Sev1 incident will never be spawned again by the EvolutionEngine or AgentFactory.
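The sequencing and blacklisting behavior can be pictured with a toy runner. This is a pure-Python sketch, not the KillChain API: `PhaseRunner`, its methods, and the dict-shaped incident are hypothetical, and catching handler exceptions so later phases still run is an assumption about how "all 6 phases run" might be enforced.

```python
PHASES = ("detect", "contain", "preserve", "diagnose", "remediate", "learn")


class PhaseRunner:
    """Toy stand-in for the KillChain's phase sequencing (not the real API)."""

    def __init__(self):
        self._handlers = {phase: [] for phase in PHASES}
        self.blacklist = set()  # genome IDs that must never be spawned again

    def on(self, phase, handler):
        self._handlers[phase].append(handler)

    def trigger(self, incident):
        completed = []
        for phase in PHASES:  # fixed order, no skip path
            for handler in self._handlers[phase]:
                try:
                    handler(incident)
                except Exception as exc:
                    # Assumption: a failing callback is recorded, not fatal.
                    incident.setdefault("errors", []).append((phase, repr(exc)))
            completed.append(phase)
        return completed

    def can_spawn(self, genome_id):
        return genome_id not in self.blacklist


runner = PhaseRunner()
runner.on("contain", lambda inc: 1 / 0)  # deliberately broken handler
runner.on("remediate", lambda inc: runner.blacklist.update(inc["genomes"]))
incident = {"genomes": ["genome-abc"]}
phases_run = runner.trigger(incident)
```

Because the blacklist is only ever added to, a spawn check against it stays negative for the offending genome no matter how many later generations are produced.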

Circuit Breaker (Python Thread)

A daemon thread with no LLM involvement. Reads numbers from the engines, compares to hard limits, acts. No reasoning, no prompts, no exceptions.

Why No LLM

An LLM might rationalize a losing position, misinterpret drawdown as noise, or convince itself to override a limit. The circuit breaker doesn’t reason. It compares two numbers. If portfolio_drawdown > max_drawdown, everything stops.
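The "two numbers" are a measured drawdown and a fixed limit. A minimal sketch of that comparison follows, assuming drawdown is measured as the largest fractional decline from a running peak of portfolio equity. The function names and the sample equity curve are illustrative, not taken from the engine.

```python
def peak_to_trough_drawdown(equity_curve):
    """Largest fractional decline from a running peak (0.0 if none)."""
    peak = None
    worst = 0.0
    for value in equity_curve:
        if peak is None or value > peak:
            peak = value
        if peak > 0:
            worst = max(worst, (peak - value) / peak)
    return worst


def breaches_limit(equity_curve, max_drawdown_pct=20.0):
    """True when the drawdown, in percent, strictly exceeds the limit."""
    return peak_to_trough_drawdown(equity_curve) * 100.0 > max_drawdown_pct


# Illustrative data: peak of 104,000 followed by a trough of 83,000.
equity = [100_000, 104_000, 91_000, 83_000, 88_000]
stop = breaches_limit(equity)
```

Here the decline from 104,000 to 83,000 is a little over 20%, so the check trips. There is no branch in which context or narrative can change the outcome.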

How It Runs

A Python threading.Thread with daemon=True. Every 2 seconds:
  1. Reads portfolio metrics from every engine.
  2. Checks each metric against its limit.
  3. If any check fails, takes the action listed for that check, up to a full emergency stop.
  4. Publishes the result to the event bus.
| Check | What It Reads | Limit | Action |
| --- | --- | --- | --- |
| Portfolio drawdown | Peak-to-trough P&L across all engines | max_drawdown_pct (default 20%) | Emergency stop all |
| Daily P&L | Sum of realized + unrealized P&L today | Configurable | Pause all agents |
| Per-agent capital | Agent notional vs allocated capital | 1x allocation | Pause that agent |
| Stuck detection | Time since last event from an agent | 5 minutes | Flag as stuck |
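The polling loop can be sketched with the standard library alone. This is an illustrative outline, not the shipped daemon: the `CircuitBreaker` class, the `read_metrics` and `on_emergency_stop` callables, and the `portfolio_drawdown_pct` key are all assumptions, and only the portfolio drawdown check from the table is shown.

```python
import threading


class CircuitBreaker:
    """Sketch of a daemon-thread poller; names here are hypothetical."""

    def __init__(self, read_metrics, on_emergency_stop,
                 max_drawdown_pct=20.0, interval_s=2.0):
        self._read_metrics = read_metrics
        self._on_emergency_stop = on_emergency_stop
        self._max_drawdown_pct = max_drawdown_pct
        self._interval_s = interval_s
        self._halt = threading.Event()
        self.tripped = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def check_once(self):
        """Return True if healthy, False after firing the emergency stop."""
        drawdown = self._read_metrics()["portfolio_drawdown_pct"]
        if drawdown > self._max_drawdown_pct:
            self._on_emergency_stop("portfolio_drawdown_pct", drawdown,
                                    self._max_drawdown_pct)
            self.tripped.set()
            return False
        return True

    def _run(self):
        # Event.wait doubles as an interruptible sleep between polls.
        while not self._halt.is_set() and self.check_once():
            self._halt.wait(self._interval_s)

    def start(self):
        self._thread.start()

    def shutdown(self):
        self._halt.set()


events = []
breaker = CircuitBreaker(lambda: {"portfolio_drawdown_pct": 23.5},
                         lambda *args: events.append(args),
                         interval_s=0.01)
breaker.start()
breaker.tripped.wait(timeout=1.0)
```

Running as a daemon thread means the loop never blocks interpreter shutdown, and waiting on an `Event` rather than calling `time.sleep` lets the host process stop the poller promptly.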

Emergency Stop

When triggered:
  1. Cancels every open order across every exchange.
  2. Pauses all agents.
  3. Publishes an event with the failing metric, value, and limit.
  4. Only the human operator can restart.
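The four steps above behave like a latch: once set, nothing in the system clears it except an operator. A minimal sketch under that reading follows. `EmergencyStop` and the three injected callables are hypothetical stand-ins for the exchange, agent, and event-bus interfaces, which are not specified here.

```python
class EmergencyStop:
    """Hypothetical latch; the injected callables are stand-ins."""

    def __init__(self, cancel_all_orders, pause_all_agents, publish):
        self._cancel_all_orders = cancel_all_orders
        self._pause_all_agents = pause_all_agents
        self._publish = publish
        self.halted = False

    def trigger(self, metric, value, limit):
        if self.halted:  # already stopped: do not repeat the actions
            return
        self.halted = True
        self._cancel_all_orders()
        self._pause_all_agents()
        self._publish({"type": "emergency_stop", "metric": metric,
                       "value": value, "limit": limit})

    def restart(self, operator_confirmed):
        # Only an explicit human confirmation clears the latch.
        if not operator_confirmed:
            raise PermissionError("restart requires a human operator")
        self.halted = False


log = []
stop = EmergencyStop(lambda: log.append("cancel_orders"),
                     lambda: log.append("pause_agents"),
                     lambda event: log.append(event))
stop.trigger("portfolio_drawdown_pct", 23.5, 20.0)
stop.trigger("portfolio_drawdown_pct", 25.0, 20.0)  # ignored while halted
```

Keeping the restart path behind an explicit confirmation flag mirrors the rule that no automated component, LLM or otherwise, can resume trading on its own.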

Where They Sit in the Stack

| Layer | Enforcer | What | Override |
| --- | --- | --- | --- |
| 1 | Rust Engine | 8-point risk pipeline: kill switch, bounds, limits, cap, drawdown, rate, dedup | No |
| 2 | SwarmCoordinator (Rust) | Capital allocation per agent, fitness-weighted rebalancing | No |
| 3 | KillChain (Rust) | 6-phase incident response with genome blacklisting | No |
| 4 | Circuit Breaker (Python) | Portfolio drawdown, daily P&L, stuck detection | No |
| 5 | AutonomyController (Rust) | Progressive trust gating on all Hive decisions | Can’t override 1-4 |
Each layer runs independently. If the KillChain misses something, the circuit breaker catches it. If the circuit breaker thread dies, the Rust engine’s per-order risk pipeline still rejects every bad order.