Pro Feature. Requires a Pro or Ultra subscription. Get started at api.mathematicalcompany.com
Bars & Labeling
Horizon implements the full information-driven bar sampling and triple barrier labeling pipeline from Marcos Lopez de Prado’s Advances in Financial Machine Learning (Chapters 2-3). All functions run in Rust for maximum throughput on tick-level data.
7 Bar Types
Tick, volume, dollar, tick/volume imbalance, and tick/volume run bars. All Rust-native.
Triple Barrier
Profit-taking, stop-loss, and vertical barriers with volatility-scaled thresholds.
CUSUM Filter
Symmetric CUSUM for event-driven sampling of structural breaks.
Meta-Labels
Binary 0/1 labels for bet sizing on top of a primary directional model.
Information-Driven Bars
Standard time bars (1-minute, 5-minute) sample at fixed clock intervals, which over-samples quiet periods and under-samples volatile ones. Information-driven bars instead sample based on market activity, producing bars that carry roughly equal information content. Horizon provides three standard bar types and four information-driven bar types:
- Standard Bars
- Imbalance Bars
- Run Bars
| Parameter | Type | Description |
|---|---|---|
| timestamps | list[float] | Tick timestamps (e.g., Unix epoch seconds) |
| prices | list[float] | Tick prices |
| volumes | list[float] | Tick volumes |
Bar Type
Every bar function returns list[Bar]. Each Bar object has the following fields:
| Field | Type | Description |
|---|---|---|
| timestamp | float | Timestamp of the first tick in the bar |
| open | float | Opening price (first tick) |
| high | float | Highest price in the bar |
| low | float | Lowest price in the bar |
| close | float | Closing price (last tick) |
| volume | float | Total volume in the bar |
| vwap | float | Volume-weighted average price |
| n_ticks | int | Number of ticks in the bar |
If the last group of ticks does not fully meet the bar threshold, a partial bar is still emitted. This ensures no tick data is silently dropped.
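As a rough illustration of how these fields could be derived from raw ticks, the pure-Python sketch below aggregates fixed-size groups of ticks and emits a trailing partial bar. It does not call Horizon: the `Bar` dataclass only mirrors the documented field names, `make_bar` and `tick_bars_sketch` are hypothetical helpers, and the zero-volume VWAP fallback is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    # Mirrors the documented field names; not Horizon's own class.
    timestamp: float
    open: float
    high: float
    low: float
    close: float
    volume: float
    vwap: float
    n_ticks: int


def make_bar(ts, px, vol):
    """Aggregate parallel, non-empty tick lists into one Bar."""
    total_vol = sum(vol)
    dollar = sum(p * v for p, v in zip(px, vol))
    # Assumption: fall back to the last price when the bar has no volume.
    vwap = dollar / total_vol if total_vol > 0 else px[-1]
    return Bar(ts[0], px[0], max(px), min(px), px[-1], total_vol, vwap, len(px))


def tick_bars_sketch(timestamps, prices, volumes, threshold):
    """Group ticks into bars of `threshold` ticks; the last bar may be partial."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    bars = []
    for start in range(0, len(prices), threshold):
        end = start + threshold
        bars.append(make_bar(timestamps[start:end], prices[start:end], volumes[start:end]))
    return bars


bars = tick_bars_sketch(
    [0.0, 1.0, 2.0, 3.0, 4.0],
    [10.0, 11.0, 9.0, 10.0, 12.0],
    [1.0, 2.0, 1.0, 1.0, 3.0],
    threshold=2,
)
for bar in bars:
    print(bar)
```

With five ticks and a threshold of two, the final bar holds a single tick, which matches the partial-bar behavior described above.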
Function Reference
| Function | Threshold Parameter | Threshold Meaning |
|---|---|---|
| hz.tick_bars | threshold: int | Number of ticks per bar |
| hz.volume_bars | threshold: float | Cumulative volume per bar |
| hz.dollar_bars | threshold: float | Cumulative dollar volume per bar |
| hz.tick_imbalance_bars | initial_estimate: float | Initial expected tick imbalance threshold (adapts via EWM) |
| hz.volume_imbalance_bars | initial_estimate: float | Initial expected volume imbalance threshold (adapts via EWM) |
| hz.tick_run_bars | initial_estimate: float | Initial expected max tick run length (adapts via EWM) |
| hz.volume_run_bars | initial_estimate: float | Initial expected max volume run (adapts via EWM) |
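To make the idea of an adaptive imbalance threshold concrete, here is a deliberately simplified pure-Python sketch of tick imbalance sampling. It is not Horizon's implementation: the exact update rule is not documented here, so the sketch assumes the threshold is smoothed toward the absolute imbalance observed at each bar close, the first tick is treated as a buy, and `alpha` is a hypothetical smoothing parameter.

```python
def tick_signs(prices):
    """Tick rule: +1 on an uptick, -1 on a downtick, previous sign if unchanged."""
    signs = []
    prev = 1  # assumption: the first tick is treated as a buy
    for i, p in enumerate(prices):
        if i > 0 and p > prices[i - 1]:
            prev = 1
        elif i > 0 and p < prices[i - 1]:
            prev = -1
        signs.append(prev)
    return signs


def tick_imbalance_boundaries(prices, initial_estimate, alpha=0.5):
    """Return the index of the last tick of each bar (simplified sketch)."""
    threshold = initial_estimate
    theta = 0  # running signed tick imbalance within the current bar
    ends = []
    for i, b in enumerate(tick_signs(prices)):
        theta += b
        if abs(theta) >= threshold:
            ends.append(i)
            # Assumed EWM update: blend the observed imbalance into the threshold.
            threshold = alpha * abs(theta) + (1 - alpha) * threshold
            theta = 0
    n = len(prices)
    if n and (not ends or ends[-1] != n - 1):
        ends.append(n - 1)  # trailing partial bar
    return ends


ends = tick_imbalance_boundaries([10, 11, 12, 13, 12, 11, 10, 11], initial_estimate=3)
print(ends)
```

One-sided order flow pushes the running imbalance to the threshold quickly and closes a bar early, while balanced flow lets the bar run longer.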
Example: Comparing Bar Types
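The following self-contained sketch compares how tick, volume, and dollar thresholds sample the same synthetic tick stream. It uses a hypothetical pure-Python helper (`count_bars`) rather than the hz functions, and the tick data and thresholds are made up for illustration.

```python
def count_bars(increments, threshold):
    """Count bars closed when a running sum of increments reaches threshold."""
    count = 0
    acc = 0.0
    for x in increments:
        acc += x
        if acc >= threshold:
            count += 1
            acc = 0.0
    return count + (1 if acc > 0 else 0)  # include a trailing partial bar


# Synthetic ticks: the price doubles halfway through, volume per tick is constant.
prices = [10.0] * 100 + [20.0] * 100
volumes = [5.0] * 200

n_tick = count_bars([1.0] * len(prices), threshold=50)
n_volume = count_bars(volumes, threshold=250.0)
n_dollar = count_bars([p * v for p, v in zip(prices, volumes)], threshold=2500.0)

print(n_tick, n_volume, n_dollar)  # 4 4 6
```

Tick and volume bars ignore the price change, while dollar bars sample twice as often in the second half because each tick carries twice the dollar value.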
Triple Barrier Labeling
The triple barrier method (AFML Ch. 3) labels each trading event according to which of three competing barriers is touched first:
- Profit-taking (PT): upper barrier, price rises by a volatility-scaled amount
- Stop-loss (SL): lower barrier, price falls by a volatility-scaled amount
- Vertical barrier (VB): maximum holding period expires
Step 1: Compute Daily Volatility
The returned series is aligned with prices. The first element is always 0.0 (no return is available from a single price).
| Parameter | Type | Description |
|---|---|---|
| prices | list[float] | Raw price series (at least 2 elements) |
| span | int | EWM span (e.g., 20 for a roughly 20-day lookback) |
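A minimal sketch of an exponentially weighted volatility estimate on log returns is shown below. It is not Horizon's code: it assumes the pandas-style convention `alpha = 2 / (span + 1)` and a simple recursive (non-bias-corrected) variance update, and the function name is hypothetical. Horizon's exact formula may differ.

```python
import math


def ewm_volatility_sketch(prices, span):
    """EWM standard deviation of log returns, aligned with `prices`."""
    if len(prices) < 2:
        raise ValueError("need at least 2 prices")
    alpha = 2.0 / (span + 1.0)  # assumption: pandas-style span convention
    vol = [0.0]  # no return is available at the first price
    mean = 0.0
    var = 0.0
    for i in range(1, len(prices)):
        r = math.log(prices[i] / prices[i - 1])
        if i == 1:
            mean, var = r, 0.0  # a single return has no dispersion yet
        else:
            delta = r - mean
            mean += alpha * delta
            # Standard incremental EWM variance recursion.
            var = (1.0 - alpha) * (var + alpha * delta * delta)
        vol.append(math.sqrt(var))
    return vol


vol = ewm_volatility_sketch([100.0, 101.0, 100.0, 102.0], span=20)
print(vol)
```

A larger span gives a smaller `alpha`, so the estimate reacts more slowly to new returns.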
Step 2: CUSUM Filter for Structural Breaks
The symmetric CUSUM filter (AFML Ch. 2.5.2.1) produces a structurally meaningful subsample of the time series by detecting significant price moves and filtering out noise.
| Parameter | Type | Description |
|---|---|---|
| prices | list[float] | Raw price series |
| threshold | float | Trigger threshold in price units (must be positive) |
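The symmetric CUSUM logic can be sketched in a few lines of Python. This follows the general shape of the filter in AFML and assumes that increments are raw price differences, since the threshold is stated in price units; the function name is hypothetical and this is not Horizon's implementation.

```python
def cusum_events_sketch(prices, threshold):
    """Return indices where the cumulative up or down move exceeds threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    events = []
    s_pos = 0.0  # cumulative upward drift, floored at zero
    s_neg = 0.0  # cumulative downward drift, capped at zero
    for i in range(1, len(prices)):
        diff = prices[i] - prices[i - 1]
        s_pos = max(0.0, s_pos + diff)
        s_neg = min(0.0, s_neg + diff)
        if s_pos > threshold:
            events.append(i)
            s_pos = 0.0  # reset only the side that triggered
        elif s_neg < -threshold:
            events.append(i)
            s_neg = 0.0
    return events


events = cusum_events_sketch([100, 101, 102, 103, 102, 100, 99, 100], threshold=2.5)
print(events)  # [3, 5]
```

The first event fires after a cumulative rise of 3, the second after a cumulative fall of 3; the small moves in between are filtered out.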
Step 3: Apply Triple Barrier Labels
| Parameter | Type | Description |
|---|---|---|
| prices | list[float] | Raw price series (at least 2 elements) |
| timestamps | list[float] | Timestamps for each price (same length as prices) |
| events | list[int] | Event indices from cusum_filter or user-provided |
| pt_sl | [float, float] | [profit_taking_multiplier, stop_loss_multiplier]; set a multiplier to 0.0 to disable that barrier |
| min_ret | float | Minimum absolute return to assign +1/-1 (below this, label = 0) |
| max_holding | int | Vertical barrier in bars forward from entry (0 = no vertical barrier) |
| vol_span | int | Lookback span for EWM daily volatility estimation |
Barrier levels:
- Upper: entry_price * (1 + daily_vol * pt_sl[0])
- Lower: entry_price * (1 - daily_vol * pt_sl[1])
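The sketch below labels a single event using the barrier formulas above. It is a simplified illustration, not Horizon's implementation: `label_event_sketch` is a hypothetical helper that takes the volatility estimate as a plain number, returns a dict rather than a BarrierLabel, and assumes that an event reaching the end of the series without a touch is reported as "vb".

```python
import math


def label_event_sketch(prices, entry_idx, daily_vol, pt_sl, max_holding, min_ret=0.0):
    """Label one event by the first barrier touched after entry."""
    entry = prices[entry_idx]
    # A multiplier of 0.0 disables the corresponding horizontal barrier.
    upper = entry * (1 + daily_vol * pt_sl[0]) if pt_sl[0] > 0 else math.inf
    lower = entry * (1 - daily_vol * pt_sl[1]) if pt_sl[1] > 0 else -math.inf
    last = len(prices) - 1
    # max_holding == 0 means no vertical barrier: scan to the end of the series.
    end = min(entry_idx + max_holding, last) if max_holding > 0 else last

    touch_idx, barrier = end, "vb"  # assumption: no touch is reported as "vb"
    for i in range(entry_idx + 1, end + 1):
        if prices[i] >= upper:
            touch_idx, barrier = i, "pt"
            break
        if prices[i] <= lower:
            touch_idx, barrier = i, "sl"
            break

    ret = math.log(prices[touch_idx] / entry)
    label = {"pt": 1, "sl": -1, "vb": 0}[barrier]
    if abs(ret) < min_ret:
        label = 0  # move too small to count as a directional outcome
    return {"event_idx": entry_idx, "label": label, "ret": ret,
            "barrier": barrier, "touch_idx": touch_idx}


prices = [100.0, 101.0, 103.0, 99.0, 95.0]
hit_pt = label_event_sketch(prices, 0, daily_vol=0.01, pt_sl=[2.0, 2.0], max_holding=10)
hit_sl = label_event_sketch(prices, 2, daily_vol=0.01, pt_sl=[2.0, 2.0], max_holding=10)
hit_vb = label_event_sketch(prices, 0, daily_vol=0.01, pt_sl=[2.0, 2.0], max_holding=1)
print(hit_pt["barrier"], hit_sl["barrier"], hit_vb["barrier"])  # pt sl vb
```

With a 1% volatility estimate and multipliers of 2, the barriers sit roughly 2% above and below the entry price, so the jump to 103 triggers profit-taking and the drop from 103 to 99 triggers the stop-loss.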
Step 4: Meta-Labels (Bet Sizing)
Meta-labeling (AFML Ch. 3.6) determines whether a primary model’s directional signals are correct. The primary model provides the direction (+1 long, -1 short), and the meta-label indicates if acting on that signal is profitable (1) or not (0).
| Parameter | Type | Description |
|---|---|---|
| prices | list[float] | Raw price series |
| timestamps | list[float] | Timestamps (same length as prices) |
| primary_signals | list[(int, int)] | List of (event_idx, side) where side is +1 or -1 |
| pt_sl | [float, float] | [profit_taking_mult, stop_loss_mult] |
| max_holding | int | Vertical barrier in bars (0 = no vertical barrier) |
| vol_span | int | Lookback span for daily vol estimation |
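The core of the meta-label rule can be illustrated with a tiny sketch: multiply the realized log return by the primary model's side and check its sign. The helper name is hypothetical, the exit price is assumed to be the price at the barrier touch, and treating a zero return as unprofitable is an assumption.

```python
import math


def meta_label_sketch(side, entry_price, exit_price):
    """Return 1 if acting on the primary signal was profitable, else 0."""
    if side not in (1, -1):
        raise ValueError("side must be +1 (long) or -1 (short)")
    signed_ret = side * math.log(exit_price / entry_price)
    # Assumption: a flat outcome counts as not profitable.
    return 1 if signed_ret > 0 else 0


print(meta_label_sketch(+1, 100.0, 102.0))  # 1: long and price rose
print(meta_label_sketch(-1, 100.0, 102.0))  # 0: short and price rose
print(meta_label_sketch(-1, 100.0, 97.0))   # 1: short and price fell
```

A secondary classifier trained on these 0/1 labels can then estimate how likely each primary signal is to pay off, which is what drives bet sizing.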
Step 5: Drop Rare Labels
Remove label classes whose share of all labels falls below a minimum fraction. This avoids training classifiers on classes with too few examples.
| Parameter | Type | Description |
|---|---|---|
| labels | list[BarrierLabel] | Labels from triple_barrier_labels or meta_labels |
| min_pct | float | Minimum fraction of total (e.g., 0.05 = 5%). Must be in [0, 1]. |
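A minimal sketch of the filtering step, assuming a single pass over the class frequencies, looks like this. It works on plain integer labels rather than BarrierLabel objects, and the function name is hypothetical; Horizon's actual procedure (for example, whether it drops classes iteratively) is not specified here.

```python
from collections import Counter


def drop_rare_labels_sketch(labels, min_pct):
    """Keep only labels whose class makes up at least min_pct of the total."""
    if not 0.0 <= min_pct <= 1.0:
        raise ValueError("min_pct must be in [0, 1]")
    counts = Counter(labels)
    total = len(labels)
    keep = {cls for cls, n in counts.items() if n / total >= min_pct}
    return [lab for lab in labels if lab in keep]


labels = [1] * 10 + [-1] * 9 + [0] * 1
filtered = drop_rare_labels_sketch(labels, min_pct=0.1)
print(Counter(filtered))
```

Here the 0 class accounts for 5% of the labels, below the 10% cutoff, so only the +1 and -1 classes remain.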
BarrierLabel Type
Every label function returns list[BarrierLabel]. Each object has:
| Field | Type | Description |
|---|---|---|
| event_idx | int | Index in the original price series where the event started |
| label | int | -1 (stop-loss), 0 (vertical/below min_ret), +1 (profit-taking). For meta-labels: 0 or 1. |
| ret | float | Log return from entry to barrier touch |
| barrier | str | Which barrier was touched: "pt", "sl", or "vb" |
| touch_idx | int | Index in the price series where the barrier was touched |
Full Pipeline Example
Choosing Bar Types
When to use tick bars
Tick bars are the simplest non-time-based alternative. Each bar contains a fixed number of trades, so bars arrive faster during active periods and slower during quiet periods. They are a good baseline for comparison.
When to use volume/dollar bars
Volume bars normalize by trading volume, and dollar bars normalize by dollar volume. Dollar bars are preferred when price varies significantly over the sample period, as they keep the economic significance of each bar roughly constant.
When to use imbalance bars
Imbalance bars detect asymmetry in order flow. They produce more bars when one side (buy or sell) dominates, which often coincides with informed trading activity. Use these when order flow toxicity matters for your strategy.
When to use run bars
Run bars detect sustained sequences of same-direction ticks. They are sensitive to persistent buying or selling pressure and produce more bars when the market trends strongly in one direction.