Horizon ships four pure-Python research modules inspired by Lopez de Prado’s Advances in Financial Machine Learning. Use them standalone for offline analysis or drop their pipeline functions into hz.run() for live monitoring.
- Meta-Labeling: triple-barrier labeling where a primary model gives direction and a meta-label model decides sizing.
- Feature Importance: MDA, SFI, and clustered MDA with purged cross-validation to prevent leakage.
- Alpha Decay: track information coefficient over time, estimate half-life, detect dying edges.
- PnL Attribution: break down returns by market, time period, and factor exposure.
A two-model framework. The primary model predicts direction (+1 long, -1 short). The meta-label model then decides whether to act on that signal (1) or abstain (0), using a triple-barrier method: profit-taking, stop-loss, and a vertical (time) barrier.

This separation lets you pair a high-recall primary model (catches most opportunities) with a high-precision meta-label model (filters out bad trades), which is far more effective than building a single model that does both.
```python
from horizon.meta_label import compute_meta_labels, meta_label_pipeline
```
Compute meta-labels from primary model signals using triple barriers. For each primary signal, the function scans forward from the signal index and applies three barriers:
Profit-taking (PT): Return exceeds vol * pt_sl[0] in the direction of the primary signal. Meta-label = 1 (act).
Stop-loss (SL): Return exceeds vol * pt_sl[1] against the primary signal. Meta-label = 0 (abstain).
Vertical barrier: max_holding bars elapse with no barrier hit. Meta-label = 1 if cumulative return > 0, else 0.
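To make the barrier logic concrete, here is a minimal pure-Python sketch of labeling a single signal. It is a simplified stand-in, not the library's actual compute_meta_labels; parameter names mirror the description above, and vol is assumed to be a per-signal volatility estimate:

```python
def triple_barrier_label(prices, idx, side, vol, pt_sl=(1.0, 1.0), max_holding=10):
    """Return 1 (act) or 0 (abstain) for a primary signal at index `idx`.

    side: +1 long / -1 short from the primary model.
    vol:  volatility estimate used to scale the horizontal barriers.
    """
    entry = prices[idx]
    upper = vol * pt_sl[0]   # profit-taking distance, in the signal direction
    lower = -vol * pt_sl[1]  # stop-loss distance, against the signal
    ret = 0.0
    for t in range(idx + 1, min(idx + 1 + max_holding, len(prices))):
        # signed return in the direction of the primary signal
        ret = side * (prices[t] - entry) / entry
        if ret >= upper:
            return 1  # profit-taking barrier hit: act
        if ret <= lower:
            return 0  # stop-loss barrier hit: abstain
    # vertical barrier: label by the sign of the cumulative signed return
    return 1 if ret > 0 else 0
```

A long signal into a rally hits the profit-taking barrier and is labeled 1; the same prices seen by a short signal hit the stop-loss barrier and are labeled 0.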
Pipeline function for hz.run(). Reads the primary model’s signal from ctx.params, maintains a rolling buffer of price observations, and injects meta-label decisions.
Model-agnostic feature importance methods with purged cross-validation. Standard k-fold CV leaks information in time-series data because adjacent samples are correlated. Purged CV removes training samples within a configurable gap of each test fold, preventing look-ahead bias.

All methods accept a generic score_fn(X_train, y_train, X_test, y_test) -> float, so they work with any model (sklearn, xgboost, a simple function, etc.).
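The purging idea can be illustrated with a minimal index splitter. This is a sketch of the concept, not the library's CV code; purge_gap here plays the same role as the parameter described above:

```python
def purged_kfold_indices(n_samples, n_splits=5, purge_gap=10):
    """Yield (train_idx, test_idx) pairs for contiguous time-ordered folds,
    dropping training samples within `purge_gap` of the test fold."""
    fold_size = n_samples // n_splits
    for k in range(n_splits):
        start = k * fold_size
        stop = n_samples if k == n_splits - 1 else start + fold_size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples)
                     if i < start - purge_gap or i >= stop + purge_gap]
        yield train_idx, test_idx
```

With 100 samples, 5 splits, and a gap of 10, the second fold tests on indices 20-39 and trains on 0-9 and 50-99: the ten samples on either side of the test block are purged.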
```python
from horizon.feature_importance import (
    mda_importance,
    sfi_importance,
    clustered_mda,
    FeatureImportance,
)
```
Mean Decrease Accuracy (permutation importance). For each CV fold, computes a baseline test score, then shuffles each feature column individually and re-scores. Importance = mean decrease in score caused by shuffling.
```python
def my_scorer(X_train, y_train, X_test, y_test):
    from collections import Counter
    majority = Counter(y_train).most_common(1)[0][0]
    return sum(1 for y in y_test if y == majority) / len(y_test)

results = mda_importance(
    score_fn=my_scorer,
    X=feature_matrix,
    y=labels,
    feature_names=["momentum", "vol", "spread", "imbalance"],
    n_splits=5,
    purge_gap=10,
    seed=42,
)

for fi in results:
    print(f"{fi.feature}: {fi.importance:.4f} +/- {fi.std:.4f}")
```
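For reference, the permutation loop behind MDA can be written out in a few lines. This is an illustrative sketch, not the library's implementation (it omits purging and uses plain Python lists):

```python
import random

def mda_sketch(score_fn, X, y, feature_names, splits, seed=42):
    """Mean Decrease Accuracy: per-fold baseline score minus the score
    after shuffling one test-set column at a time."""
    rng = random.Random(seed)
    drops = {name: [] for name in feature_names}
    for train_idx, test_idx in splits:
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        X_te = [X[i] for i in test_idx]
        y_te = [y[i] for i in test_idx]
        baseline = score_fn(X_tr, y_tr, X_te, y_te)
        for j, name in enumerate(feature_names):
            col = [row[j] for row in X_te]
            rng.shuffle(col)  # destroy this feature's information only
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X_te, col)]
            drops[name].append(baseline - score_fn(X_tr, y_tr, X_perm, y_te))
    return {name: sum(d) / len(d) for name, d in drops.items()}
```

A feature the scorer ignores (here, a constant column) shows zero decrease; a feature the scorer depends on shows a non-negative decrease.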
Single Feature Importance (AFML Ch. 8.6). Trains the model on each feature individually and evaluates via cross-validation. The importance of a feature is its cross-validated score when used as the sole predictor.
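Conceptually, SFI reduces to cross-validating each feature in isolation. A minimal self-contained sketch of that loop (not the library's sfi_importance, whose exact signature may differ):

```python
def sfi_sketch(score_fn, X, y, feature_names, cv_splits):
    """Single Feature Importance: cross-validated score of each feature
    used as the sole predictor. `X` is a list of rows; `cv_splits` is an
    iterable of (train_idx, test_idx) pairs."""
    results = []
    splits = list(cv_splits)  # materialize so every feature sees all folds
    for j, name in enumerate(feature_names):
        col = [[row[j]] for row in X]  # single-feature design matrix
        scores = []
        for train_idx, test_idx in splits:
            X_tr = [col[i] for i in train_idx]
            y_tr = [y[i] for i in train_idx]
            X_te = [col[i] for i in test_idx]
            y_te = [y[i] for i in test_idx]
            scores.append(score_fn(X_tr, y_tr, X_te, y_te))
        results.append((name, sum(scores) / len(scores)))
    return sorted(results, key=lambda r: r[1], reverse=True)
```

With a perfectly predictive column and a constant column, the predictive one scores 1.0 alone while the constant one only matches the base rate.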
Clustered Feature Importance (AFML Ch. 8.7). Groups features by correlation using agglomerative clustering (distance = 1 - |correlation|), then permutes entire clusters at once.

When one feature in a correlated group is shuffled, the model can compensate by using the remaining correlated features. Shuffling the entire cluster eliminates this substitution effect, giving a more accurate picture of the group's true importance.
```python
results = clustered_mda(
    score_fn=my_scorer,
    X=feature_matrix,
    y=labels,
    feature_names=["momentum", "vol", "spread", "imbalance"],
    n_clusters=2,
    n_splits=5,
    seed=42,
)

# Each result.feature contains comma-separated names of features in the cluster
for fi in results:
    print(f"Cluster [{fi.feature}]: {fi.importance:.4f} +/- {fi.std:.4f}")
```
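The distance metric driving the clustering step is easy to compute directly. A minimal pure-Python sketch (the library presumably builds a similar matrix before applying agglomerative linkage, but this is an illustration, not its code):

```python
import math

def corr_distance_matrix(X):
    """Pairwise distance d(i, j) = 1 - |pearson_corr(col_i, col_j)|
    over the columns of X (a list of rows)."""
    cols = list(zip(*X))
    n = len(cols)

    def corr(a, b):
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        va = sum((x - ma) ** 2 for x in a)
        vb = sum((y - mb) ** 2 for y in b)
        return cov / math.sqrt(va * vb)

    return [[1.0 - abs(corr(cols[i], cols[j])) for j in range(n)]
            for i in range(n)]
```

Note the absolute value: a perfectly anti-correlated pair has distance 0, so strongly negatively related features are clustered together too, which is exactly what the substitution-effect argument requires.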
List of FeatureImportance, one per cluster. The feature field contains comma-separated names of features in that cluster. Sorted by importance descending.
| Field | Type | Description |
| --- | --- | --- |
| feature | str | Feature name (or comma-separated cluster names for clustered_mda). |
| importance | float | Mean importance score (higher = more important). |
| std | float | Standard deviation of importance across folds. |
MDA can understate the importance of correlated features. If your feature set has groups of highly correlated predictors (e.g., multiple momentum lookbacks), use clustered_mda instead.
Monitor whether your trading edge is dying. The AlphaDecayTracker computes rolling IC (Spearman rank correlation between predictions and outcomes), estimates half-life via AR(1) fit, and detects negative trend via linear regression on the IC series.
```python
from horizon.alpha_decay import AlphaDecayTracker, alpha_decay_pipeline, AlphaDecayReport
```
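To make the mechanics concrete, here is a self-contained sketch of the two core computations: Spearman rank IC and an AR(1)-based half-life. This illustrates the math only; it is not the AlphaDecayTracker implementation, and the AR(1) fit here is through the origin (the library's estimator may differ):

```python
import math

def spearman_ic(preds, outcomes):
    """Spearman rank correlation between predictions and realized outcomes
    (assumes no ties, for simplicity)."""
    def ranks(xs):
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        r = [0.0] * len(xs)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r

    rp, ro = ranks(preds), ranks(outcomes)
    n = len(preds)
    mp, mo = sum(rp) / n, sum(ro) / n
    cov = sum((a - mp) * (b - mo) for a, b in zip(rp, ro))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_o = sum((b - mo) ** 2 for b in ro)
    return cov / math.sqrt(var_p * var_o)

def ar1_half_life(ic_series):
    """Fit IC_t = phi * IC_{t-1} by least squares through the origin and
    convert phi into a half-life: phi**h = 0.5  =>  h = -ln(2) / ln(phi)."""
    x, y = ic_series[:-1], ic_series[1:]
    phi = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    if phi <= 0 or phi >= 1:
        return float("inf")  # no measurable geometric decay
    return -math.log(2) / math.log(phi)
```

For an IC series decaying geometrically at rate 0.9 per observation, the estimated half-life is -ln 2 / ln 0.9, roughly 6.6 observations.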