Pro Feature. Requires a Pro or Ultra subscription. Get started at api.mathematicalcompany.com
Fractional Differentiation
Horizon implements fractional differentiation from Chapter 5 of Marcos Lopez de Prado’s Advances in Financial Machine Learning. All functions run in Rust for maximum performance and are exposed to Python via PyO3.
FFD (Recommended)
Fixed-width window fractional differentiation. Constant lag count per output point. Suitable for modeling.
Expanding Window
Full-memory fractional differentiation. Preserves all history but uses variable lag counts.
ADF Test
Simplified Augmented Dickey-Fuller statistic for stationarity verification.
Minimum d Search
Automatically find the smallest differentiation order that achieves stationarity.
Why Fractional Differentiation?
Integer differencing is the standard tool for making time series stationary:
- d = 0 (no differencing): preserves all memory but the series is non-stationary
- d = 1 (first difference): achieves stationarity but destroys long-range memory
The key insight from AFML Ch. 5: there exists a minimum d* (typically 0.2 to 0.6 for financial prices) that makes the series just barely stationary. Using d* instead of d = 1 preserves substantially more predictive signal for downstream ML models.
API
hz.frac_diff_weights
Compute the fractional differentiation weights for order d. These weights follow the recursion w_k = -w_(k-1) * (d - k + 1) / k, starting with w_0 = 1. Generation stops when |w_k| < threshold.
| Parameter | Type | Default | Description |
|---|---|---|---|
| d | float | required | Differentiation order (typically 0 to 1) |
| threshold | float | 1e-5 | Minimum absolute weight to include |
Returns list[float] of weights.
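The recursion above can be sketched in a few lines of pure Python. This is an illustrative reimplementation, not Horizon's Rust code: the name `frac_diff_weights_sketch` is hypothetical, and the sketch assumes that the first weight falling below the threshold is dropped rather than kept.

```python
def frac_diff_weights_sketch(d, threshold=1e-5):
    """Illustrative weight recursion; hypothetical helper, not part of Horizon."""
    weights = [1.0]  # w_0 = 1
    k = 1
    while True:
        # w_k = -w_(k-1) * (d - k + 1) / k
        w_k = -weights[-1] * (d - k + 1) / k
        if abs(w_k) < threshold:
            break  # assumption: the sub-threshold weight is excluded
        weights.append(w_k)
        k += 1
    return weights


print(frac_diff_weights_sketch(0.5)[:4])  # [1.0, -0.5, -0.125, -0.0625]
print(frac_diff_weights_sketch(1.0))      # [1.0, -1.0]
```

For d = 1 the recursion collapses to [1, -1], the ordinary first difference. For fractional d the weights decay slowly, which is where the retained memory comes from.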
hz.frac_diff_ffd
Fixed-Width Window Fractional Differentiation (FFD): the recommended method from AFML Ch. 5.4. Computes weights via frac_diff_weights(d, threshold) and applies them as a convolution over the series. Every output point uses the same number of lags, making the resulting series suitable for modeling.
| Parameter | Type | Default | Description |
|---|---|---|---|
| series | list[float] | required | Input price (or log-price) series |
| d | float | required | Differentiation order (non-negative) |
| threshold | float | 1e-5 | Weight truncation threshold (positive) |
Returns list[float] of length len(series) - len(weights) + 1. The output is shorter than the input because the first len(weights) - 1 entries lack enough history for the full weight window.
hz.frac_diff_expanding
Expanding-window (full-memory) fractional differentiation. At each point t, uses all weights from lag 0 to lag t. This preserves the full information content of the original series but produces a non-stationary weight structure.
| Parameter | Type | Description |
|---|---|---|
| series | list[float] | Input price series (non-empty) |
| d | float | Differentiation order (non-negative) |
Returns list[float] of the same length as the input.
Expanding window is O(n^2) vs O(n * w_len) for FFD. Use FFD for production and expanding window for analysis where you need full-length output.
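A minimal pure-Python sketch of the expanding-window variant shows both the full-length output and the quadratic cost. `expanding_sketch` is a hypothetical name, and the sketch assumes no weight truncation at all, since the function takes no threshold parameter.

```python
def expanding_sketch(series, d):
    """Illustrative expanding-window fractional differentiation."""
    n = len(series)
    # One weight per possible lag; no truncation (assumption).
    w = [1.0]
    for k in range(1, n):
        w.append(-w[-1] * (d - k + 1) / k)
    # Point t uses lags 0..t, so later points use more weights: O(n^2) overall.
    return [
        sum(w[k] * series[t - k] for k in range(t + 1))
        for t in range(n)
    ]


print(expanding_sketch([1.0, 3.0, 6.0, 10.0], d=1.0))  # [1.0, 2.0, 3.0, 4.0]
```

The first output is the raw level 1.0 because only lag 0 is available there, while later outputs are differences. That mix is the inconsistent lag structure the FFD method avoids.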
hz.adf_statistic
Simplified Augmented Dickey-Fuller test statistic (no augmenting lags). Fits the regression delta_y[t] = alpha + beta * y[t-1] + epsilon[t] and returns ADF stat = beta / SE(beta). More negative values indicate stronger stationarity evidence.
| Parameter | Type | Description |
|---|---|---|
| series | list[float] | Input series (at least 3 observations, all finite) |
Returns float: the ADF test statistic.
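The regression above is a simple OLS fit with an intercept, so the statistic can be sketched without any libraries. `adf_statistic_sketch` is a hypothetical name. The sketch assumes the residual variance uses n - 2 degrees of freedom, which is why it asks for at least 4 observations; Horizon's exact convention may differ.

```python
import math


def adf_statistic_sketch(series):
    """Illustrative Dickey-Fuller t-statistic with intercept and no lags."""
    if len(series) < 4:
        raise ValueError("this sketch needs at least 4 observations")
    x = series[:-1]                                    # y[t-1]
    dy = [b - a for a, b in zip(series[:-1], series[1:])]  # delta_y[t]
    n = len(dy)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    beta = sxy / sxx
    alpha = mean_dy - beta * mean_x
    ssr = sum((di - alpha - beta * xi) ** 2 for xi, di in zip(x, dy))
    # Assumption: residual variance with n - 2 degrees of freedom.
    se_beta = math.sqrt(ssr / (n - 2) / sxx)
    return beta / se_beta


print(round(adf_statistic_sketch([0.0, 1.0, 0.0, 2.0, 0.0]), 4))  # -4.4721
```

The toy series keeps snapping back to zero, so beta is strongly negative and the statistic falls well below the -2.862 threshold used elsewhere on this page.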
hz.min_frac_diff
Find the minimum differentiation order d that makes the series stationary (AFML Ch. 5.5).
Searches d from 0 to max_d in n_steps equal increments. For each d, applies frac_diff_ffd, then computes the ADF test statistic. Returns the smallest d whose ADF stat is below the 5% critical value (-2.862).
| Parameter | Type | Default | Description |
|---|---|---|---|
| series | list[float] | required | Price series (at least 10 observations) |
| p_threshold | float | 0.05 | Reserved for future p-value based stopping |
| max_d | float | 1.0 | Upper bound on d search range |
| n_steps | int | 20 | Number of grid points between 0 and max_d |
| weight_threshold | float | 1e-5 | Threshold for FFD weight truncation |
Returns (float, list[(float, float)]): the optimal d and a list of (d, ADF statistic) scan results. If no d in the range achieves stationarity, optimal_d is set to max_d.
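The grid search described above can be sketched end to end in pure Python. Everything here is illustrative: `min_frac_diff_sketch` and the underscore helpers are hypothetical names. The sketch assumes the grid includes both 0 and max_d, that the whole grid is scanned, and that d values whose window leaves fewer than 4 points are skipped. It runs on a synthetic random walk rather than real prices, with a looser weight threshold (1e-3) so that every window fits inside 1,000 points.

```python
import math
import random

ADF_CRITICAL_5PCT = -2.862  # 5% critical value quoted on this page


def _weights(d, threshold):
    w, k = [1.0], 1
    while abs(w[-1] * (d - k + 1) / k) >= threshold:
        w.append(-w[-1] * (d - k + 1) / k)
        k += 1
    return w


def _ffd(series, d, threshold):
    w = _weights(d, threshold)
    return [
        sum(w[k] * series[t - k] for k in range(len(w)))
        for t in range(len(w) - 1, len(series))
    ]


def _adf(series):
    x = series[:-1]
    dy = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(dy)
    mx, mdy = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    beta = sum((xi - mx) * (di - mdy) for xi, di in zip(x, dy)) / sxx
    alpha = mdy - beta * mx
    ssr = sum((di - alpha - beta * xi) ** 2 for xi, di in zip(x, dy))
    return beta / math.sqrt(ssr / (n - 2) / sxx)


def min_frac_diff_sketch(series, max_d=1.0, n_steps=20, weight_threshold=1e-5):
    scan, optimal_d = [], None
    for i in range(n_steps + 1):  # assumption: both endpoints are on the grid
        d = max_d * i / n_steps
        diffed = _ffd(series, d, weight_threshold)
        if len(diffed) < 4:
            continue  # assumption: skip windows that leave too few points
        stat = _adf(diffed)
        scan.append((d, stat))
        if optimal_d is None and stat < ADF_CRITICAL_5PCT:
            optimal_d = d  # smallest d that passes the threshold
    return (max_d if optimal_d is None else optimal_d), scan


rng = random.Random(0)
walk = [0.0]
for _ in range(999):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))  # synthetic random walk

d_star, scan = min_frac_diff_sketch(walk, weight_threshold=1e-3)
print("d* =", d_star, "grid points scanned:", len(scan))
```

With the default 1e-5 threshold, small d values produce windows of several hundred to a few thousand lags, so short series can leave little or no FFD output at the low end of the grid.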
Workflow
The typical workflow for fractional differentiation:
Comparing d Values
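One way to compare candidate orders is to look at how much of the original level each differenced series still tracks. The sketch below is a pure-Python illustration on a synthetic random walk, not Horizon code: `_weights`, `_ffd`, and `_corr` are hypothetical helpers, and the 1e-3 weight threshold is an arbitrary choice that keeps the windows short.

```python
import math
import random


def _weights(d, threshold):
    w, k = [1.0], 1
    while abs(w[-1] * (d - k + 1) / k) >= threshold:
        w.append(-w[-1] * (d - k + 1) / k)
        k += 1
    return w


def _ffd(series, d, threshold):
    w = _weights(d, threshold)
    return [
        sum(w[k] * series[t - k] for k in range(len(w)))
        for t in range(len(w) - 1, len(series))
    ]


def _corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


rng = random.Random(1)
prices = [100.0]
for _ in range(999):
    prices.append(prices[-1] + rng.gauss(0.0, 1.0))  # synthetic random walk

rows = []
for d in (0.0, 0.25, 0.5, 0.75, 1.0):
    diffed = _ffd(prices, d, 1e-3)
    aligned = prices[len(prices) - len(diffed):]  # FFD drops the earliest points
    rows.append((d, len(diffed), _corr(diffed, aligned)))
    print(f"d={d:.2f}  points={len(diffed)}  corr with level={rows[-1][2]:.3f}")
```

At d = 0 the output is the series itself, so the correlation is 1. As d grows, the differenced series tracks the price level less closely, which is the memory loss that choosing a small d is meant to limit.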
Using with Information-Driven Bars
Combine fractional differentiation with information-driven bars for a complete AFML pipeline:
Mathematical Background
Weight Recursion
The fractional differentiation operator of order d is defined by the binomial series:
(1 - B)^d = sum(w_k * B^k, k=0..inf)
where B is the backshift operator and the weights follow:
w_0 = 1
w_k = -w_(k-1) * (d - k + 1) / k
FFD vs Expanding Window
The expanding window method applies all weights from lag 0 to lag t at each point t. This preserves the full information content but means early and late points use different numbers of lags, making the series non-stationary in its construction.
The fixed-width window (FFD) method truncates weights below a threshold, fixing the window width. Every output point uses the same number of lags, producing a consistently constructed series. The trade-off is losing the first len(weights) - 1 observations.
FFD is preferred for production use because:
- Consistent lag structure across all output points
- Faster computation: O(n * w_len) vs O(n^2)
- The truncated weights are negligibly small
ADF Test
The Augmented Dickey-Fuller test checks the null hypothesis that a series has a unit root (is non-stationary). The test fits:
delta_y[t] = alpha + beta * y[t-1] + epsilon[t]
The ADF statistic is beta / SE(beta). More negative values provide stronger evidence against the unit root hypothesis. The 5% critical value is approximately -2.862 for series with >100 observations.
Horizon implements the simplified version without augmenting lags, which is sufficient for the min_frac_diff search where the goal is finding the stationarity threshold rather than precise p-values.