Pro Feature. Requires a Pro or Ultra subscription. Get started at api.mathematicalcompany.com
What is this? In tick data, sampling too frequently inflates volatility estimates (microstructure noise) while sampling too infrequently loses information. The volatility signature plot shows you this tradeoff and finds the optimal sampling frequency. Use it to calibrate your volatility estimates and detect when market microstructure noise is unusually high.

Volatility Signature

Horizon provides Rust-native tools for analyzing microstructure noise in prediction market tick data. The volatility signature plot reveals how realized volatility estimates vary with sampling frequency, enabling you to separate true volatility from market noise and find the optimal sampling interval.

Signature Plot

Realized volatility at multiple sampling frequencies. Visualize the noise-to-signal transition.

Two-Scale Realized Vol

Zhang-Mykland-Ait-Sahalia estimator. Bias-corrected volatility robust to microstructure noise.

Noise Variance

Estimate the variance of microstructure noise from the autocovariance of returns.

Optimal Sampling

Find the sampling frequency that minimizes total estimation error (squared bias plus variance).

hz.volatility_signature

Compute realized volatility at multiple sampling frequencies to produce a volatility signature plot. At very high frequencies, microstructure noise inflates the estimate. At very low frequencies, estimation variance increases. The signature plot reveals both effects.
import horizon as hz

result = hz.volatility_signature(
    prices=[0.50, 0.51, 0.49, 0.52, 0.50, 0.53, 0.51, 0.54, 0.52, 0.55],
    timestamps=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    max_interval=5,
    annualize=True,
)

print(f"Sampling intervals: {result.intervals}")
print(f"Realized volatilities: {result.volatilities}")
print(f"Flat region starts at interval: {result.optimal_interval}")

for interval, vol in zip(result.intervals, result.volatilities):
    print(f"  interval={interval}: RV={vol:.4f}")

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prices | list[float] | required | Tick prices |
| timestamps | list[float] | required | Tick timestamps (same length as prices) |
| max_interval | int | None | Maximum sampling interval to test. None = len(prices) / 4. |
| annualize | bool | True | Multiply by sqrt(365) |

SignaturePlot Type

| Field | Type | Description |
| --- | --- | --- |
| intervals | list[int] | Sampling intervals tested (1, 2, 3, …, max_interval) |
| volatilities | list[float] | Realized volatility at each sampling interval |
| optimal_interval | int | Interval where the signature plot flattens (noise becomes negligible) |
| noise_ratio | float | Estimated ratio of noise variance to total variance at interval=1 |
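
The rule Horizon uses to choose optimal_interval is not described on this page. As a rough illustration of what "the plot flattens" can mean, the sketch below uses a hypothetical heuristic: return the first interval whose realized volatility changes by less than a relative tolerance when moving to the next interval. The function name `first_flat_interval` and the `tol` parameter are assumptions for illustration and are not part of the Horizon API.

```python
def first_flat_interval(intervals, volatilities, tol=0.05):
    """Return the first interval after which the signature changes by < tol.

    Hypothetical heuristic for illustration; not Horizon's actual rule.
    """
    for i in range(len(volatilities) - 1):
        current, nxt = volatilities[i], volatilities[i + 1]
        if current > 0 and abs(nxt - current) / current < tol:
            return intervals[i]
    # No flat region found: fall back to the coarsest interval tested
    return intervals[-1]


# Made-up signature values: steep decline, then a plateau
intervals = [1, 2, 3, 4, 5, 6]
vols = [0.90, 0.60, 0.45, 0.41, 0.40, 0.40]
flat = first_flat_interval(intervals, vols)
print(f"Signature flattens at interval {flat}")
```

A tighter tolerance pushes the detected interval further out, which trades lower noise bias for fewer observations per estimate.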

hz.two_scale_realized_vol

The Zhang-Mykland-Ait-Sahalia (2005) Two-Scale Realized Volatility (TSRV) estimator. Combines a fast-scale (tick-by-tick) and slow-scale (subsampled) estimator to cancel out microstructure noise bias.
import horizon as hz

tsrv = hz.two_scale_realized_vol(
    prices=tick_prices,
    n_slow=5,           # Slow-scale subsampling factor
    annualize=True,
)

print(f"TSRV: {tsrv:.4f}")

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prices | list[float] | required | Tick prices (at least 10 observations) |
| n_slow | int | 5 | Slow-scale subsampling factor. Higher = more noise cancellation, less efficiency. |
| annualize | bool | True | Multiply by sqrt(365) |

Returns float: the bias-corrected realized volatility estimate.
TSRV is the recommended estimator when you suspect microstructure noise in your tick data. It converges at rate n^(-1/6) even in the presence of noise, whereas standard realized volatility is inconsistent under noise: its bias grows with the number of observations n, so adding more ticks makes the estimate worse rather than better.

hz.noise_variance_estimate

Estimate the variance of microstructure noise from the first-order autocovariance of high-frequency returns. Under standard noise models, the noise variance equals the negative of the first autocovariance.
import horizon as hz

noise_var = hz.noise_variance_estimate(prices=tick_prices)
print(f"Noise variance: {noise_var:.8f}")
print(f"Noise std dev: {noise_var**0.5:.6f}")

| Parameter | Type | Description |
| --- | --- | --- |
| prices | list[float] | Tick prices (at least 3 observations) |

Returns float: estimated noise variance. Returns 0.0 if the autocovariance is positive (indicating no noise or trending behavior).

hz.optimal_sampling_frequency

Find the sampling frequency that minimizes the total mean squared error of the realized volatility estimator. This balances the bias from microstructure noise (dominant at high frequencies) against the estimation variance (dominant at low frequencies).
import horizon as hz

result = hz.optimal_sampling_frequency(
    prices=tick_prices,
    timestamps=tick_timestamps,
)

print(f"Optimal interval: {result} ticks")
print(f"Recommended: sample every {result} observations for RV estimation")

| Parameter | Type | Description |
| --- | --- | --- |
| prices | list[float] | Tick prices |
| timestamps | list[float] | Tick timestamps |

Returns int: the optimal number of ticks between samples for realized volatility estimation.
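The optimal frequency depends on the noise-to-signal ratio. Noisier markets (e.g., illiquid prediction markets with wide spreads) require lower sampling frequencies. The formula follows Bandi-Russell (2008): n_opt ~ (noise_var^2 / integrated_quarticity)^(1/3) * T^(2/3), where n_opt is the optimal interval between samples and T is the length of the observation window. The noise variance enters squared, which keeps the expression dimensionally consistent and matches the Bandi-Russell rule, in which the optimal number of observations is proportional to (integrated_quarticity / noise_var^2)^(1/3).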
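
To see the bias-variance trade-off directly, the following sketch estimates the mean squared error of realized variance at each sampling interval by brute-force Monte Carlo. It is not how hz.optimal_sampling_frequency works internally; the simulated model (a random-walk true price plus i.i.d. Gaussian noise), the parameter values, and the helper names are all assumptions chosen for illustration.

```python
import random


def realized_variance(prices, interval=1):
    """Sum of squared price changes, sampling every `interval`-th tick."""
    sampled = prices[::interval]
    return sum((b - a) ** 2 for a, b in zip(sampled, sampled[1:]))


def mse_by_interval(n_paths, n_ticks, sigma, noise_sd, intervals, seed=0):
    """Monte Carlo mean squared error of RV at each sampling interval."""
    rng = random.Random(seed)
    sq_err = {k: 0.0 for k in intervals}
    for _ in range(n_paths):
        # True price: random walk; observed price: true price plus noise
        true_path = [0.5]
        for _ in range(n_ticks):
            true_path.append(true_path[-1] + rng.gauss(0.0, sigma))
        observed = [p + rng.gauss(0.0, noise_sd) for p in true_path]
        target = realized_variance(true_path)  # noise-free benchmark
        for k in intervals:
            sq_err[k] += (realized_variance(observed, k) - target) ** 2
    return {k: total / n_paths for k, total in sq_err.items()}


mse = mse_by_interval(
    n_paths=100, n_ticks=2000, sigma=0.001, noise_sd=0.001,
    intervals=range(1, 101),
)
best = min(mse, key=mse.get)
print(f"MSE-minimizing interval in this simulation: every {best} ticks")
```

At interval 1 the error is dominated by noise bias, and at very wide intervals by sampling variance, so the minimum falls somewhere in between. Raising `noise_sd` in the simulation moves that minimum toward wider intervals.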

Pipeline Integration

hz.vol_signature_analyzer

Creates a pipeline function that computes the volatility signature and TSRV from a feed and injects microstructure statistics into ctx.params.
import horizon as hz

def noise_aware_quoter(ctx):
    vol_sig = ctx.params.get("vol_sig")
    if vol_sig is None:
        return []

    true_vol = vol_sig["tsrv"]
    noise_ratio = vol_sig["noise_ratio"]

    # If noise dominates, widen spread to avoid adverse selection
    if noise_ratio > 0.5:
        spread = true_vol * 4
    else:
        spread = true_vol * 2

    return hz.quotes(fair=ctx.feed.price, spread=max(spread, 0.02), size=5)

hz.run(
    name="noise_aware_mm",
    markets=["election-winner"],
    feeds={"book": hz.PolymarketBook("election-winner")},
    pipeline=[
        hz.vol_signature_analyzer(
            feed="book",
            lookback=500,
            max_interval=20,
        ),
        noise_aware_quoter,
    ],
    risk=hz.Risk(max_position=100),
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| feed | str | None | Feed name to read prices from. None = first available. |
| lookback | int | 500 | Number of ticks to retain for analysis |
| max_interval | int | 20 | Maximum sampling interval for signature plot |
| n_slow | int | 5 | Slow-scale factor for TSRV |
| annualize | bool | True | Annualize volatility estimates |
| param_name | str | "vol_sig" | Key in ctx.params |

Injected Parameters

| Key | Type | Description |
| --- | --- | --- |
| ctx.params["vol_sig"]["tsrv"] | float | Two-scale realized volatility estimate |
| ctx.params["vol_sig"]["noise_variance"] | float | Estimated microstructure noise variance |
| ctx.params["vol_sig"]["noise_ratio"] | float | Noise variance / total variance at tick frequency |
| ctx.params["vol_sig"]["optimal_interval"] | int | Optimal sampling interval |
| ctx.params["vol_sig"]["rv_tick"] | float | Standard realized vol at tick frequency (noise-contaminated) |

Example: Microstructure Analysis

import horizon as hz

# Load tick data
prices = [...]       # Tick prices
timestamps = [...]   # Tick timestamps

# 1. Volatility signature plot
sig = hz.volatility_signature(prices, timestamps, max_interval=30)
print("Signature plot:")
for interval, vol in zip(sig.intervals, sig.volatilities):
    bar = "#" * int(vol * 200)
    print(f"  {interval:3d} | {vol:.4f} | {bar}")

# 2. Noise-robust volatility estimate
tsrv = hz.two_scale_realized_vol(prices, n_slow=sig.optimal_interval)
rv_naive = hz.estimate_volatility(prices, annualize=True)
print(f"\nNaive RV (tick): {rv_naive:.4f}")
print(f"TSRV:           {tsrv:.4f}")
print(f"Noise inflation: {(rv_naive / tsrv - 1) * 100:.1f}%")

# 3. Noise statistics
noise_var = hz.noise_variance_estimate(prices)
print(f"\nNoise variance: {noise_var:.8f}")
print(f"Noise std dev:  {noise_var**0.5:.6f}")

# 4. Optimal sampling
opt = hz.optimal_sampling_frequency(prices, timestamps)
print(f"Optimal sampling: every {opt} ticks")

Mathematical Background

The signature plot computes realized volatility RV(delta) = sum of squared returns at sampling interval delta. In the absence of noise, RV(delta) is approximately constant for all delta. In the presence of microstructure noise, RV(delta) is inflated at small delta (high frequency) due to the noise term 2 * n * noise_var, where n is the number of returns. The plot should flatten as delta increases past the noise-dominated region.
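
The inflation at small delta can be reproduced on simulated data. The sketch below is a pure-Python illustration, not Horizon's implementation: it assumes returns are simple price differences, simulates a random-walk true price with i.i.d. Gaussian noise, and reports realized variance (not annualized volatility) next to the expected noise term 2 * n * noise_var. The helper names and parameter values are made up for the example.

```python
import random


def realized_variance(prices, interval):
    """Sum of squared price changes, sampling every `interval`-th tick."""
    sampled = prices[::interval]
    return sum((b - a) ** 2 for a, b in zip(sampled, sampled[1:]))


def simulate_ticks(n, sigma, noise_sd, seed=0):
    """Random-walk true price plus i.i.d. Gaussian noise (illustrative only)."""
    rng = random.Random(seed)
    true_price, observed = 0.5, []
    for _ in range(n + 1):
        observed.append(true_price + rng.gauss(0.0, noise_sd))
        true_price += rng.gauss(0.0, sigma)
    return observed


n, sigma, noise_sd = 5000, 0.001, 0.002
prices = simulate_ticks(n, sigma, noise_sd)
signature = {k: realized_variance(prices, k) for k in (1, 2, 5, 10, 20)}

print(f"expected true variance: {n * sigma**2:.4f}")
for k, rv in signature.items():
    # n // k returns remain at interval k, so the noise term shrinks with k
    expected_noise = 2 * (n // k) * noise_sd**2
    print(f"interval={k:2d}  RV={rv:.4f}  expected noise term={expected_noise:.4f}")
```

As the interval grows, the number of returns falls and the noise term shrinks with it, so RV drifts down toward the true variance.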
TSRV combines two estimators:
  • Fast scale (all ticks): RV_fast = sum(r_i^2), biased upward by noise
  • Slow scale (subsampled): RV_slow = (1/K) * sum over K subgrids of subsampled RV
TSRV = RV_slow - (n_bar / n) * RV_fast, where n_bar is the average subsample size. The noise terms cancel, yielding a consistent estimator.
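
The formula above can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions, not the Rust implementation behind hz.two_scale_realized_vol: it uses simple price differences as returns, omits any small-sample adjustment, and returns a variance rather than an annualized volatility. The names `tsrv_variance` and `realized_variance` and the simulated data are hypothetical.

```python
import random


def realized_variance(prices):
    """Sum of squared price changes over consecutive observations."""
    return sum((b - a) ** 2 for a, b in zip(prices, prices[1:]))


def tsrv_variance(prices, n_slow):
    """Two-scale realized variance: RV_slow - (n_bar / n) * RV_fast."""
    n = len(prices) - 1                      # number of tick returns
    rv_fast = realized_variance(prices)      # fast scale: all ticks
    # Slow scale: average RV over the n_slow offset subgrids
    rv_slow = sum(
        realized_variance(prices[offset::n_slow]) for offset in range(n_slow)
    ) / n_slow
    n_bar = (n - n_slow + 1) / n_slow        # average returns per subgrid
    return rv_slow - (n_bar / n) * rv_fast


# Simulated data: random-walk true price plus i.i.d. noise (illustrative only)
rng = random.Random(1)
true_path = [0.5]
for _ in range(5000):
    true_path.append(true_path[-1] + rng.gauss(0.0, 0.001))
observed = [p + rng.gauss(0.0, 0.002) for p in true_path]

true_var = realized_variance(true_path)    # noise-free benchmark
naive_var = realized_variance(observed)    # inflated by noise
two_scale = tsrv_variance(observed, n_slow=20)

print(f"true={true_var:.5f}  naive RV={naive_var:.5f}  TSRV={two_scale:.5f}")
```

In small samples the difference can come out negative, so a production version would need to guard against that before taking a square root.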
Under the model: observed_price = true_price + noise, where noise is i.i.d., the first-order autocovariance of returns equals negative noise_var. This follows because noise creates negative serial correlation in returns (a positive noise shock is partially reversed in the next return). The estimator: noise_var = -Cov(r_t, r_(t+1)).
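
A minimal sketch of this estimator, assuming simple price differences as returns and clamping at zero as described for hz.noise_variance_estimate. The function name `noise_variance` and the simulated data are illustrative assumptions, not Horizon internals.

```python
import random


def noise_variance(prices):
    """Estimate noise variance as the negated lag-1 autocovariance of returns."""
    returns = [b - a for a, b in zip(prices, prices[1:])]
    if len(returns) < 2:
        raise ValueError("need at least 3 prices")
    mean = sum(returns) / len(returns)
    cov1 = sum(
        (x - mean) * (y - mean) for x, y in zip(returns, returns[1:])
    ) / (len(returns) - 1)
    # Positive autocovariance suggests trending rather than noise: report 0.0
    return max(0.0, -cov1)


# Simulated data: random-walk true price plus i.i.d. noise with sd 0.002
rng = random.Random(2)
true_price, observed = 0.5, []
for _ in range(5000):
    observed.append(true_price + rng.gauss(0.0, 0.002))
    true_price += rng.gauss(0.0, 0.001)

estimate = noise_variance(observed)
print(f"estimated noise variance: {estimate:.2e} (simulated with 4.00e-06)")
```

The estimate depends only on the lag-1 autocovariance, so it stays informative even when the true volatility is unknown.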
The volatility signature requires at least 20 tick observations to produce meaningful results. For markets with very few daily trades, consider accumulating data over multiple days before running the analysis.