Pro Feature. Requires a Pro or Ultra subscription. Get started at api.mathematicalcompany.com
What is this? Correlation only captures linear dependence. Copulas model the full dependence structure between two markets - including whether they crash together (tail dependence) or only co-move in normal times. Use copulas to understand and simulate how pairs of prediction markets behave during extreme events.

Copula Dependence

Horizon implements four bivariate copula families for modeling dependence between prediction markets. Copulas separate the marginal distributions from the dependence structure, allowing you to capture tail dependence, asymmetric co-movement, and non-linear relationships that linear correlation misses. All computation runs in Rust via PyO3.

4 Copula Families

Gaussian, Clayton, Gumbel, and Frank copulas covering symmetric, lower-tail, upper-tail, and no-tail dependence.

Automatic Selection

best_copula() fits all families and selects the one with the lowest AIC.

Sampling

Generate correlated uniform samples from any fitted copula for Monte Carlo simulation.

Pipeline Integration

hz.copula_dependence() monitors pairwise dependence in real time within hz.run().

Why Copulas?

Linear correlation captures only one dimension of dependence. In prediction markets, outcomes often exhibit:
  • Lower tail dependence: two markets crash together more often than they rally together (Clayton copula)
  • Upper tail dependence: markets resolve YES simultaneously (Gumbel copula)
  • Symmetric dependence: markets move together uniformly across quantiles (Gaussian copula)
  • No tail dependence: markets are linked but extreme co-movements are rare (Frank copula)
Copulas model all of these patterns. By fitting the right copula family, you get accurate joint probability estimates for risk management, portfolio construction, and arbitrage detection.
All copula functions operate on uniform marginals (values in [0, 1]). Use hz.rank_transform() to convert raw price series to pseudo-uniform observations before fitting.

API

CopulaFamily Enum

import horizon as hz

# Available copula families
hz.CopulaFamily.Gaussian   # symmetric, no tail dependence
hz.CopulaFamily.Clayton    # lower tail dependence (crashes)
hz.CopulaFamily.Gumbel     # upper tail dependence (resolutions)
hz.CopulaFamily.Frank      # symmetric, no tail dependence (lighter tails than Gaussian)

| Variant | Tail Dependence | Parameter Range | Best For |
|---|---|---|---|
| Gaussian | None | rho in (-1, 1) | General symmetric dependence |
| Clayton | Lower tail | theta > 0 | Crash co-movement |
| Gumbel | Upper tail | theta >= 1 | Joint resolution events |
| Frank | None | theta != 0 | Moderate symmetric dependence |

hz.rank_transform

Convert raw observation series to pseudo-uniform marginals using the empirical CDF (rank transform).
import horizon as hz

prices_a = [0.45, 0.50, 0.48, 0.55, 0.52, 0.60, 0.58]
prices_b = [0.30, 0.35, 0.32, 0.40, 0.38, 0.42, 0.41]

u, v = hz.rank_transform(prices_a, prices_b)
# u and v are lists of floats in (0, 1)

| Parameter | Type | Description |
|---|---|---|
| series_a | list[float] | First observation series |
| series_b | list[float] | Second observation series (same length) |

Returns (list[float], list[float]): pseudo-uniform marginals.
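To make the rank transform concrete, here is a minimal pure-Python sketch of the idea. It is not Horizon's implementation: the function name rank_transform_sketch, the division by n + 1, and the ordinal handling of ties are all assumptions chosen for illustration.

```python
def rank_transform_sketch(series_a, series_b):
    """Map two equal-length series to pseudo-uniform values in (0, 1).

    Hypothetical helper for illustration; hz.rank_transform may scale
    ranks or break ties differently.
    """
    if len(series_a) != len(series_b):
        raise ValueError("series must have the same length")

    def pseudo_obs(xs):
        n = len(xs)
        # Sort indices by value; rank 1 goes to the smallest observation.
        # sorted() is stable, so tied values get consecutive ranks in input order.
        order = sorted(range(n), key=lambda i: xs[i])
        ranks = [0] * n
        for rank, idx in enumerate(order, start=1):
            ranks[idx] = rank
        # Divide by n + 1 (not n) so no value is exactly 0 or 1.
        return [r / (n + 1) for r in ranks]

    return pseudo_obs(series_a), pseudo_obs(series_b)


prices_a = [0.45, 0.50, 0.48, 0.55, 0.52, 0.60, 0.58]
prices_b = [0.30, 0.35, 0.32, 0.40, 0.38, 0.42, 0.41]
u, v = rank_transform_sketch(prices_a, prices_b)
print(u)  # [0.125, 0.375, 0.25, 0.625, 0.5, 0.875, 0.75]
print(u == v)  # True: the two example series are ordered identically
```

Because only ranks survive the transform, any strictly increasing rescaling of the prices leaves u and v unchanged.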

hz.fit_copula

Fit a specific copula family to bivariate uniform data using maximum likelihood estimation.
import horizon as hz

prices_a = [0.45, 0.50, 0.48, 0.55, 0.52, 0.60, 0.58, 0.62, 0.57, 0.65]
prices_b = [0.30, 0.35, 0.32, 0.40, 0.38, 0.42, 0.41, 0.44, 0.39, 0.46]
u, v = hz.rank_transform(prices_a, prices_b)

fit = hz.fit_copula(u, v, family=hz.CopulaFamily.Clayton)
print(f"Family: {fit.family}")
print(f"Parameter (theta): {fit.parameter:.4f}")
print(f"Log-likelihood: {fit.log_likelihood:.4f}")
print(f"AIC: {fit.aic:.4f}")

| Parameter | Type | Description |
|---|---|---|
| u | list[float] | First marginal (values in (0, 1)) |
| v | list[float] | Second marginal (same length, values in (0, 1)) |
| family | CopulaFamily | Copula family to fit |

Returns a CopulaFit.
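For intuition about what a maximum-likelihood fit involves, the sketch below fits a Clayton copula in pure Python by grid search over theta. The function names, the search bounds, and the grid resolution are assumptions for illustration; Horizon's Rust optimizer is not documented here and very likely differs.

```python
import math


def clayton_log_likelihood(theta, u, v):
    """Sum of log Clayton densities over paired observations in (0, 1)."""
    total = 0.0
    for ui, vi in zip(u, v):
        s = ui ** (-theta) + vi ** (-theta) - 1.0
        # log c(u, v) = log(1 + theta) - (theta + 1) * log(u * v)
        #               - (2 + 1/theta) * log(u^-theta + v^-theta - 1)
        total += (
            math.log1p(theta)
            - (theta + 1.0) * (math.log(ui) + math.log(vi))
            - (2.0 + 1.0 / theta) * math.log(s)
        )
    return total


def fit_clayton_sketch(u, v, lo=0.01, hi=20.0, steps=2000):
    """Grid-search MLE for theta; returns (theta, log_likelihood, aic)."""
    best_theta, best_ll = lo, clayton_log_likelihood(lo, u, v)
    for k in range(1, steps + 1):
        theta = lo + (hi - lo) * k / steps
        ll = clayton_log_likelihood(theta, u, v)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    aic = -2.0 * best_ll + 2.0  # one fitted parameter, so k = 1
    return best_theta, best_ll, aic


# Hand-made pseudo-observations with strong (but not perfect) positive dependence.
u = [k / 11 for k in range(1, 11)]
v = [k / 11 for k in (2, 1, 3, 5, 4, 6, 8, 7, 10, 9)]
theta_hat, ll_hat, aic_hat = fit_clayton_sketch(u, v)
print(f"theta={theta_hat:.3f} loglik={ll_hat:.3f} aic={aic_hat:.3f}")
```

A log-likelihood above zero means the fitted copula explains the data better than independence, whose copula density is 1 everywhere.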

hz.best_copula

Fit all four copula families and return the one with the lowest AIC (best fit).
import horizon as hz

prices_a = [0.45, 0.50, 0.48, 0.55, 0.52, 0.60, 0.58, 0.62, 0.57, 0.65]
prices_b = [0.30, 0.35, 0.32, 0.40, 0.38, 0.42, 0.41, 0.44, 0.39, 0.46]
u, v = hz.rank_transform(prices_a, prices_b)

best = hz.best_copula(u, v)
print(f"Best family: {best.family}")
print(f"Parameter: {best.parameter:.4f}")
print(f"AIC: {best.aic:.4f}")

| Parameter | Type | Description |
|---|---|---|
| u | list[float] | First marginal (values in (0, 1)) |
| v | list[float] | Second marginal (same length, values in (0, 1)) |

Returns a CopulaFit for the best-fitting family.

CopulaFit Type

| Field | Type | Description |
|---|---|---|
| family | CopulaFamily | The fitted copula family |
| parameter | float | Estimated copula parameter (rho for Gaussian, theta for others) |
| log_likelihood | float | Maximized log-likelihood |
| aic | float | Akaike Information Criterion (lower is better) |
| kendall_tau | float | Implied Kendall’s tau from the fitted parameter |

hz.copula_sample

Generate correlated samples from a fitted copula.
import horizon as hz

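# Build pseudo-uniform inputs first so the snippet runs on its own
prices_a = [0.45, 0.50, 0.48, 0.55, 0.52, 0.60, 0.58, 0.62, 0.57, 0.65]
prices_b = [0.30, 0.35, 0.32, 0.40, 0.38, 0.42, 0.41, 0.44, 0.39, 0.46]
u, v = hz.rank_transform(prices_a, prices_b)

fit = hz.best_copula(u, v)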

samples = hz.copula_sample(
    family=fit.family,
    parameter=fit.parameter,
    n=10000,
    seed=42,
)
# samples is a list of (u_i, v_i) tuples in [0, 1]^2
print(f"Generated {len(samples)} correlated samples")

| Parameter | Type | Description |
|---|---|---|
| family | CopulaFamily | Copula family |
| parameter | float | Copula parameter |
| n | int | Number of samples to generate |
| seed | int | Random seed for reproducibility (optional) |

Returns list[(float, float)].
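One standard way to draw from a Clayton copula is conditional inversion: draw u uniformly, draw a second uniform w, and solve C(v | u) = w for v. The sketch below illustrates that technique in pure Python. Whether hz.copula_sample uses this method is not stated in this page, and the names clayton_sample_sketch and kendall_tau are hypothetical.

```python
import random


def clayton_sample_sketch(theta, n, seed=None):
    """Draw n pairs (u, v) from a Clayton copula with theta > 0."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # random() lies in [0, 1); flip it to (0, 1] so negative powers stay finite.
        u = 1.0 - rng.random()
        w = 1.0 - rng.random()
        # Closed-form solution of the conditional CDF equation C(v | u) = w.
        v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
        samples.append((u, v))
    return samples


def kendall_tau(pairs):
    """Plain O(n^2) Kendall's tau: (concordant - discordant) / total pairs."""
    n = len(pairs)
    score = 0
    for i in range(n):
        ui, vi = pairs[i]
        for j in range(i + 1, n):
            uj, vj = pairs[j]
            prod = (ui - uj) * (vi - vj)
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


samples = clayton_sample_sketch(theta=2.0, n=1000, seed=42)
tau_hat = kendall_tau(samples)
# For Clayton, the implied tau is theta / (theta + 2) = 0.5 when theta = 2.
print(f"empirical tau = {tau_hat:.3f}")
```

Comparing the empirical tau of a sample with the value implied by the parameter is a cheap sanity check on any sampler.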

hz.copula_cdf

Evaluate the copula CDF at a given point (u, v).
import horizon as hz

# Joint probability that both markets are below their 50th percentile
p = hz.copula_cdf(
    u=0.5,
    v=0.5,
    family=hz.CopulaFamily.Clayton,
    parameter=2.0,
)
print(f"C(0.5, 0.5) = {p:.4f}")
# Under independence this would be 0.25; Clayton with theta=2 gives a higher
# value (about 0.378), reflecting the positive dependence between the two markets

| Parameter | Type | Description |
|---|---|---|
| u | float | First marginal value in [0, 1] |
| v | float | Second marginal value in [0, 1] |
| family | CopulaFamily | Copula family |
| parameter | float | Copula parameter |

Returns float: the copula CDF value C(u, v).
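The Clayton value in the example can be reproduced by hand from the closed-form CDF. This is a standalone sketch with a hypothetical function name, not a call into Horizon.

```python
def clayton_cdf(u, v, theta):
    """Clayton copula CDF C(u, v) for theta > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0  # a copula is zero on the lower boundary
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)


p = clayton_cdf(0.5, 0.5, 2.0)
independent = 0.5 * 0.5
# (4 + 4 - 1)^(-1/2) = 1 / sqrt(7)
print(f"Clayton: {p:.4f}, independence: {independent:.4f}")  # Clayton: 0.3780, independence: 0.2500
```

The boundary behavior is also easy to check: C(1, v) should return v, because fixing one market at its maximum leaves only the other market's marginal.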

Pipeline Integration

hz.copula_dependence

Pipeline function that fits copulas between market pairs each cycle and injects results into ctx.params["copula"].
import horizon as hz

def dependence_strategy(ctx):
    copula = ctx.params.get("copula")
    if copula is None:
        return []

    mid = ctx.feed.price

    # Check tail dependence with correlated markets
    tau = copula.get("kendall_tau", 0.0)
    family = copula.get("family", "Gaussian")

    # If strong lower-tail dependence (Clayton), widen spread for crash risk
    if family == "Clayton" and tau > 0.5:
        spread = 0.06
    else:
        spread = 0.03

    return [
        hz.quote(ctx, hz.Side.Yes, hz.OrderSide.Buy, mid - spread, 10),
        hz.quote(ctx, hz.Side.Yes, hz.OrderSide.Sell, mid + spread, 10),
    ]

hz.run(
    name="copula-mm",
    markets=["0xmarket_a...", "0xmarket_b..."],
    pipeline=[
        hz.copula_dependence(),
        dependence_strategy,
    ],
    interval=5.0,
)
The ctx.params["copula"] dict contains:
| Key | Type | Description |
|---|---|---|
| family | str | Best-fit copula family name |
| parameter | float | Fitted copula parameter |
| kendall_tau | float | Implied Kendall’s tau |
| aic | float | AIC of the best-fit model |
| log_likelihood | float | Log-likelihood of the best fit |

Mathematical Background

By Sklar’s theorem, every joint distribution F(x, y) can be written as F(x, y) = C(F_X(x), F_Y(y)), where C is a copula and F_X, F_Y are the marginal CDFs. When the marginals are continuous, C is unique. The copula C carries all of the dependence information and none of the marginal information, so marginals and dependence can be modeled separately.
  • Gaussian: C(u,v) is derived from the bivariate normal distribution with correlation rho. No tail dependence for any |rho| < 1.
  • Clayton: C(u,v) = (u^(-theta) + v^(-theta) - 1)^(-1/theta). Lower tail dependence coefficient = 2^(-1/theta). Suitable for modeling crash co-movement.
  • Gumbel: C(u,v) = exp(-((-log u)^theta + (-log v)^theta)^(1/theta)). Upper tail dependence coefficient = 2 - 2^(1/theta). Suitable for modeling joint resolution events.
  • Frank: C(u,v) = -(1/theta) * log(1 + (exp(-theta*u) - 1) * (exp(-theta*v) - 1) / (exp(-theta) - 1)). No tail dependence (like the Gaussian), but with a different dependence structure in the body of the distribution. The Debye function enters only through Kendall’s tau.
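The Clayton and Gumbel tail coefficients quoted above can be checked numerically from the CDFs. The lower coefficient is the limit of C(q, q) / q as q approaches 0, and the upper coefficient is the limit of (1 - 2q + C(q, q)) / (1 - q) as q approaches 1. The sketch below evaluates both at a q close to the limit; all function names are hypothetical.

```python
import math


def clayton_cdf(u, v, theta):
    """Clayton copula CDF for theta > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)


def gumbel_cdf(u, v, theta):
    """Gumbel copula CDF for theta >= 1."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    a = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(a ** (1.0 / theta)))


def clayton_lower_tail(theta):
    return 2.0 ** (-1.0 / theta)


def gumbel_upper_tail(theta):
    return 2.0 - 2.0 ** (1.0 / theta)


theta = 2.0

# Lower tail: probability both are below q, given one is, for small q.
q = 1e-4
lower_numeric = clayton_cdf(q, q, theta) / q

# Upper tail: probability both are above q, given one is, for q near 1.
q = 0.9999
upper_numeric = (1.0 - 2.0 * q + gumbel_cdf(q, q, theta)) / (1.0 - q)

print(f"Clayton lambda_L: formula={clayton_lower_tail(theta):.4f} numeric={lower_numeric:.4f}")
print(f"Gumbel  lambda_U: formula={gumbel_upper_tail(theta):.4f} numeric={upper_numeric:.4f}")
```

With theta = 2 the two families give tail coefficients of roughly 0.71 (Clayton, lower) and 0.59 (Gumbel, upper), so equal parameter values do not imply equal tail strength.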
best_copula() fits all four families by maximum likelihood and selects the one with the lowest AIC (Akaike Information Criterion). AIC = -2 * log_likelihood + 2 * k, where k is the number of parameters (always 1 for these bivariate copulas). Lower AIC indicates a better balance of fit and parsimony.
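The selection rule is simple enough to sketch directly. The log-likelihood values below are made up for illustration, and select_by_aic is a hypothetical helper rather than part of Horizon.

```python
def select_by_aic(log_likelihoods, k=1):
    """Return (best_family, aic_by_family) using AIC = -2 * loglik + 2 * k."""
    aics = {family: -2.0 * ll + 2.0 * k for family, ll in log_likelihoods.items()}
    best = min(aics, key=aics.get)
    return best, aics


# Hypothetical maximized log-likelihoods, one per family.
fits = {"Gaussian": 4.10, "Clayton": 5.25, "Gumbel": 3.80, "Frank": 4.45}
best, aics = select_by_aic(fits)
print(best)             # Clayton
print(aics["Clayton"])  # -8.5
```

Because every family here has exactly one parameter, the penalty term is the same for all four, so the lowest AIC always belongs to the family with the highest log-likelihood.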
Each copula family has a known relationship between its parameter and Kendall’s tau (closed-form for all but Frank, which requires a one-dimensional integral):
  • Gaussian: tau = (2/pi) * arcsin(rho)
  • Clayton: tau = theta / (theta + 2)
  • Gumbel: tau = 1 - 1/theta
  • Frank: tau = 1 - 4/theta * (1 - D_1(theta)) where D_1 is the first Debye function
Kendall’s tau depends only on the copula, not on the marginals, so it gives a common scale for comparing dependence strength across families.
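The four parameter-to-tau conversions listed above can be written out in pure Python. This is a sketch under stated assumptions: the function names are hypothetical, and the first Debye function D_1(x) = (1/x) * integral from 0 to x of t / (e^t - 1) dt is approximated with a midpoint rule rather than a library routine.

```python
import math


def debye_1(x, steps=2000):
    """First Debye function D_1(x), approximated by the midpoint rule (x != 0)."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t / math.expm1(t)  # integrand t / (e^t - 1)
    return total * h / x


def tau_gaussian(rho):
    return (2.0 / math.pi) * math.asin(rho)


def tau_clayton(theta):
    return theta / (theta + 2.0)


def tau_gumbel(theta):
    return 1.0 - 1.0 / theta


def tau_frank(theta):
    return 1.0 - (4.0 / theta) * (1.0 - debye_1(theta))


# Parameters that all correspond to roughly the same dependence strength.
print(f"Gaussian rho=0.7071 -> tau={tau_gaussian(math.sqrt(0.5)):.3f}")
print(f"Clayton theta=2     -> tau={tau_clayton(2.0):.3f}")
print(f"Gumbel theta=2      -> tau={tau_gumbel(2.0):.3f}")
print(f"Frank theta=5.736   -> tau={tau_frank(5.736):.3f}")
```

All four lines print a tau of about 0.5, which shows why raw parameters are not comparable across families while tau is.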