# Copula Dependence

> Bivariate copula fitting, selection, and sampling for modeling prediction market dependence structures.

<Note>
  **Pro Feature.** Requires a Pro or Ultra subscription. [Get started at api.mathematicalcompany.com](https://api.mathematicalcompany.com)
</Note>

<Tip>
  **What is this?** Correlation only captures linear dependence. Copulas model the full dependence structure between two markets, including whether they crash together (tail dependence) or only co-move in normal times. Use copulas to understand and simulate how pairs of prediction markets behave during extreme events.
</Tip>

Horizon implements four bivariate copula families for modeling dependence between prediction markets. Copulas separate the marginal distributions from the dependence structure, allowing you to capture tail dependence, asymmetric co-movement, and non-linear relationships that linear correlation misses. All computation runs in Rust via PyO3.

<CardGroup cols={2}>
  <Card title="4 Copula Families" icon="shapes">
    Gaussian, Clayton, Gumbel, and Frank copulas covering symmetric, lower-tail, upper-tail, and no-tail dependence.
  </Card>

  <Card title="Automatic Selection" icon="magnifying-glass">
    `best_copula()` fits all families and selects the one with the lowest AIC.
  </Card>

  <Card title="Sampling" icon="dice">
    Generate correlated uniform samples from any fitted copula for Monte Carlo simulation.
  </Card>

  <Card title="Pipeline Integration" icon="diagram-project">
    `hz.copula_dependence()` monitors pairwise dependence in real time within `hz.run()`.
  </Card>
</CardGroup>

***

## Why Copulas?

Linear correlation captures only one dimension of dependence. In prediction markets, outcomes often exhibit:

* **Lower tail dependence**: two markets crash together more often than they rally together (Clayton copula)
* **Upper tail dependence**: markets resolve YES simultaneously (Gumbel copula)
* **Symmetric dependence**: markets move together uniformly across quantiles (Gaussian copula)
* **No tail dependence**: markets are linked but extreme co-movements are rare (Frank copula)

Each family targets one of these patterns. Fitting the family that matches the data gives joint probability estimates that account for tail behavior, which feeds directly into risk management, portfolio construction, and arbitrage detection.

<Note>
  All copula functions operate on uniform marginals (values in \[0, 1]). Use `hz.rank_transform()` to convert raw price series to pseudo-uniform observations before fitting.
</Note>

***

## API

### CopulaFamily Enum

```python theme={null}
import horizon as hz

# Available copula families
hz.CopulaFamily.Gaussian   # symmetric, no tail dependence
hz.CopulaFamily.Clayton    # lower tail dependence (crashes)
hz.CopulaFamily.Gumbel     # upper tail dependence (resolutions)
hz.CopulaFamily.Frank      # symmetric, no tail dependence (lighter tails than Gaussian)
```

| Variant    | Tail Dependence | Parameter Range | Best For                      |
| ---------- | --------------- | --------------- | ----------------------------- |
| `Gaussian` | None            | rho in (-1, 1)  | General symmetric dependence  |
| `Clayton`  | Lower tail      | theta > 0       | Crash co-movement             |
| `Gumbel`   | Upper tail      | theta >= 1      | Joint resolution events       |
| `Frank`    | None            | theta != 0      | Moderate symmetric dependence |

### hz.rank\_transform

Convert raw observation series to pseudo-uniform marginals using the empirical CDF (rank transform).

```python theme={null}
import horizon as hz

prices_a = [0.45, 0.50, 0.48, 0.55, 0.52, 0.60, 0.58]
prices_b = [0.30, 0.35, 0.32, 0.40, 0.38, 0.42, 0.41]

u, v = hz.rank_transform(prices_a, prices_b)
# u and v are lists of floats in (0, 1)
```

| Parameter  | Type          | Description                             |
| ---------- | ------------- | --------------------------------------- |
| `series_a` | `list[float]` | First observation series                |
| `series_b` | `list[float]` | Second observation series (same length) |

Returns `(list[float], list[float])`: pseudo-uniform marginals.
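
To make the rank transform concrete, here is a minimal pure-Python sketch. It assumes pseudo-observations are computed as rank / (n + 1) and that ties are broken by position; the name `rank_transform_sketch` is hypothetical, and Horizon's Rust implementation may scale or handle ties differently.

```python
def rank_transform_sketch(series_a, series_b):
    """Map two equal-length series to pseudo-uniform marginals in (0, 1)."""
    if len(series_a) != len(series_b):
        raise ValueError("series must have the same length")

    def pseudo_obs(xs):
        n = len(xs)
        # Sort indices by value; ties are broken by position (assumption).
        order = sorted(range(n), key=lambda i: xs[i])
        ranks = [0] * n
        for rank, idx in enumerate(order, start=1):
            ranks[idx] = rank
        # Dividing by n + 1 keeps every value strictly inside (0, 1).
        return [r / (n + 1) for r in ranks]

    return pseudo_obs(series_a), pseudo_obs(series_b)


prices_a = [0.45, 0.50, 0.48, 0.55, 0.52, 0.60, 0.58]
prices_b = [0.30, 0.35, 0.32, 0.40, 0.38, 0.42, 0.41]
u, v = rank_transform_sketch(prices_a, prices_b)
print(u)  # [0.125, 0.375, 0.25, 0.625, 0.5, 0.875, 0.75]
```

Only the ordering of each series matters here, so any strictly increasing transformation of the prices yields the same pseudo-observations.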

### hz.fit\_copula

Fit a specific copula family to bivariate uniform data using maximum likelihood estimation.

```python theme={null}
import horizon as hz

prices_a = [0.45, 0.50, 0.48, 0.55, 0.52, 0.60, 0.58, 0.62, 0.57, 0.65]
prices_b = [0.30, 0.35, 0.32, 0.40, 0.38, 0.42, 0.41, 0.44, 0.39, 0.46]
u, v = hz.rank_transform(prices_a, prices_b)

fit = hz.fit_copula(u, v, family=hz.CopulaFamily.Clayton)
print(f"Family: {fit.family}")
print(f"Parameter (theta): {fit.parameter:.4f}")
print(f"Log-likelihood: {fit.log_likelihood:.4f}")
print(f"AIC: {fit.aic:.4f}")
```

| Parameter | Type           | Description                                     |
| --------- | -------------- | ----------------------------------------------- |
| `u`       | `list[float]`  | First marginal (values in (0, 1))               |
| `v`       | `list[float]`  | Second marginal (same length, values in (0, 1)) |
| `family`  | `CopulaFamily` | Copula family to fit                            |

Returns a `CopulaFit`.
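
As an illustration of what maximum likelihood fitting involves, the sketch below maximizes the Clayton log-likelihood with a coarse-to-fine grid search. The optimizer, the search bounds, and the names `clayton_log_likelihood` and `fit_clayton_sketch` are assumptions made for this example, not Horizon's internals. The inputs are the ranks of the example prices above divided by n + 1, which is also an assumed scaling.

```python
import math


def clayton_log_likelihood(theta, u, v):
    """Sum of log Clayton densities; needs theta > 0 and u, v in (0, 1)."""
    total = 0.0
    for ui, vi in zip(u, v):
        s = ui ** (-theta) + vi ** (-theta) - 1.0
        total += (
            math.log1p(theta)
            - (theta + 1.0) * (math.log(ui) + math.log(vi))
            - (2.0 + 1.0 / theta) * math.log(s)
        )
    return total


def fit_clayton_sketch(u, v, lo=0.05, hi=50.05, steps=200, rounds=4):
    """Coarse-to-fine grid search for the theta with the largest log-likelihood."""
    best_theta, best_ll = lo, clayton_log_likelihood(lo, u, v)
    for _ in range(rounds):
        width = (hi - lo) / steps
        for k in range(steps + 1):
            theta = lo + k * width
            ll = clayton_log_likelihood(theta, u, v)
            if ll > best_ll:
                best_theta, best_ll = theta, ll
        # Zoom in around the best grid point found so far.
        lo, hi = max(best_theta - width, 1e-6), best_theta + width
    aic = -2.0 * best_ll + 2.0  # one free parameter, so k = 1
    return best_theta, best_ll, aic


# Ranks of the example prices, scaled by n + 1 = 11 (assumed scaling).
ranks_a = [1, 3, 2, 5, 4, 8, 7, 9, 6, 10]
ranks_b = [1, 3, 2, 6, 4, 8, 7, 9, 5, 10]
u = [r / 11 for r in ranks_a]
v = [r / 11 for r in ranks_b]

theta, ll, aic = fit_clayton_sketch(u, v)
print(f"theta={theta:.3f}  log-likelihood={ll:.3f}  AIC={aic:.3f}")
```

With only ten observations the estimate is very noisy; the point is the shape of the computation, not the fitted value.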

### hz.best\_copula

Fit all four copula families and return the one with the lowest AIC (best fit).

```python theme={null}
import horizon as hz

prices_a = [0.45, 0.50, 0.48, 0.55, 0.52, 0.60, 0.58, 0.62, 0.57, 0.65]
prices_b = [0.30, 0.35, 0.32, 0.40, 0.38, 0.42, 0.41, 0.44, 0.39, 0.46]
u, v = hz.rank_transform(prices_a, prices_b)

best = hz.best_copula(u, v)
print(f"Best family: {best.family}")
print(f"Parameter: {best.parameter:.4f}")
print(f"AIC: {best.aic:.4f}")
```

| Parameter | Type          | Description                                     |
| --------- | ------------- | ----------------------------------------------- |
| `u`       | `list[float]` | First marginal (values in (0, 1))               |
| `v`       | `list[float]` | Second marginal (same length, values in (0, 1)) |

Returns a `CopulaFit` for the best-fitting family.
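
The selection rule itself is small enough to sketch. The log-likelihood values below are hypothetical numbers chosen for illustration, not output from a real fit, and `select_by_aic` is an illustrative helper rather than part of the Horizon API.

```python
def aic(log_likelihood, k=1):
    """Akaike Information Criterion for a model with k free parameters."""
    return -2.0 * log_likelihood + 2.0 * k


def select_by_aic(log_likelihoods):
    """Take {family name: maximized log-likelihood}, return the lowest-AIC family."""
    scores = {family: aic(ll) for family, ll in log_likelihoods.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical log-likelihoods, for illustration only.
candidate_fits = {"Gaussian": 4.1, "Clayton": 5.3, "Gumbel": 3.2, "Frank": 3.9}
best_family, aic_scores = select_by_aic(candidate_fits)
print(best_family, round(aic_scores[best_family], 2))  # Clayton -8.6
```

Because every family here has exactly one parameter, the penalty term is identical across candidates and the lowest AIC always belongs to the family with the highest log-likelihood.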

### CopulaFit Type

| Field            | Type           | Description                                                     |
| ---------------- | -------------- | --------------------------------------------------------------- |
| `family`         | `CopulaFamily` | The fitted copula family                                        |
| `parameter`      | `float`        | Estimated copula parameter (rho for Gaussian, theta for others) |
| `log_likelihood` | `float`        | Maximized log-likelihood                                        |
| `aic`            | `float`        | Akaike Information Criterion (lower is better)                  |
| `kendall_tau`    | `float`        | Implied Kendall's tau from the fitted parameter                 |

### hz.copula\_sample

Generate correlated samples from a fitted copula.

```python theme={null}
import horizon as hz

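# Pseudo-uniform marginals from raw prices (same data as the earlier examples)
prices_a = [0.45, 0.50, 0.48, 0.55, 0.52, 0.60, 0.58, 0.62, 0.57, 0.65]
prices_b = [0.30, 0.35, 0.32, 0.40, 0.38, 0.42, 0.41, 0.44, 0.39, 0.46]
u, v = hz.rank_transform(prices_a, prices_b)

fit = hz.best_copula(u, v)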

samples = hz.copula_sample(
    family=fit.family,
    parameter=fit.parameter,
    n=10000,
    seed=42,
)
# samples is a list of (u_i, v_i) tuples in [0, 1]^2
print(f"Generated {len(samples)} correlated samples")
```

| Parameter   | Type           | Description                                |
| ----------- | -------------- | ------------------------------------------ |
| `family`    | `CopulaFamily` | Copula family                              |
| `parameter` | `float`        | Copula parameter                           |
| `n`         | `int`          | Number of samples to generate              |
| `seed`      | `int`          | Random seed for reproducibility (optional) |

Returns `list[tuple[float, float]]`.
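
For intuition about how correlated uniforms can be generated, the sketch below samples from a Clayton copula by conditional inversion: draw `u`, draw an independent uniform `w`, then solve the conditional CDF C(v | u) = w for `v`. This is one standard technique, shown under the assumption that theta > 0; it is not necessarily the method Horizon uses, and `clayton_sample_sketch` is a hypothetical name.

```python
import random


def clayton_sample_sketch(theta, n, seed=None):
    """Draw n pairs (u, v) from a Clayton copula with theta > 0."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # 1 - random() lies in (0, 1], which avoids raising 0 to a negative power.
        u = 1.0 - rng.random()
        w = 1.0 - rng.random()
        # Closed-form inverse of the conditional CDF C(v | u) = w.
        inner = (w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0
        v = inner ** (-1.0 / theta)
        samples.append((u, v))
    return samples


samples = clayton_sample_sketch(theta=2.0, n=20000, seed=42)
joint_low = sum(1 for u, v in samples if u <= 0.5 and v <= 0.5) / len(samples)
print(f"Empirical P(U <= 0.5, V <= 0.5) = {joint_low:.3f}")
```

The empirical frequency should land close to the Clayton CDF value C(0.5, 0.5) = 7^(-1/2), roughly 0.378, which is a quick sanity check on any sampler.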

### hz.copula\_cdf

Evaluate the copula CDF at a given point (u, v).

```python theme={null}
import horizon as hz

# Joint probability that both markets are below their 50th percentile
p = hz.copula_cdf(
    u=0.5,
    v=0.5,
    family=hz.CopulaFamily.Clayton,
    parameter=2.0,
)
print(f"C(0.5, 0.5) = {p:.4f}")
# Under independence this would be 0.25; Clayton with theta=2 gives about 0.378,
# reflecting the positive dependence between the two markets
```

| Parameter   | Type           | Description                      |
| ----------- | -------------- | -------------------------------- |
| `u`         | `float`        | First marginal value in \[0, 1]  |
| `v`         | `float`        | Second marginal value in \[0, 1] |
| `family`    | `CopulaFamily` | Copula family                    |
| `parameter` | `float`        | Copula parameter                 |

Returns `float`: the copula CDF value C(u, v).
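
The Clayton and Gumbel CDFs have closed forms (given in the Mathematical Background section), so the value above can be checked by hand. The functions below are a standalone sketch with hypothetical names, not the Horizon implementation.

```python
import math


def clayton_cdf(u, v, theta):
    """C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta), for theta > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)


def gumbel_cdf(u, v, theta):
    """C(u, v) = exp(-((-ln u)^theta + (-ln v)^theta)^(1/theta)), for theta >= 1."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    a = (-math.log(u)) ** theta
    b = (-math.log(v)) ** theta
    return math.exp(-((a + b) ** (1.0 / theta)))


p_clayton = clayton_cdf(0.5, 0.5, theta=2.0)
p_gumbel = gumbel_cdf(0.5, 0.5, theta=2.0)
print(f"Clayton C(0.5, 0.5) = {p_clayton:.4f}")  # 0.3780
print(f"Gumbel  C(0.5, 0.5) = {p_gumbel:.4f}")   # 0.3752
```

Both values exceed the independence baseline of 0.25. Setting theta = 1 in the Gumbel formula recovers independence exactly, since C(u, v) then reduces to u * v.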

***

## Pipeline Integration

### hz.copula\_dependence

Pipeline function that fits copulas between market pairs each cycle and injects results into `ctx.params["copula"]`.

```python theme={null}
import horizon as hz

def dependence_strategy(ctx):
    copula = ctx.params.get("copula")
    if copula is None:
        return []

    mid = ctx.feed.price

    # Check tail dependence with correlated markets
    tau = copula.get("kendall_tau", 0.0)
    family = copula.get("family", "Gaussian")

    # If strong lower-tail dependence (Clayton), widen spread for crash risk
    if family == "Clayton" and tau > 0.5:
        spread = 0.06
    else:
        spread = 0.03

    return [
        hz.quote(ctx, hz.Side.Yes, hz.OrderSide.Buy, mid - spread, 10),
        hz.quote(ctx, hz.Side.Yes, hz.OrderSide.Sell, mid + spread, 10),
    ]

hz.run(
    name="copula-mm",
    markets=["0xmarket_a...", "0xmarket_b..."],
    pipeline=[
        hz.copula_dependence(),
        dependence_strategy,
    ],
    interval=5.0,
)
```

The `ctx.params["copula"]` dict contains:

| Key              | Type    | Description                    |
| ---------------- | ------- | ------------------------------ |
| `family`         | `str`   | Best-fit copula family name    |
| `parameter`      | `float` | Fitted copula parameter        |
| `kendall_tau`    | `float` | Implied Kendall's tau          |
| `aic`            | `float` | AIC of the best-fit model      |
| `log_likelihood` | `float` | Log-likelihood of the best fit |
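
The dict does not list a tail-dependence coefficient, but one can be derived from `family` and `parameter` using the formulas in the Mathematical Background section. The helper below is a hypothetical sketch that assumes the dict has the keys shown in the table and that `family` is the plain family name as a string.

```python
def tail_dependence(copula):
    """Return (lower, upper) tail-dependence coefficients for a fitted copula dict."""
    family = copula["family"]
    theta = copula["parameter"]
    if family == "Clayton":
        # Lower tail only: lambda_L = 2^(-1/theta)
        return 2.0 ** (-1.0 / theta), 0.0
    if family == "Gumbel":
        # Upper tail only: lambda_U = 2 - 2^(1/theta)
        return 0.0, 2.0 - 2.0 ** (1.0 / theta)
    if family in ("Gaussian", "Frank"):
        # Neither family has tail dependence.
        return 0.0, 0.0
    raise ValueError(f"unknown copula family: {family!r}")


# Example dict shaped like ctx.params["copula"] (values are illustrative).
example = {"family": "Clayton", "parameter": 2.0, "kendall_tau": 0.5}
lower, upper = tail_dependence(example)
print(f"lower={lower:.4f} upper={upper:.4f}")  # lower=0.7071 upper=0.0000
```

A strategy could use the lower coefficient directly as a crash-risk signal instead of thresholding on Kendall's tau, which measures overall dependence rather than tail behavior.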

***

## Mathematical Background

<AccordionGroup>
  <Accordion title="Sklar's Theorem">
    Every joint distribution F(x, y) can be written as F(x, y) = C(F\_X(x), F\_Y(y)), where C is a copula and F\_X, F\_Y are the marginal CDFs. When the marginals are continuous, C is unique. The copula carries all dependence information independently of the marginals, so the marginals and the dependence structure can be modeled separately.
  </Accordion>

  <Accordion title="Copula Families">
    **Gaussian**: C(u,v) is derived from the bivariate normal distribution with correlation rho. No tail dependence for any rho in (-1, 1), regardless of correlation strength.

    **Clayton**: C(u,v) = (u^(-theta) + v^(-theta) - 1)^(-1/theta). Lower tail dependence coefficient = 2^(-1/theta). Suitable for modeling crash co-movement.

    **Gumbel**: C(u,v) = exp(-((-log u)^theta + (-log v)^theta)^(1/theta)). Upper tail dependence coefficient = 2 - 2^(1/theta). Suitable for modeling joint resolution events.

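    **Frank**: C(u,v) = -(1/theta) \* log(1 + (exp(-theta u) - 1)(exp(-theta v) - 1) / (exp(-theta) - 1)). No tail dependence (similar to Gaussian) but with a different dependence structure in the body of the distribution. The Debye function appears in Frank's Kendall's tau, not in the CDF itself.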
  </Accordion>

  <Accordion title="Model Selection">
    `best_copula()` fits all four families by maximum likelihood and selects the one with the lowest AIC (Akaike Information Criterion). AIC = -2 \* log\_likelihood + 2 \* k, where k is the number of parameters (always 1 for these bivariate copulas). Lower AIC indicates a better balance of fit and parsimony. Because k is the same for every family, the lowest AIC here always corresponds to the highest log-likelihood.
  </Accordion>

  <Accordion title="Kendall's Tau">
    Each copula family has a closed-form relationship between its parameter and Kendall's tau:

    * Gaussian: tau = (2/pi) \* arcsin(rho)
    * Clayton: tau = theta / (theta + 2)
    * Gumbel: tau = 1 - 1/theta
    * Frank: tau = 1 - 4/theta \* (1 - D\_1(theta)) where D\_1 is the first Debye function

    Kendall's tau is a rank-based measure of dependence strength that does not depend on the marginals. Converting each family's parameter to tau puts fits from different families on a common scale.
  </Accordion>
</AccordionGroup>
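
The parameter-to-tau relationships listed under Kendall's Tau can be evaluated with the standard library alone. In this sketch the first Debye function is approximated with Simpson's rule, which is an implementation choice made for the example; the function names are hypothetical and the family is passed as a plain string.

```python
import math


def debye_1(theta, steps=2000):
    """D_1(theta) = (1/theta) * integral from 0 to theta of t / (e^t - 1) dt."""
    def integrand(t):
        # The integrand tends to 1 as t approaches 0.
        return 1.0 if t == 0.0 else t / math.expm1(t)

    h = theta / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(theta)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return (total * h / 3.0) / theta


def kendall_tau(family, parameter):
    """Kendall's tau implied by a copula parameter."""
    if family == "Gaussian":
        return (2.0 / math.pi) * math.asin(parameter)
    if family == "Clayton":
        return parameter / (parameter + 2.0)
    if family == "Gumbel":
        return 1.0 - 1.0 / parameter
    if family == "Frank":
        return 1.0 - (4.0 / parameter) * (1.0 - debye_1(parameter))
    raise ValueError(f"unknown copula family: {family!r}")


for family, parameter in [("Gaussian", 0.5), ("Clayton", 2.0),
                          ("Gumbel", 2.0), ("Frank", 5.736)]:
    print(f"{family:8s} parameter={parameter:<6} tau={kendall_tau(family, parameter):.4f}")
```

The four parameters in the loop imply similar dependence strength: tau is 1/3 for the Gaussian case and close to 0.5 for the other three, even though the raw parameter values differ widely.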
