# Lead-Lag Networks

> Detect lead-lag relationships between prediction markets using Hayashi-Yoshida correlation, cross-correlation analysis, Granger causality, and network construction.

<Note>
  **Pro Feature.** Requires a Pro or Ultra subscription. [Get started at api.mathematicalcompany.com](https://api.mathematicalcompany.com)
</Note>

<Tip>
  **What is this?** Some markets move before others. Lead-lag detection finds these relationships: for example, a Polymarket contract that consistently reacts to news 30 seconds before a Kalshi contract on the same event. Use it to build cross-market signals or identify which feeds to watch for early information.
</Tip>

Horizon provides Rust-native tools for detecting which prediction markets lead or lag others. These relationships are fundamental to cross-market arbitrage, informed order routing, and signal extraction from related markets.

<CardGroup cols={2}>
  <Card title="Hayashi-Yoshida" icon="timeline">
    Asynchronous tick-by-tick correlation that handles irregular timestamps without synchronization bias.
  </Card>

  <Card title="Cross-Correlation" icon="chart-line">
    Multi-lag cross-correlation analysis to identify optimal lead/lag offsets.
  </Card>

  <Card title="Granger Causality" icon="arrow-right">
    Statistical test for whether past values of one market predict another.
  </Card>

  <Card title="Lead-Lag Network" icon="share-nodes">
    Build a directed network of lead-lag relationships across many markets.
  </Card>
</CardGroup>

***

## hz.hayashi\_yoshida

Compute the Hayashi-Yoshida estimator for asynchronous correlation between two tick streams. Unlike standard Pearson correlation, it does not require synchronized timestamps, so it avoids the Epps effect: the downward bias in measured correlation that appears when asynchronous ticks are forced onto a common sampling grid at high frequency.

```python theme={null}
import horizon as hz

result = hz.hayashi_yoshida(
    timestamps_x=[1.0, 2.5, 3.0, 5.0, 7.0],
    prices_x=[0.50, 0.52, 0.51, 0.55, 0.54],
    timestamps_y=[1.5, 2.0, 4.0, 6.0, 7.5],
    prices_y=[0.60, 0.62, 0.61, 0.65, 0.64],
)

print(f"Correlation: {result.correlation:.4f}")
print(f"Covariance: {result.covariance:.6f}")
print(f"N overlaps: {result.n_overlaps}")
```
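
To make the estimator concrete, here is a minimal pure-Python sketch of the overlap rule described above. It is an illustration only, not Horizon's Rust implementation: the function name `hayashi_yoshida_sketch`, the strict-inequality overlap test, and the normalization by realized variances are assumptions made for this example.

```python
import math


def hayashi_yoshida_sketch(ts_x, px_x, ts_y, px_y):
    """Return (covariance, correlation, n_overlaps) for two tick streams.

    Hypothetical reference sketch; the real hz.hayashi_yoshida may differ.
    """
    # Each consecutive pair of ticks defines a return interval:
    # (start time, end time, price change over the interval).
    ivs_x = [(ts_x[i], ts_x[i + 1], px_x[i + 1] - px_x[i])
             for i in range(len(ts_x) - 1)]
    ivs_y = [(ts_y[j], ts_y[j + 1], px_y[j + 1] - px_y[j])
             for j in range(len(ts_y) - 1)]

    cov = 0.0
    n_overlaps = 0
    for sx, ex, dx in ivs_x:
        for sy, ey, dy in ivs_y:
            # Two intervals overlap when each starts before the other ends.
            # Intervals that only touch at an endpoint do not count.
            if sx < ey and sy < ex:
                cov += dx * dy
                n_overlaps += 1

    # Assumption: normalize by the realized variance of each stream.
    var_x = sum(dx * dx for _, _, dx in ivs_x)
    var_y = sum(dy * dy for _, _, dy in ivs_y)
    denom = math.sqrt(var_x * var_y)
    corr = cov / denom if denom > 0 else 0.0
    return cov, corr, n_overlaps
```

On the tick data from the example above, this sketch finds 7 overlapping interval pairs and a covariance of about 0.0012. The double loop is quadratic in the number of ticks; it is written for readability, not speed.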

| Parameter      | Type          | Description                  |
| -------------- | ------------- | ---------------------------- |
| `timestamps_x` | `list[float]` | Tick timestamps for market X |
| `prices_x`     | `list[float]` | Tick prices for market X     |
| `timestamps_y` | `list[float]` | Tick timestamps for market Y |
| `prices_y`     | `list[float]` | Tick prices for market Y     |

### LeadLagResult Type

| Field         | Type    | Description                                    |
| ------------- | ------- | ---------------------------------------------- |
| `correlation` | `float` | Hayashi-Yoshida correlation estimate (-1 to 1) |
| `covariance`  | `float` | Hayashi-Yoshida covariance estimate            |
| `n_overlaps`  | `int`   | Number of overlapping return intervals used    |

***

## hz.cross\_correlation\_lags

Compute cross-correlation between two synchronized series at multiple lags to identify the optimal lead-lag offset.

```python theme={null}
import horizon as hz

result = hz.cross_correlation_lags(
    series_x=returns_a,
    series_y=returns_b,
    max_lag=10,
)

print(f"Best lag: {result.best_lag}")
print(f"Best correlation: {result.best_correlation:.4f}")
for lag, corr in zip(result.lags, result.correlations):
    print(f"  lag={lag:+3d}: corr={corr:.4f}")
```
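
For intuition about what the lag scan computes, the following pure-Python sketch shifts one series against the other and takes the Pearson correlation of the overlapping part. The names `pearson` and `cross_correlation_sketch` are hypothetical, and the sign convention (a positive lag pairs `x[t]` with `y[t + lag]`, so x leads y) is chosen to match this page rather than taken from the library source.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0


def cross_correlation_sketch(x, y, max_lag=10):
    """Return (lags, correlations, best_lag, best_correlation)."""
    n = len(x)
    if len(y) != n:
        raise ValueError("series must have the same length")
    if max_lag > n - 2:
        raise ValueError("max_lag leaves fewer than two overlapping points")

    lags = list(range(-max_lag, max_lag + 1))
    correlations = []
    for lag in lags:
        if lag >= 0:
            # Pair x[t] with y[t + lag]: x leads y by `lag` observations.
            a, b = x[:n - lag], y[lag:]
        else:
            # Pair x[t - lag] with y[t]: y leads x by `-lag` observations.
            a, b = x[-lag:], y[:n + lag]
        correlations.append(pearson(a, b))

    # The best lag is the one with the largest absolute correlation.
    best = max(range(len(lags)), key=lambda i: abs(correlations[i]))
    return lags, correlations, lags[best], correlations[best]
```

Larger lags leave fewer overlapping observations, so their correlations are noisier; keeping `max_lag` small relative to the series length limits that effect.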

| Parameter  | Type          | Default  | Description                                         |
| ---------- | ------------- | -------- | --------------------------------------------------- |
| `series_x` | `list[float]` | required | First time series (returns or prices)               |
| `series_y` | `list[float]` | required | Second time series (same length as series\_x)       |
| `max_lag`  | `int`         | `10`     | Maximum lag to compute (both positive and negative) |

### Cross-Correlation Result Type

| Field              | Type          | Description                               |
| ------------------ | ------------- | ----------------------------------------- |
| `lags`             | `list[int]`   | Lag values from -max\_lag to +max\_lag    |
| `correlations`     | `list[float]` | Correlation at each lag                   |
| `best_lag`         | `int`         | Lag with the highest absolute correlation |
| `best_correlation` | `float`       | Correlation value at the best lag         |

<Note>
  A positive `best_lag` means series\_x leads series\_y by that many observations. A negative `best_lag` means series\_y leads series\_x.
</Note>

***

## hz.granger\_causality

Test whether past values of series X help predict series Y beyond what series Y's own past values provide. Uses an F-test on a bivariate VAR model.

```python theme={null}
import horizon as hz

result = hz.granger_causality(
    series_x=returns_a,
    series_y=returns_b,
    max_lag=5,
)

print(f"F-statistic: {result.f_statistic:.4f}")
print(f"p-value: {result.p_value:.6f}")
print(f"Optimal lag: {result.optimal_lag}")
if result.p_value < 0.05:
    print("X Granger-causes Y at 5% significance")
```
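
The F-statistic behind the test can be sketched in pure Python by fitting the restricted and unrestricted regressions and comparing their residual sums of squares. This is a simplified illustration under stated assumptions: it uses a single fixed lag order (no lag selection), solves least squares through the normal equations, and does not compute a p-value. The helper names `ols_rss` and `granger_f_sketch` are hypothetical.

```python
def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    k = len(rows[0])
    # Build the augmented system [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f_sketch(x, y, lag):
    """F-statistic for 'x does not Granger-cause y' at a fixed lag order."""
    times = range(lag, len(y))
    targets = [y[t] for t in times]
    # Restricted model: constant plus `lag` past values of y.
    restricted = [[1.0] + [y[t - i] for i in range(1, lag + 1)] for t in times]
    # Unrestricted model: the same regressors plus `lag` past values of x.
    unrestricted = [
        row + [x[t - i] for i in range(1, lag + 1)]
        for row, t in zip(restricted, times)
    ]
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - (2 * lag + 1)
    if dof <= 0:
        raise ValueError("not enough observations for this lag order")
    # `lag` restrictions are tested: the coefficients on past x are all zero.
    return ((rss_r - rss_u) / lag) / (rss_u / dof)
```

A large F-statistic means that adding past values of X cut the residual error substantially. Converting it to a p-value requires the F distribution with `lag` and `dof` degrees of freedom, which this sketch leaves out.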

| Parameter  | Type          | Default  | Description                         |
| ---------- | ------------- | -------- | ----------------------------------- |
| `series_x` | `list[float]` | required | Potential causal series             |
| `series_y` | `list[float]` | required | Potential effect series             |
| `max_lag`  | `int`         | `5`      | Maximum lag order for the VAR model |

### GrangerResult Type

| Field            | Type    | Description                                                              |
| ---------------- | ------- | ------------------------------------------------------------------------ |
| `f_statistic`    | `float` | F-test statistic for the null hypothesis that X does not Granger-cause Y |
| `p_value`        | `float` | p-value of the F-test (lower = stronger evidence against the null)       |
| `optimal_lag`    | `int`   | Lag order selected by the test                                           |
| `is_significant` | `bool`  | True if `p_value < 0.05`                                                 |

***

## hz.lead\_lag\_network

Build a directed network of lead-lag relationships across multiple markets. Computes pairwise Hayashi-Yoshida correlations at multiple lags and constructs a graph where edges point from leaders to followers.

```python theme={null}
import horizon as hz

network = hz.lead_lag_network(
    market_ids=["trump-win", "harris-win", "senate-gop", "house-gop"],
    timestamps=[ts_trump, ts_harris, ts_senate, ts_house],
    prices=[px_trump, px_harris, px_senate, px_house],
    max_lag=5,
    min_correlation=0.3,
)

print(f"Nodes: {len(network.nodes)}")
print(f"Edges: {len(network.edges)}")
for edge in network.edges:
    print(f"  {edge.source} -> {edge.target}: "
          f"lag={edge.lag}, corr={edge.correlation:.3f}")

# Identify market leaders (most outgoing edges)
for node in network.nodes:
    print(f"  {node.market_id}: out_degree={node.out_degree}, "
          f"in_degree={node.in_degree}")
```
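
The graph-construction step can be illustrated separately from the correlation step. The sketch below assumes the pairwise results are already available as a mapping from `(market_a, market_b)` to `(best_lag, correlation)`, with a positive lag meaning the first market leads. The `Edge` class, the function name, the choice to skip pairs with a best lag of zero, and the score of 0.0 for isolated nodes are all assumptions for this example, not documented library behavior.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    source: str       # leader market ID
    target: str       # follower market ID
    lag: int          # positive lag from source to target
    correlation: float


def build_network_sketch(market_ids, pairwise, min_correlation=0.3):
    """Turn pairwise (best_lag, correlation) results into a directed graph.

    Returns (edges, out_degree, in_degree, leadership_score).
    """
    edges = []
    for (a, b), (lag, corr) in pairwise.items():
        # Assumption: drop weak pairs and pairs with no clear leader (lag 0).
        if abs(corr) < min_correlation or lag == 0:
            continue
        # Edges always point from the leader to the follower.
        source, target = (a, b) if lag > 0 else (b, a)
        edges.append(Edge(source, target, abs(lag), corr))

    out_degree = {m: 0 for m in market_ids}
    in_degree = {m: 0 for m in market_ids}
    for edge in edges:
        out_degree[edge.source] += 1
        in_degree[edge.target] += 1

    leadership_score = {}
    for m in market_ids:
        total = out_degree[m] + in_degree[m]
        # Assumption: a market with no edges gets a score of 0.0.
        leadership_score[m] = out_degree[m] / total if total else 0.0
    return edges, out_degree, in_degree, leadership_score
```

A score near 1.0 marks a market that mostly leads others, while a score near 0.0 marks one that mostly follows.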

| Parameter         | Type                | Default  | Description                                    |
| ----------------- | ------------------- | -------- | ---------------------------------------------- |
| `market_ids`      | `list[str]`         | required | Market identifiers                             |
| `timestamps`      | `list[list[float]]` | required | Tick timestamps for each market                |
| `prices`          | `list[list[float]]` | required | Tick prices for each market                    |
| `max_lag`         | `int`               | `5`      | Maximum lag to test in cross-correlation       |
| `min_correlation` | `float`             | `0.3`    | Minimum absolute correlation to create an edge |

### LeadLagNetwork Type

| Field       | Type                | Description                                    |
| ----------- | ------------------- | ---------------------------------------------- |
| `nodes`     | `list[GraphNode]`   | Market nodes with degree statistics            |
| `edges`     | `list[GraphEdge]`   | Directed edges from leaders to followers       |
| `adjacency` | `list[list[float]]` | Weighted adjacency matrix (correlation values) |

### GraphNode Type

| Field              | Type    | Description                              |
| ------------------ | ------- | ---------------------------------------- |
| `market_id`        | `str`   | Market identifier                        |
| `out_degree`       | `int`   | Number of markets this one leads         |
| `in_degree`        | `int`   | Number of markets that lead this one     |
| `leadership_score` | `float` | out\_degree / (in\_degree + out\_degree) |

### GraphEdge Type

| Field         | Type    | Description                                  |
| ------------- | ------- | -------------------------------------------- |
| `source`      | `str`   | Leader market ID                             |
| `target`      | `str`   | Follower market ID                           |
| `lag`         | `int`   | Optimal lag (in ticks) from source to target |
| `correlation` | `float` | Correlation strength at the optimal lag      |

***

## Pipeline Integration

### hz.lead\_lag\_detector

Creates a pipeline function that tracks lead-lag relationships between feeds and injects leadership scores into `ctx.params`.

```python theme={null}
import horizon as hz

def follow_leader(ctx):
    ll = ctx.params.get("lead_lag")
    if ll is None:
        return []

    # If our market is a follower, use the leader's signal
    leader_price = ll.get("leader_price")
    if leader_price is not None:
        lag = ll["lag"]
        # Leader moved up -- anticipate follower moving up
        if leader_price > ctx.feed.price + 0.02:
            return hz.order(side="buy", price=ctx.feed.price, size=10)
    return []

hz.run(
    name="lead_lag_trader",
    markets=["senate-gop"],
    feeds={
        "target": hz.PolymarketBook("senate-gop-token"),
        "leader": hz.PolymarketBook("trump-win-token"),
    },
    pipeline=[
        hz.lead_lag_detector(
            feed_x="leader",
            feed_y="target",
            lookback=500,
            max_lag=10,
        ),
        follow_leader,
    ],
)
```
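
To show how `lookback`, `max_lag`, and `min_correlation` could interact, here is a standalone sketch of the rolling state such a detector might keep. It does not use or reproduce Horizon internals: the `RollingLeadLag` class, its `update` method, and the keys of the returned dict are hypothetical, and the sketch only scans lags where the first feed leads the second.

```python
import math
from collections import deque


def _corr(a, b):
    """Pearson correlation of two equal-length lists (0.0 if degenerate)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    denom = math.sqrt(sum((u - mean_a) ** 2 for u in a)
                      * sum((v - mean_b) ** 2 for v in b))
    return cov / denom if denom > 0 else 0.0


class RollingLeadLag:
    """Hypothetical rolling lead-lag tracker for two feeds."""

    def __init__(self, lookback=500, max_lag=10, min_correlation=0.3):
        # deque(maxlen=...) drops the oldest observation automatically.
        self.x = deque(maxlen=lookback)
        self.y = deque(maxlen=lookback)
        self.max_lag = max_lag
        self.min_correlation = min_correlation

    def update(self, x_value, y_value):
        """Add one observation per feed; return a dict or None."""
        self.x.append(x_value)
        self.y.append(y_value)
        x, y = list(self.x), list(self.y)

        best_lag, best_corr = None, 0.0
        for lag in range(1, self.max_lag + 1):
            if len(x) - lag < 3:
                break  # too few overlapping points to correlate
            c = _corr(x[:len(x) - lag], y[lag:])
            if abs(c) > abs(best_corr):
                best_lag, best_corr = lag, c

        # Report nothing unless the relationship clears the threshold.
        if best_lag is None or abs(best_corr) < self.min_correlation:
            return None
        return {"lag": best_lag, "correlation": best_corr}
```

Recomputing every lag on each update costs roughly `lookback * max_lag` operations per tick, which is the main reason to keep both parameters modest in a live pipeline.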

| Parameter         | Type    | Default      | Description                                         |
| ----------------- | ------- | ------------ | --------------------------------------------------- |
| `feed_x`          | `str`   | required     | Potential leader feed                               |
| `feed_y`          | `str`   | required     | Potential follower feed                             |
| `lookback`        | `int`   | `500`        | Number of observations to retain for correlation    |
| `max_lag`         | `int`   | `10`         | Maximum lag to scan                                 |
| `min_correlation` | `float` | `0.3`        | Minimum correlation to flag a lead-lag relationship |
| `param_name`      | `str`   | `"lead_lag"` | Key in ctx.params                                   |

***

## Mathematical Background

<AccordionGroup>
  <Accordion title="Hayashi-Yoshida Estimator">
    The HY estimator computes the realized covariance between two asynchronously observed processes. For each pair of return intervals that overlap in time, it accumulates the product of the two returns. This removes the need for synchronization (e.g., previous-tick interpolation), which attenuates correlation at high frequencies (the Epps effect).

    `HY_cov = sum(delta_X_i * delta_Y_j)` over all pairs `(i, j)` whose return intervals overlap
  </Accordion>

  <Accordion title="Granger Causality">
    X Granger-causes Y if past values of X improve the prediction of Y beyond what past values of Y alone provide. The test compares two VAR regressions:

    * Restricted: `Y_t = c + sum(a_i * Y_{t-i})`
    * Unrestricted: `Y_t = c + sum(a_i * Y_{t-i}) + sum(b_i * X_{t-i})`

    An F-test on the incremental explanatory power of X determines significance.
  </Accordion>

  <Accordion title="Lead-Lag in Prediction Markets">
    In prediction markets, lead-lag relationships arise because:

    1. **Correlated events**: Related markets (e.g., presidency and Senate) share underlying information
    2. **Liquidity differences**: More liquid markets incorporate information faster
    3. **Attention asymmetry**: High-profile markets attract faster traders

    Detecting these relationships enables cross-market signal extraction and early positioning.
  </Accordion>
</AccordionGroup>
