> ## Documentation Index
> Fetch the complete documentation index at: https://mathematicalcompany.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

# Graph Analysis

> Correlation graphs, minimum spanning trees, community detection, and contagion risk analysis for prediction market networks.

<Note>
  **Pro Feature.** Requires a Pro or Ultra subscription. [Get started at api.mathematicalcompany.com](https://api.mathematicalcompany.com)
</Note>

<Tip>
  **What is this?** When you trade dozens of correlated prediction markets, graph analysis reveals the hidden structure. It builds a network from correlations, finds clusters of related markets, identifies the most central 'hub' markets, and simulates how a shock in one market propagates to others. Use it to understand portfolio risk and find diversification opportunities.
</Tip>

# Graph Analysis

Horizon provides Rust-native graph algorithms for analyzing the structure of prediction market networks. Build correlation graphs, extract minimum spanning trees, detect market communities, and measure contagion risk across interconnected markets.

<CardGroup cols={2}>
  <Card title="Correlation Graph" icon="circle-nodes">
    Build a weighted graph from pairwise market correlations. Filter by threshold to reveal significant relationships.
  </Card>

  <Card title="Minimum Spanning Tree" icon="sitemap">
    Extract the MST to find the most important connections with minimal redundancy.
  </Card>

  <Card title="Community Detection" icon="object-group">
    Identify clusters of related markets using modularity-based community detection.
  </Card>

  <Card title="Contagion Risk" icon="virus">
    Measure how shocks in one market propagate through the network.
  </Card>
</CardGroup>

***

## hz.build\_correlation\_graph

Build a weighted undirected graph from pairwise correlations between market return series. Edges are created between markets whose absolute correlation exceeds the threshold.

```python theme={null}
import horizon as hz

graph = hz.build_correlation_graph(
    market_ids=["trump-win", "harris-win", "senate-gop", "house-gop", "fed-cut"],
    returns=[returns_trump, returns_harris, returns_senate, returns_house, returns_fed],
    threshold=0.2,
)

print(f"Nodes: {len(graph.nodes)}")
print(f"Edges: {len(graph.edges)}")
print(f"Density: {graph.density:.3f}")

for edge in graph.edges:
    print(f"  {edge.source} <-> {edge.target}: {edge.weight:.3f}")
```

| Parameter    | Type                | Default  | Description                                     |
| ------------ | ------------------- | -------- | ----------------------------------------------- |
| `market_ids` | `list[str]`         | required | Market identifiers (one per series)             |
| `returns`    | `list[list[float]]` | required | Return series for each market (all same length) |
| `threshold`  | `float`             | `0.2`    | Minimum absolute correlation to create an edge  |

### MarketGraph Type

| Field       | Type                | Description                                 |
| ----------- | ------------------- | ------------------------------------------- |
| `nodes`     | `list[GraphNode]`   | Market nodes with centrality statistics     |
| `edges`     | `list[GraphEdge]`   | Undirected edges with correlation weights   |
| `adjacency` | `list[list[float]]` | Full correlation matrix                     |
| `density`   | `float`             | Graph density: edges / max\_possible\_edges |

### GraphNode Type

| Field        | Type    | Description                     |
| ------------ | ------- | ------------------------------- |
| `market_id`  | `str`   | Market identifier               |
| `degree`     | `int`   | Number of connected edges       |
| `strength`   | `float` | Sum of absolute edge weights    |
| `centrality` | `float` | Betweenness centrality (0 to 1) |

### GraphEdge Type

| Field      | Type    | Description                                   |
| ---------- | ------- | --------------------------------------------- |
| `source`   | `str`   | First market ID                               |
| `target`   | `str`   | Second market ID                              |
| `weight`   | `float` | Correlation value (can be negative)           |
| `distance` | `float` | Correlation distance: sqrt(0.5 \* (1 - corr)) |

***

## hz.minimum\_spanning\_tree

Extract the minimum spanning tree (MST) from a market graph using Kruskal's algorithm on correlation distances. The MST connects all markets with the minimum total distance, revealing the backbone structure of the correlation network.

```python theme={null}
import horizon as hz

graph = hz.build_correlation_graph(
    market_ids=market_ids,
    returns=all_returns,
    threshold=0.0,  # Include all edges for MST
)

mst = hz.minimum_spanning_tree(graph)

print(f"MST edges: {len(mst.edges)}")  # Always N-1 for N nodes
for edge in mst.edges:
    print(f"  {edge.source} -- {edge.target}: "
          f"corr={edge.weight:.3f}, dist={edge.distance:.3f}")

# Identify the most central market in the MST
hub = max(mst.nodes, key=lambda n: n.degree)
print(f"Hub market: {hub.market_id} (degree={hub.degree})")
```

| Parameter | Type          | Description                                   |
| --------- | ------------- | --------------------------------------------- |
| `graph`   | `MarketGraph` | A market graph from `build_correlation_graph` |

Returns a `MarketGraph` containing only the MST edges (N-1 edges for N nodes).

<Note>
  The MST uses correlation distance (smaller = more correlated) as edge weights. Highly correlated markets are connected first. The resulting tree reveals the hierarchical structure of market relationships without cycles.
</Note>

***

## hz.detect\_communities

Detect communities (clusters) of related markets using the Louvain modularity optimization algorithm. Markets within a community are more correlated with each other than with markets in other communities.

```python theme={null}
import horizon as hz

graph = hz.build_correlation_graph(
    market_ids=market_ids,
    returns=all_returns,
    threshold=0.2,
)

communities = hz.detect_communities(graph, resolution=1.0)

for comm in communities:
    print(f"Community {comm.id}: {comm.members}")
    print(f"  Internal density: {comm.internal_density:.3f}")
    print(f"  Size: {comm.size}")
```

| Parameter    | Type          | Default  | Description                                                                                |
| ------------ | ------------- | -------- | ------------------------------------------------------------------------------------------ |
| `graph`      | `MarketGraph` | required | Market correlation graph                                                                   |
| `resolution` | `float`       | `1.0`    | Resolution parameter. Higher = more smaller communities, lower = fewer larger communities. |

### Community Type

| Field                     | Type        | Description                                       |
| ------------------------- | ----------- | ------------------------------------------------- |
| `id`                      | `int`       | Community index                                   |
| `members`                 | `list[str]` | Market IDs in this community                      |
| `size`                    | `int`       | Number of markets in the community                |
| `internal_density`        | `float`     | Fraction of possible internal edges that exist    |
| `modularity_contribution` | `float`     | This community's contribution to total modularity |

***

## hz.contagion\_risk

Measure how a shock to one market propagates through the correlation network. Simulates the impact of a large move in a source market on all connected markets using the correlation structure.

```python theme={null}
import horizon as hz

graph = hz.build_correlation_graph(
    market_ids=market_ids,
    returns=all_returns,
    threshold=0.2,
)

risk = hz.contagion_risk(
    graph=graph,
    source_market="trump-win",
    shock_size=0.10,         # 10-cent shock
    decay=0.5,               # Attenuation per hop
)

print(f"Source: {risk.source}")
print(f"Total network impact: {risk.total_impact:.4f}")
for market_id, impact in risk.impacts.items():
    print(f"  {market_id}: expected move = {impact:.4f}")
```

| Parameter       | Type          | Default  | Description                                 |
| --------------- | ------------- | -------- | ------------------------------------------- |
| `graph`         | `MarketGraph` | required | Market correlation graph                    |
| `source_market` | `str`         | required | Market ID where the shock originates        |
| `shock_size`    | `float`       | `0.10`   | Magnitude of the initial shock              |
| `decay`         | `float`       | `0.5`    | Attenuation factor per network hop (0 to 1) |

### Contagion Result Type

| Field               | Type               | Description                             |
| ------------------- | ------------------ | --------------------------------------- |
| `source`            | `str`              | Source market ID                        |
| `impacts`           | `dict[str, float]` | Expected impact on each market          |
| `total_impact`      | `float`            | Sum of all impacts across the network   |
| `max_impact_market` | `str`              | Market with the largest expected impact |
| `max_impact`        | `float`            | Largest single-market impact            |

***

## Pipeline Integration

### hz.market\_network

Creates a pipeline function that builds and updates a market correlation graph from multiple feeds. Injects network statistics into `ctx.params`.

```python theme={null}
import horizon as hz

def network_aware_quoter(ctx):
    net = ctx.params.get("network")
    if net is None:
        return []

    centrality = net.get("centrality", 0.0)
    community_size = net.get("community_size", 1)

    # Markets with high centrality and large communities
    # are more connected -- use tighter spreads
    if centrality > 0.5 and community_size > 3:
        spread = 0.03
    else:
        spread = 0.05

    return hz.quotes(fair=ctx.feed.price, spread=spread, size=5)

hz.run(
    name="network_mm",
    markets=["trump-win"],
    feeds={
        "trump": hz.PolymarketBook("trump-token"),
        "harris": hz.PolymarketBook("harris-token"),
        "senate": hz.PolymarketBook("senate-token"),
        "house": hz.PolymarketBook("house-token"),
    },
    pipeline=[
        hz.market_network(
            feeds=["trump", "harris", "senate", "house"],
            lookback=500,
            threshold=0.2,
            target_feed="trump",
        ),
        network_aware_quoter,
    ],
)
```

| Parameter     | Type        | Default     | Description                                        |
| ------------- | ----------- | ----------- | -------------------------------------------------- |
| `feeds`       | `list[str]` | required    | Feed names to include in the network               |
| `lookback`    | `int`       | `500`       | Number of observations to retain per feed          |
| `threshold`   | `float`     | `0.2`       | Minimum correlation for graph edges                |
| `target_feed` | `str`       | `None`      | Feed to compute centrality for. None = first feed. |
| `param_name`  | `str`       | `"network"` | Key in ctx.params                                  |

### Injected Parameters

| Key                                       | Type    | Description                            |
| ----------------------------------------- | ------- | -------------------------------------- |
| `ctx.params["network"]["centrality"]`     | `float` | Target market's betweenness centrality |
| `ctx.params["network"]["degree"]`         | `int`   | Target market's edge count             |
| `ctx.params["network"]["community_size"]` | `int`   | Size of the target market's community  |
| `ctx.params["network"]["density"]`        | `float` | Overall graph density                  |
| `ctx.params["network"]["n_communities"]`  | `int`   | Total number of detected communities   |

***

## Example: Full Network Analysis

```python theme={null}
import horizon as hz

# Build correlation graph from historical returns
market_ids = ["trump", "harris", "senate", "house", "fed"]
returns = [ret_trump, ret_harris, ret_senate, ret_house, ret_fed]

graph = hz.build_correlation_graph(market_ids, returns, threshold=0.15)

# Extract MST backbone
mst = hz.minimum_spanning_tree(graph)
print("MST structure:")
for e in mst.edges:
    print(f"  {e.source} -- {e.target} (corr={e.weight:.3f})")

# Detect communities
communities = hz.detect_communities(graph, resolution=1.0)
for c in communities:
    print(f"Community {c.id}: {c.members}")

# Contagion analysis
risk = hz.contagion_risk(graph, source_market="trump", shock_size=0.15)
print(f"\nContagion from trump shock:")
for mkt, impact in sorted(risk.impacts.items(), key=lambda x: -x[1]):
    print(f"  {mkt}: {impact:.4f}")
```

***

## Mathematical Background

<AccordionGroup>
  <Accordion title="Correlation Distance">
    The correlation distance d(i,j) = sqrt(0.5 \* (1 - corr(i,j))) maps correlations to a proper metric space. Perfectly correlated assets have distance 0; uncorrelated assets have distance sqrt(0.5) \~ 0.707; perfectly anti-correlated assets have distance 1. This metric satisfies the triangle inequality.
  </Accordion>

  <Accordion title="Louvain Community Detection">
    The Louvain algorithm optimizes modularity Q = sum over communities \[e\_c - (a\_c)^2], where e\_c is the fraction of edges within community c and a\_c is the fraction of edge endpoints in c. The algorithm iteratively moves nodes between communities to maximize Q, then aggregates communities and repeats.
  </Accordion>

  <Accordion title="Contagion Propagation">
    Contagion risk is modeled as a breadth-first propagation through the correlation graph. At each hop from the source, the shock is attenuated by the decay factor and scaled by the edge correlation. Markets reachable through multiple paths receive the maximum impact from any single path.
  </Accordion>
</AccordionGroup>
