# Elastic Net Selection

> Elastic net, lasso, and ridge regression for feature selection and signal construction in prediction market trading pipelines.

<Note>
  **Pro Feature.** Requires a Pro or Ultra subscription. [Get started at api.mathematicalcompany.com](https://api.mathematicalcompany.com)
</Note>

<Tip>
  **What is this?** When you have many potential trading signals, elastic net regression automatically selects the ones that actually predict returns and discards the rest. It is a regularized regression that blends lasso (which keeps only a few signals) with ridge (which shrinks the weight of every signal). Use it to build parsimonious alpha models that don't overfit.
</Tip>

Horizon provides Rust-native regularized regression (elastic net, lasso, ridge) for feature selection and signal construction. These methods identify which market signals carry predictive power and automatically shrink or eliminate noisy features, producing sparse, interpretable models suitable for real-time trading.

<CardGroup cols={2}>
  <Card title="Elastic Net" icon="scale-balanced">
    Combined L1 + L2 regularization. Balances feature selection (lasso) with coefficient stability (ridge).
  </Card>

  <Card title="Lasso (L1)" icon="scissors">
    Pure L1 regularization. Drives uninformative coefficients to exactly zero for automatic feature selection.
  </Card>

  <Card title="Ridge (L2)" icon="mountain">
    Pure L2 regularization. Shrinks all coefficients toward zero without eliminating any. Handles multicollinearity.
  </Card>

  <Card title="Cross-Validation" icon="arrows-rotate">
    Automated selection of the regularization strength `alpha` via k-fold CV. Finds the value that minimizes out-of-sample error.
  </Card>
</CardGroup>

***

## hz.elastic\_net\_fit

Fit an elastic net regression model with combined L1 and L2 penalties. The fit minimizes the objective:

`(1/2n) * ||y - X*beta||^2 + alpha * (l1_ratio * ||beta||_1 + 0.5 * (1-l1_ratio) * ||beta||_2^2)`

```python theme={null}
import horizon as hz

result = hz.elastic_net_fit(
    features=[
        [0.5, 0.3, 0.1],
        [0.6, 0.2, 0.4],
        [0.4, 0.5, 0.2],
        [0.7, 0.1, 0.3],
        [0.3, 0.4, 0.5],
    ],
    targets=[0.55, 0.60, 0.50, 0.65, 0.45],
    alpha=0.1,
    l1_ratio=0.5,
    max_iters=1000,
)

print(f"Coefficients: {result.coefficients}")
print(f"Intercept: {result.intercept:.4f}")
print(f"Non-zero features: {result.n_nonzero}")
print(f"R-squared: {result.r_squared:.4f}")
```

| Parameter   | Type                | Default  | Description                                                    |
| ----------- | ------------------- | -------- | -------------------------------------------------------------- |
| `features`  | `list[list[float]]` | required | Feature matrix (n\_samples x n\_features)                      |
| `targets`   | `list[float]`       | required | Target values (n\_samples)                                     |
| `alpha`     | `float`             | `0.1`    | Overall regularization strength. Higher = more regularization. |
| `l1_ratio`  | `float`             | `0.5`    | Mix of L1 vs L2 penalty. 1.0 = pure lasso, 0.0 = pure ridge.   |
| `max_iters` | `int`               | `1000`   | Maximum coordinate descent iterations                          |
| `tol`       | `float`             | `1e-6`   | Convergence tolerance                                          |
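
To make the objective at the top of this section concrete, the sketch below evaluates it in plain Python for a given coefficient vector. The function name `elastic_net_objective` and the unpenalized `intercept` argument are assumptions for illustration (the formula itself does not mention an intercept), and this is not Horizon's internal code.

```python
def elastic_net_objective(features, targets, coefficients, intercept, alpha, l1_ratio):
    """Evaluate (1/2n)*||y - X*beta||^2 + alpha*(l1_ratio*||beta||_1
    + 0.5*(1 - l1_ratio)*||beta||_2^2) for a linear model.

    Assumption: the intercept is not penalized.
    """
    n = len(targets)

    # Sum of squared residuals of the linear model over all rows.
    sq_err = 0.0
    for row, y in zip(features, targets):
        pred = intercept + sum(x * b for x, b in zip(row, coefficients))
        sq_err += (y - pred) ** 2

    l1 = sum(abs(b) for b in coefficients)      # ||beta||_1
    l2_sq = sum(b * b for b in coefficients)    # ||beta||_2^2
    penalty = alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2_sq)
    return sq_err / (2.0 * n) + penalty
```

Comparing this value for two candidate coefficient vectors shows the trade-off directly: a larger `alpha` makes the penalty term dominate, so smaller or sparser coefficients win even when they fit the training data slightly worse.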

### ElasticNetResult Type

| Field               | Type          | Description                                         |
| ------------------- | ------------- | --------------------------------------------------- |
| `coefficients`      | `list[float]` | Fitted coefficients (one per feature)               |
| `intercept`         | `float`       | Fitted intercept term                               |
| `n_nonzero`         | `int`         | Number of non-zero coefficients (selected features) |
| `r_squared`         | `float`       | In-sample R-squared (coefficient of determination)  |
| `mse`               | `float`       | Mean squared error on training data                 |
| `selected_features` | `list[int]`   | Indices of features with non-zero coefficients      |

***

## hz.elastic\_net\_predict

Generate predictions from a fitted elastic net model.

```python theme={null}
import horizon as hz

# Fit model
result = hz.elastic_net_fit(features_train, targets_train, alpha=0.1, l1_ratio=0.5)

# Predict on new data
predictions = hz.elastic_net_predict(
    features=features_test,
    coefficients=result.coefficients,
    intercept=result.intercept,
)

for pred, actual in zip(predictions, targets_test):
    print(f"  predicted={pred:.4f}, actual={actual:.4f}")
```

| Parameter      | Type                | Description                                              |
| -------------- | ------------------- | -------------------------------------------------------- |
| `features`     | `list[list[float]]` | Feature matrix for prediction (n\_samples x n\_features) |
| `coefficients` | `list[float]`       | Fitted coefficients from elastic\_net\_fit               |
| `intercept`    | `float`             | Fitted intercept from elastic\_net\_fit                  |

Returns `list[float]`: predicted values.
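
Because the function takes only coefficients and an intercept, the prediction is presumably the usual linear form `intercept + dot(row, coefficients)`. The sketch below reproduces that in plain Python and adds a mean-squared-error helper for held-out data. The names `linear_predict` and `mean_squared_error` and all numbers are hypothetical, and the linear form is an assumption rather than something the text above states.

```python
def linear_predict(features, coefficients, intercept):
    """Return intercept + dot(row, coefficients) for every row."""
    return [
        intercept + sum(x * b for x, b in zip(row, coefficients))
        for row in features
    ]


def mean_squared_error(predictions, actuals):
    """Average squared difference between predictions and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)


# Illustrative numbers only: a sparse model that ignores the second feature.
preds = linear_predict(
    features=[[0.5, 0.3, 0.1], [0.6, 0.2, 0.4]],
    coefficients=[0.4, 0.0, -0.2],
    intercept=0.3,
)
held_out_mse = mean_squared_error(preds, [0.50, 0.45])
print(preds, held_out_mse)
```

A zero coefficient means the corresponding column has no effect on the prediction, which is why a sparse model only needs the selected feeds at prediction time.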

***

## hz.lasso\_fit

Convenience function for pure L1 regularization (elastic net with l1\_ratio=1.0). Drives uninformative coefficients to exactly zero.

```python theme={null}
import horizon as hz

result = hz.lasso_fit(
    features=feature_matrix,
    targets=target_series,
    alpha=0.05,
)

print(f"Selected {result.n_nonzero} of {len(result.coefficients)} features")
for idx in result.selected_features:
    print(f"  Feature {idx}: coeff={result.coefficients[idx]:.4f}")
```

| Parameter   | Type                | Default  | Description                               |
| ----------- | ------------------- | -------- | ----------------------------------------- |
| `features`  | `list[list[float]]` | required | Feature matrix (n\_samples x n\_features) |
| `targets`   | `list[float]`       | required | Target values                             |
| `alpha`     | `float`             | `0.1`    | Regularization strength                   |
| `max_iters` | `int`               | `1000`   | Maximum iterations                        |
| `tol`       | `float`             | `1e-6`   | Convergence tolerance                     |

Returns an `ElasticNetResult` (same type as elastic\_net\_fit).

***

## hz.ridge\_fit

Convenience function for pure L2 regularization (elastic net with l1\_ratio=0.0). Shrinks all coefficients toward zero without eliminating any. Preferred when all features are potentially relevant and multicollinearity is present.

```python theme={null}
import horizon as hz

result = hz.ridge_fit(
    features=feature_matrix,
    targets=target_series,
    alpha=1.0,
)

print(f"Coefficients: {result.coefficients}")
print(f"R-squared: {result.r_squared:.4f}")
```

| Parameter   | Type                | Default  | Description             |
| ----------- | ------------------- | -------- | ----------------------- |
| `features`  | `list[list[float]]` | required | Feature matrix          |
| `targets`   | `list[float]`       | required | Target values           |
| `alpha`     | `float`             | `1.0`    | Regularization strength |
| `max_iters` | `int`               | `1000`   | Maximum iterations      |
| `tol`       | `float`             | `1e-6`   | Convergence tolerance   |

Returns an `ElasticNetResult`.

<Note>
  Ridge regression always keeps all features (n\_nonzero equals the total number of features). Use lasso or elastic net when you need automatic feature selection.
</Note>
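
The difference between the two penalties is easiest to see with a single standardized feature, where both problems have closed-form solutions. In the sketch below, `rho` stands for the feature's correlation with the target, `(1/n) * x^T y`. The lasso solution is a soft-threshold of `rho` and the ridge solution is a rescaling of it. The function names are hypothetical and the single-feature setting is a simplifying assumption.

```python
def soft_threshold(z, gamma):
    """sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coef(rho, alpha):
    """Single standardized feature, l1_ratio = 1: exact zero when |rho| <= alpha."""
    return soft_threshold(rho, alpha)


def ridge_coef(rho, alpha):
    """Single standardized feature, l1_ratio = 0: shrinks but never reaches zero."""
    return rho / (1.0 + alpha)


# Strong, weak, and weak-negative signals under the same penalty strength.
for rho in (0.50, 0.08, -0.03):
    print(f"rho={rho:+.2f}  lasso={lasso_coef(rho, 0.1):+.4f}  ridge={ridge_coef(rho, 0.1):+.4f}")
```

Weak signals whose correlation falls inside the threshold are dropped by lasso entirely, while ridge keeps them with a slightly reduced weight.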

***

## hz.elastic\_net\_cv

Automated regularization parameter selection using k-fold cross-validation. Tests a log-spaced grid of alpha values and returns the model fitted at the alpha with the lowest mean cross-validation MSE.

```python theme={null}
import horizon as hz

cv_result = hz.elastic_net_cv(
    features=feature_matrix,
    targets=target_series,
    l1_ratio=0.5,
    n_alphas=50,
    n_folds=5,
)

print(f"Best alpha: {cv_result.best_alpha:.6f}")
print(f"Best CV MSE: {cv_result.best_mse:.6f}")
print(f"Non-zero at best alpha: {cv_result.best_model.n_nonzero}")

# Use the best model directly
predictions = hz.elastic_net_predict(
    features=features_test,
    coefficients=cv_result.best_model.coefficients,
    intercept=cv_result.best_model.intercept,
)
```

| Parameter   | Type                | Default  | Description                                 |
| ----------- | ------------------- | -------- | ------------------------------------------- |
| `features`  | `list[list[float]]` | required | Feature matrix                              |
| `targets`   | `list[float]`       | required | Target values                               |
| `l1_ratio`  | `float`             | `0.5`    | L1/L2 mix ratio                             |
| `n_alphas`  | `int`               | `50`     | Number of alpha values to test (log-spaced) |
| `n_folds`   | `int`               | `5`      | Number of cross-validation folds            |
| `max_iters` | `int`               | `1000`   | Maximum iterations per fit                  |
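
The Mathematical Background section describes the grid as log-spaced from `alpha_max` down to `alpha_max / 1000`. The sketch below shows one way such a grid can be built. The `alpha_max` formula used here is the standard one for this objective (largest absolute feature-residual correlation at zero coefficients, divided by `l1_ratio`) and is an assumption about Horizon's behavior. Both function names are hypothetical.

```python
import math


def alpha_max(features, targets, l1_ratio):
    """Smallest alpha at which every coefficient is zero (assumes l1_ratio > 0)."""
    if l1_ratio <= 0.0:
        raise ValueError("alpha_max is only finite when l1_ratio > 0")
    n = len(targets)
    y_mean = sum(targets) / n
    largest = 0.0
    for j in range(len(features[0])):
        # (1/n) * x_j^T (y - mean(y)): gradient magnitude at beta = 0.
        corr = sum(row[j] * (y - y_mean) for row, y in zip(features, targets)) / n
        largest = max(largest, abs(corr))
    return largest / l1_ratio


def log_spaced_alphas(a_max, n_alphas=50, ratio=1e-3):
    """Geometric grid from a_max down to a_max * ratio, largest value first."""
    if n_alphas == 1:
        return [a_max]
    step = math.log(ratio) / (n_alphas - 1)
    return [a_max * math.exp(i * step) for i in range(n_alphas)]
```

Starting at `alpha_max` avoids wasting fits on values that all yield the empty model, and the geometric spacing puts more grid points at small penalties, where the fit changes fastest.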

### CV Result Type

| Field        | Type               | Description                                |
| ------------ | ------------------ | ------------------------------------------ |
| `best_alpha` | `float`            | Alpha with lowest cross-validation MSE     |
| `best_mse`   | `float`            | Cross-validation MSE at the best alpha     |
| `best_model` | `ElasticNetResult` | Full model fitted at the best alpha        |
| `alphas`     | `list[float]`      | All alpha values tested                    |
| `cv_mses`    | `list[float]`      | Mean CV MSE at each alpha                  |
| `cv_stds`    | `list[float]`      | Standard deviation of CV MSE at each alpha |
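
The `alphas`, `cv_mses`, and `cv_stds` fields make it possible to apply a selection rule other than the plain minimum. One common heuristic is the one-standard-error rule: pick the largest alpha whose mean CV MSE is within one standard error of the best. The sketch below is an optional post-processing step, not something Horizon is documented to do. It assumes `cv_stds` is the standard deviation across folds, and `one_standard_error_alpha` is a hypothetical name.

```python
import math


def one_standard_error_alpha(alphas, cv_mses, cv_stds, n_folds):
    """Largest alpha whose mean CV MSE is within one standard error of the minimum."""
    best = min(range(len(alphas)), key=lambda i: cv_mses[i])
    # Standard error of the mean across folds at the best alpha.
    threshold = cv_mses[best] + cv_stds[best] / math.sqrt(n_folds)
    candidates = [a for a, mse in zip(alphas, cv_mses) if mse <= threshold]
    return max(candidates)
```

With a result object this would be called as `one_standard_error_alpha(cv_result.alphas, cv_result.cv_mses, cv_result.cv_stds, n_folds=5)`, followed by a refit at the returned alpha. The rule trades a statistically insignificant loss in CV error for a sparser model.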

***

## Pipeline Integration

### hz.signal\_selector

Creates a pipeline function that uses elastic net to select and weight signals from multiple feeds, injecting a composite signal into `ctx.params`.

```python theme={null}
import horizon as hz

def composite_quoter(ctx):
    signal = ctx.params.get("signal")
    if signal is None:
        return []

    composite_fair = signal["fair_value"]
    confidence = signal["r_squared"]

    # Scale position size by model confidence
    size = 10 if confidence > 0.3 else 3
    spread = 0.04 if confidence > 0.3 else 0.06

    return hz.quotes(fair=composite_fair, spread=spread, size=size)

hz.run(
    name="signal_selector",
    markets=["election-winner"],
    feeds={
        "poly": hz.PolymarketBook("election-token"),
        "kalshi": hz.KalshiBook("kalshi-event-id"),
        "oracle": hz.ChainlinkFeed("0xabc..."),
    },
    pipeline=[
        hz.signal_selector(
            target_feed="poly",
            signal_feeds=["kalshi", "oracle"],
            lookback=200,
            l1_ratio=0.7,
            retrain_interval=100,
        ),
        composite_quoter,
    ],
)
```

| Parameter          | Type        | Default    | Description                                       |
| ------------------ | ----------- | ---------- | ------------------------------------------------- |
| `target_feed`      | `str`       | required   | Feed representing the target variable             |
| `signal_feeds`     | `list[str]` | required   | Feeds to use as features                          |
| `lookback`         | `int`       | `200`      | Observations to retain for training               |
| `l1_ratio`         | `float`     | `0.7`      | L1/L2 mix ratio (higher = more feature selection) |
| `retrain_interval` | `int`       | `100`      | Retrain the model every N observations            |
| `n_folds`          | `int`       | `5`        | Cross-validation folds for alpha selection        |
| `param_name`       | `str`       | `"signal"` | Key in ctx.params                                 |

### Injected Parameters

| Key                                  | Type               | Description                                |
| ------------------------------------ | ------------------ | ------------------------------------------ |
| `ctx.params["signal"]["fair_value"]` | `float`            | Model-predicted fair value                 |
| `ctx.params["signal"]["r_squared"]`  | `float`            | In-sample R-squared of the current model   |
| `ctx.params["signal"]["n_signals"]`  | `int`              | Number of non-zero (selected) signal feeds |
| `ctx.params["signal"]["weights"]`    | `dict[str, float]` | Feed name to coefficient mapping           |
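
The interaction between `lookback` and `retrain_interval` can be pictured as a bounded buffer with a retrain counter. The sketch below is a hypothetical illustration of that bookkeeping. `RollingTrainer`, `observe`, and `training_data` are invented names, and the text above does not describe how `hz.signal_selector` manages its window internally.

```python
from collections import deque


class RollingTrainer:
    """Hypothetical helper: keeps the last `lookback` observations
    and reports when a retrain is due."""

    def __init__(self, lookback=200, retrain_interval=100):
        # deque(maxlen=...) silently drops the oldest item once full.
        self.rows = deque(maxlen=lookback)
        self.targets = deque(maxlen=lookback)
        self.retrain_interval = retrain_interval
        self.seen = 0

    def observe(self, row, target):
        """Store one observation; return True on every Nth observation."""
        self.rows.append(row)
        self.targets.append(target)
        self.seen += 1
        return self.seen % self.retrain_interval == 0

    def training_data(self):
        """Snapshot of the current window as plain lists."""
        return list(self.rows), list(self.targets)
```

With the defaults, the first retrain under this scheme would happen after 100 observations on a half-full window, and later retrains would always see the most recent 200 rows.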

***

## Example: Feature Selection Workflow

```python theme={null}
import horizon as hz

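# Build feature matrix from multiple signal sources.
# Each row is one observation; each column is a signal.
# Assumes prices_target and the five signal series are equal-length lists.
features = []
targets = []

# Stop one step early so the target can look ahead to t + 1
for t in range(len(prices_target) - 1):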
    row = [
        prices_polymarket[t],
        prices_kalshi[t],
        sentiment_score[t],
        volume_ratio[t],
        spread_signal[t],
    ]
    features.append(row)
    targets.append(prices_target[t + 1])  # predict next price

# Cross-validated elastic net
cv = hz.elastic_net_cv(
    features=features,
    targets=targets,
    l1_ratio=0.7,
    n_alphas=50,
    n_folds=5,
)

print(f"Best alpha: {cv.best_alpha:.6f}")
print(f"CV MSE: {cv.best_mse:.6f}")
print(f"Selected features: {cv.best_model.selected_features}")

# Inspect which signals matter
signal_names = ["polymarket", "kalshi", "sentiment", "volume", "spread"]
for idx in cv.best_model.selected_features:
    coeff = cv.best_model.coefficients[idx]
    print(f"  {signal_names[idx]}: {coeff:+.4f}")
```

***

## Mathematical Background

<AccordionGroup>
  <Accordion title="Elastic Net Objective">
    The elastic net minimizes:

    L(beta) = (1/2n) \* ||y - X\*beta||^2 + alpha \* \[l1\_ratio \* ||beta||\_1 + 0.5 \* (1 - l1\_ratio) \* ||beta||\_2^2]

    When l1\_ratio=1, this is the lasso (L1 only). When l1\_ratio=0, this is ridge regression (L2 only). The L1 term produces sparsity (feature selection); the L2 term handles correlated features and improves numerical stability.
  </Accordion>

  <Accordion title="Coordinate Descent">
    Horizon solves the elastic net using cyclic coordinate descent. For each feature j, the update is:

    beta\_j = soft\_threshold(partial\_residual\_j, alpha \* l1\_ratio) / (1 + alpha \* (1 - l1\_ratio))

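    where soft\_threshold(z, gamma) = sign(z) \* max(|z| - gamma, 0) and partial\_residual\_j is the correlation (1/n) \* x\_j^T r\_j between feature j and the residual r\_j computed with feature j's own contribution left out. The denominator takes this form when each feature is standardized so that (1/n) \* ||x\_j||^2 = 1. The algorithm cycles through all features until the coefficients converge (within `tol`) or `max_iters` is reached.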
  </Accordion>

  <Accordion title="Cross-Validation for Alpha Selection">
    The regularization strength alpha controls the bias-variance tradeoff. K-fold CV splits the data into K folds, trains on K-1 of them, evaluates on the held-out fold, and rotates so that each fold is held out once. The alpha with the lowest average MSE across folds is selected. The alpha grid is log-spaced from alpha\_max (the smallest alpha at which all coefficients are zero) down to alpha\_max / 1000.
  </Accordion>
</AccordionGroup>
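
The coordinate descent update described above can be sketched in plain Python as follows. This is a minimal illustration, not Horizon's Rust implementation. The function name `elastic_net_cd`, the internal centering, and the stopping rule (largest coefficient change below `tol`) are assumptions. The denominator uses `(1/n) * ||x_j||^2` in place of 1, so the code also works for columns that are not standardized, and it reduces to the formula above when they are.

```python
def soft_threshold(z, gamma):
    """sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_cd(features, targets, alpha=0.1, l1_ratio=0.5, max_iters=1000, tol=1e-6):
    """Cyclic coordinate descent for the elastic net; returns (coefficients, intercept)."""
    n, p = len(features), len(features[0])

    # Center targets and columns so the intercept can be recovered afterwards.
    y_mean = sum(targets) / n
    x_means = [sum(row[j] for row in features) / n for j in range(p)]
    cols = [[row[j] - x_means[j] for row in features] for j in range(p)]
    col_sq = [sum(v * v for v in col) / n for col in cols]  # (1/n) * ||x_j||^2

    beta = [0.0] * p
    residual = [y - y_mean for y in targets]  # equals y - X*beta while beta = 0

    for _ in range(max_iters):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            # Correlation of x_j with the residual that excludes feature j.
            rho = sum(x * r for x, r in zip(cols[j], residual)) / n + col_sq[j] * beta[j]
            new_b = soft_threshold(rho, alpha * l1_ratio) / (col_sq[j] + alpha * (1.0 - l1_ratio))
            delta = new_b - beta[j]
            if delta != 0.0:
                # Keep the full residual in sync with the updated coefficient.
                residual = [r - x * delta for r, x in zip(residual, cols[j])]
                beta[j] = new_b
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break

    intercept = y_mean - sum(b * m for b, m in zip(beta, x_means))
    return beta, intercept
```

Maintaining the running residual makes each coordinate update cost O(n) rather than O(n * p), which is the main reason coordinate descent is practical for this problem.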

<Warning>
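  Standardize features (zero mean, unit variance) before fitting elastic net. The penalty acts on coefficient magnitudes, so with unstandardized inputs the amount of shrinkage depends on each feature's units: a feature measured on a small scale needs a large coefficient and is penalized more heavily than an equally informative feature measured on a large scale.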
</Warning>
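
A minimal standardization sketch in plain Python follows. The helper names `fit_standardizer` and `apply_standardizer` are hypothetical and not part of the Horizon API. The key point is that the means and standard deviations are computed on the training rows only and then reused unchanged for any later data.

```python
import math


def fit_standardizer(features):
    """Compute per-column mean and (population) standard deviation."""
    n = len(features)
    p = len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(p)]
    stds = []
    for j in range(p):
        var = sum((row[j] - means[j]) ** 2 for row in features) / n
        # Constant columns have zero spread; use 1.0 to avoid dividing by zero.
        stds.append(math.sqrt(var) or 1.0)
    return means, stds


def apply_standardizer(features, means, stds):
    """Scale rows with previously fitted statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in features]
```

After fitting on standardized inputs, the coefficients are expressed per standard deviation of each feature, so their magnitudes can be compared directly when judging which signals matter most.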
