Skip to content

Log-Marginal Likelihood Landscape module

landscape

Log-marginal likelihood landscape computation and grid search.

Handles visualization of the hyperparameter space and identification of optimal kernel configurations through grid search.

compute_lml_landscape(delta_values, log_k, sigma_range=None, length_scale_range=None, n_sigma=50, n_length=50, noise_level=None, posterior_cov=None, nbins=None, k_start=None, k_end=None, auto_sigma_factor=3.0, auto_length_fallback=0.3, validate_ranges=True, kernel_type=KernelType.RBF, kernel_params=None, pre_optimize_extra_params=True, grid_config=None, binning=None, verbose=True)

Compute log-marginal likelihood landscape in \((\sigma, \ell)\) hyperparameter space.

This visualizes how well different GP models explain the data, allowing us to:

  1. Test whether signal variance \(\sigma\) is significantly non-zero (evidence for features)
  2. Infer characteristic length scale \(\ell\) of features (sharp vs smooth)
  3. Assess parameter uncertainty and degeneracies (ridge structures)
  4. Compute Bayes factors for model comparison (signal vs noise)

Supports multiple kernel types and intelligent automatic hyperparameter range selection.

PARAMETER DESCRIPTION
delta_values

Observed \(\delta(k)\) values (single sample or posterior mean), shape (n,).

TYPE: ndarray

log_k

\(\log(k)\) values, shape (n, 1) or (n,).

TYPE: ndarray

sigma_range

(min, max) for signal std \(\sigma\) (None = auto-determine from data).

TYPE: Optional[Tuple[float, float]] DEFAULT: None

length_scale_range

(min, max) for length scale \(\ell\) (None = auto from resolution).

TYPE: Optional[Tuple[float, float]] DEFAULT: None

n_sigma

Number of \(\sigma\) grid points.

TYPE: int DEFAULT: 50

n_length

Number of length scale \(\ell\) grid points.

TYPE: int DEFAULT: 50

noise_level

Fixed diagonal noise \(\sigma_n^2\) (used if posterior_cov=None).

TYPE: Optional[float] DEFAULT: None

posterior_cov

Full posterior covariance matrix \(\Sigma_{\mathrm{post}}\) (RECOMMENDED, nbins x nbins).

TYPE: Optional[ndarray] DEFAULT: None

nbins

Number of bins for automatic length scale range (recommended).

TYPE: Optional[int] DEFAULT: None

k_start

Start of \(k\)-range in \(\mathrm{Mpc}^{-1}\) for automatic length scale range.

TYPE: Optional[float] DEFAULT: None

k_end

End of \(k\)-range in \(\mathrm{Mpc}^{-1}\) for automatic length scale range.

TYPE: Optional[float] DEFAULT: None

auto_sigma_factor

Multiplier for empirical_std when auto-determining sigma_max.

TYPE: float DEFAULT: 3.0

auto_length_fallback

Fraction of log_k range for length scale estimation fallback.

TYPE: float DEFAULT: 0.3

validate_ranges

If True, validate user-provided ranges and warn if unreasonable.

TYPE: bool DEFAULT: True

kernel_type

Type of kernel to use (default: KernelType.RBF).

TYPE: KernelType DEFAULT: RBF

kernel_params

Fixed kernel-specific parameters (not explored in landscape).

TYPE: Optional[Dict[str, Any]] DEFAULT: None

pre_optimize_extra_params

If True, pre-optimize extra kernel hyperparameters.

TYPE: bool DEFAULT: True

grid_config

Optional GridSearchConfig object (supersedes other parameters if provided).

TYPE: Optional[GridSearchConfig] DEFAULT: None

binning

Optional BinningScheme object (supersedes nbins, k_start, k_end if provided).

TYPE: Optional[BinningScheme] DEFAULT: None

verbose

Whether to print results.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
Dict

Dictionary containing: - sigma_grid: 1D array of \(\sigma\) values - length_scale_grid: 1D array of \(\ell\) values - lml_grid: 2D array of log-marginal likelihoods (n_sigma, n_length) - optimal_sigma: ML estimate of \(\sigma\) - optimal_length_scale: ML estimate of \(\ell\) - max_lml: Maximum log-marginal likelihood - null_lml: LML at \(\sigma \approx 0\) (null hypothesis) - bayes_factor: exp(max_lml - null_lml) - resolution_info: Bin resolution diagnostics - auto_selected_ranges: Dict with 'sigma_range', 'length_scale_range', 'method'

Source code in src/primefeat/gp/landscape.py
def compute_lml_landscape(
    delta_values: np.ndarray,
    log_k: np.ndarray,
    sigma_range: Optional[Tuple[float, float]] = None,
    length_scale_range: Optional[Tuple[float, float]] = None,
    n_sigma: int = 50,
    n_length: int = 50,
    noise_level: Optional[float] = None,
    posterior_cov: Optional[np.ndarray] = None,
    nbins: Optional[int] = None,
    k_start: Optional[float] = None,
    k_end: Optional[float] = None,
    auto_sigma_factor: float = 3.0,
    auto_length_fallback: float = 0.3,
    validate_ranges: bool = True,
    kernel_type: KernelType = KernelType.RBF,
    kernel_params: Optional[Dict[str, Any]] = None,
    pre_optimize_extra_params: bool = True,
    grid_config: Optional[GridSearchConfig] = None,
    binning: Optional[BinningScheme] = None,
    verbose: bool = True,
) -> Dict:
    """
    Compute log-marginal likelihood landscape in $(\\sigma, \\ell)$ hyperparameter space.

    This visualizes how well different GP models explain the data, allowing us to:

    1. Test whether signal variance $\\sigma$ is significantly non-zero (evidence for features)
    2. Infer characteristic length scale $\\ell$ of features (sharp vs smooth)
    3. Assess parameter uncertainty and degeneracies (ridge structures)
    4. Compute Bayes factors for model comparison (signal vs noise)

    Supports multiple kernel types and intelligent automatic hyperparameter range selection.

    Args:
        delta_values: Observed $\\delta(k)$ values (single sample or posterior mean), shape (n,).
        log_k: $\\log(k)$ values, shape (n, 1) or (n,).
        sigma_range: (min, max) for signal std $\\sigma$ (None = auto-determine from data).
        length_scale_range: (min, max) for length scale $\\ell$ (None = auto from resolution).
        n_sigma: Number of $\\sigma$ grid points.
        n_length: Number of length scale $\\ell$ grid points.
        noise_level: Fixed diagonal noise $\\sigma_n^2$ (used if posterior_cov=None).
        posterior_cov: Full posterior covariance matrix $\\Sigma_{\\mathrm{post}}$ (RECOMMENDED, nbins x nbins).
        nbins: Number of bins for automatic length scale range (recommended).
        k_start: Start of $k$-range in $\\mathrm{Mpc}^{-1}$ for automatic length scale range.
        k_end: End of $k$-range in $\\mathrm{Mpc}^{-1}$ for automatic length scale range.
        auto_sigma_factor: Multiplier for empirical_std when auto-determining sigma_max.
        auto_length_fallback: Fraction of log_k range for length scale estimation fallback.
        validate_ranges: If True, validate user-provided ranges and warn if unreasonable.
        kernel_type: Type of kernel to use (default: KernelType.RBF).
        kernel_params: Fixed kernel-specific parameters (not explored in landscape).
        pre_optimize_extra_params: If True, pre-optimize extra kernel hyperparameters.
        grid_config: Optional GridSearchConfig object (supersedes other parameters if provided).
        binning: Optional BinningScheme object (supersedes nbins, k_start, k_end if provided).
        verbose: Whether to print results.

    Returns:
        Dictionary containing:
            - sigma_grid: 1D array of $\\sigma$ values
            - length_scale_grid: 1D array of $\\ell$ values
            - lml_grid: 2D array of log-marginal likelihoods (n_sigma, n_length)
            - optimal_sigma: ML estimate of $\\sigma$
            - optimal_length_scale: ML estimate of $\\ell$
            - max_lml: Maximum log-marginal likelihood
            - null_lml: LML at $\\sigma \\approx 0$ (null hypothesis)
            - bayes_factor: exp(max_lml - null_lml)
            - resolution_info: Bin resolution diagnostics
            - auto_selected_ranges: Dict with 'sigma_range', 'length_scale_range', 'method'
    """
    if grid_config is not None:
        if sigma_range is None:
            sigma_range = grid_config.sigma_range
        if length_scale_range is None:
            length_scale_range = grid_config.length_scale_range
        if n_sigma == 50:
            n_sigma = grid_config.n_sigma
        if n_length == 50:
            n_length = grid_config.n_length
        if auto_sigma_factor == 3.0:
            auto_sigma_factor = grid_config.auto_sigma_factor
        if auto_length_fallback == 0.3:
            auto_length_fallback = grid_config.auto_length_fallback

    log_k = np.asarray(log_k).reshape(-1, 1)
    delta_values = np.asarray(delta_values).ravel()
    n = len(delta_values)

    empirical_std = np.std(delta_values, ddof=1)
    empirical_mean = np.abs(np.mean(delta_values))

    use_full_cov = posterior_cov is not None
    if use_full_cov:
        noise_method = "full posterior covariance (RECOMMENDED)"
        noise_scale = np.sqrt(np.mean(np.diag(posterior_cov)))
    else:
        if noise_level is None:
            noise_level = estimate_noise_level_from_data(
                delta_values, fraction_of_std=0.5
            )
            noise_method = "diagonal noise (auto, 0.5 × empirical_std)"
        else:
            noise_method = "diagonal noise (user-provided)"
        noise_scale = noise_level

    console.print("=" * 70)
    console.print("HYPERPARAMETER RANGE SELECTION")
    console.print("=" * 70)
    console.print(f"Data characteristics:")
    console.print(f"  N bins: {n}")
    console.print(f"  Empirical mean: {empirical_mean:.4f}")
    console.print(f"  Empirical std: {empirical_std:.4f}")
    console.print(f"  Noise model: {noise_method}")
    if use_full_cov:
        console.print(f"  Noise scale (√diag): {noise_scale:.4f}")
    else:
        console.print(f"  Noise level (diagonal): {noise_scale:.4f}")

    sigma_range, sigma_method = _determine_sigma_range(
        delta_values, noise_level, sigma_range, auto_sigma_factor, verbose=True
    )

    resolution_info = None
    if binning is not None or (nbins is not None and k_start is not None and k_end is not None):
        resolution_info = compute_bin_resolution(nbins, k_start, k_end, binning)

    length_scale_range, length_method = _determine_length_scale_range(
        delta_values,
        log_k,
        length_scale_range,
        resolution_info,
        auto_length_fallback,
        nbins,
        verbose=True,
    )

    auto_selected_ranges = {
        "sigma_range": sigma_range,
        "length_scale_range": length_scale_range,
        "sigma_method": sigma_method,
        "length_method": length_method,
    }

    if validate_ranges:
        console.print(f"\nValidating hyperparameter ranges...")
        is_valid = validate_hyperparameter_ranges(
            sigma_range,
            length_scale_range,
            noise_level,
            empirical_std,
            resolution_info,
            warn=True,
        )
        if is_valid:
            console.print("  ✓ All ranges pass validation checks")
        else:
            console.print("  ⚠ Some validation warnings above - review carefully")

    console.print("=" * 70 + "\n")

    if kernel_params is None:
        kernel_params = {}

    # Preserve original kernel_params before optimization
    original_kernel_params = kernel_params.copy()
    optimized_kernel_params = kernel_params

    if pre_optimize_extra_params and _kernel_has_extra_hyperparams(kernel_type):
        console.print("Pre-optimising extra hyperparameters via gradient descent...")
        sigma_init = float(np.mean(sigma_range))
        length_scale_init = float(np.mean(length_scale_range))
        noise_cov_for_opt = (
            posterior_cov if use_full_cov else noise_level**2 * np.eye(n)
        )
        optimized_kernel_params = _pre_optimize_extra_params(
            kernel_type,
            kernel_params,
            delta_values,
            log_k.ravel(),
            noise_cov_for_opt,
            sigma_init,
            length_scale_init,
            verbose=True,
        )

    sigma_grid = np.linspace(sigma_range[0], sigma_range[1], n_sigma)
    length_scale_grid = np.linspace(
        length_scale_range[0], length_scale_range[1], n_length
    )

    console.print(f"\nComputing log-marginal likelihood on {n_sigma} × {n_length} grid...")
    console.print(f"  Kernel: {kernel_type.value}")
    if optimized_kernel_params:
        console.print(f"  Kernel params: {optimized_kernel_params}")
    console.print(f"  σ range: [{sigma_range[0]:.4f}, {sigma_range[1]:.4f}]")
    console.print(f"  ℓ range: [{length_scale_range[0]:.3f}, {length_scale_range[1]:.3f}]")

    base_kernels = _precompute_kernel_matrices(
        log_k, length_scale_grid, kernel_type, optimized_kernel_params, verbose=True
    )

    if use_full_cov:
        K_noise = posterior_cov
    else:
        K_noise = noise_level**2 * np.eye(n)

    lml_grid = _compute_lml_grid(
        delta_values, sigma_grid, length_scale_grid, base_kernels, K_noise, verbose=True
    )

    opt_results = _find_optimal_hyperparams(lml_grid, sigma_grid, length_scale_grid)

    _display_landscape_results(
        opt_results, kernel_type, optimized_kernel_params, sigma_grid, resolution_info, verbose=verbose
    )

    return {
        "sigma_grid": sigma_grid,
        "length_scale_grid": length_scale_grid,
        "lml_grid": lml_grid,
        "optimal_sigma": opt_results["optimal_sigma"],
        "optimal_length_scale": opt_results["optimal_length_scale"],
        "max_lml": opt_results["max_lml"],
        "null_lml": opt_results["null_lml"],
        "bayes_factor": opt_results["bayes_factor"],
        "delta_values": delta_values,
        "log_k": log_k,
        "resolution_info": resolution_info,
        "noise_level": noise_level,
        "auto_selected_ranges": auto_selected_ranges,
        "kernel_type": kernel_type,
        "kernel_params": original_kernel_params,
        "optimized_kernel_params": optimized_kernel_params,
    }

compare_kernel_likelihoods(delta_values, log_k, kernel_configs, noise_level=None, noise_cov=None, optimize_params=False, optimize_kwargs=None)

Compare log-marginal likelihoods for different kernel configurations.

This function computes the LML for each provided kernel configuration, allowing direct comparison of how well different kernels explain the data.

PARAMETER DESCRIPTION
delta_values

Observed \(\delta(k)\) values, shape (n,).

TYPE: ndarray

log_k

\(\log(k)\) values, shape (n,) or (n, 1).

TYPE: ndarray

kernel_configs

Dictionary mapping names to KernelConfig objects.

TYPE: Dict[str, KernelConfig]

noise_level

Diagonal noise standard deviation \(\sigma_n\) (if noise_cov not provided).

TYPE: Optional[float] DEFAULT: None

noise_cov

Full posterior covariance matrix \(\Sigma_{\mathrm{post}}\) (recommended).

TYPE: Optional[ndarray] DEFAULT: None

optimize_params

If True, use gradient-based optimisation to refine each kernel's hyperparameters before comparing.

TYPE: bool DEFAULT: False

optimize_kwargs

Extra keyword arguments forwarded to optimize().

TYPE: Optional[Dict[str, Any]] DEFAULT: None

RETURNS DESCRIPTION
Dict[str, Dict]

Dictionary with: - 'results': Dict mapping kernel name to {lml, config, opt_result (if optimised)} - 'best_kernel': Name of kernel with highest LML - 'best_lml': Highest log-marginal likelihood - 'bayes_factors': Dict of Bayes factors relative to worst model

Source code in src/primefeat/gp/landscape.py
def compare_kernel_likelihoods(
    delta_values: np.ndarray,
    log_k: np.ndarray,
    kernel_configs: Dict[str, KernelConfig],
    noise_level: Optional[float] = None,
    noise_cov: Optional[np.ndarray] = None,
    optimize_params: bool = False,
    optimize_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Dict]:
    """
    Compare log-marginal likelihoods for different kernel configurations.

    This function computes the LML for each provided kernel configuration,
    allowing direct comparison of how well different kernels explain the data.

    Args:
        delta_values: Observed $\\delta(k)$ values, shape (n,).
        log_k: $\\log(k)$ values, shape (n,) or (n, 1).
        kernel_configs: Dictionary mapping names to KernelConfig objects.
        noise_level: Diagonal noise standard deviation $\\sigma_n$ (if noise_cov not provided).
        noise_cov: Full posterior covariance matrix $\\Sigma_{\\mathrm{post}}$ (recommended).
        optimize_params: If True, use gradient-based optimisation to refine each kernel's
                  hyperparameters before comparing.
        optimize_kwargs: Extra keyword arguments forwarded to optimize().

    Returns:
        Dictionary with:
            - 'results': Dict mapping kernel name to {lml, config, opt_result (if optimised)}
            - 'best_kernel': Name of kernel with highest LML
            - 'best_lml': Highest log-marginal likelihood
            - 'bayes_factors': Dict of Bayes factors relative to worst model
    """
    log_k = np.asarray(log_k).reshape(-1, 1)
    delta_values = np.asarray(delta_values).ravel()

    n = len(delta_values)
    if noise_cov is not None:
        _noise_cov_for_opt = noise_cov
    elif noise_level is not None:
        _noise_cov_for_opt = noise_level**2 * np.eye(n)
    else:
        _noise_cov_for_opt = np.zeros((n, n))

    opt_kwargs = optimize_kwargs or {}

    results = {}
    lmls = {}

    console.print("=" * 60)
    console.print("KERNEL COMPARISON" + (" (with gradient optimization)" if optimize_params else ""))
    console.print("=" * 60)

    for name, config in kernel_configs.items():
        opt_result = None

        if optimize_params:
            try:
                from .optimize import optimize

                opt_result = optimize(
                    log_k.ravel(),
                    delta_values,
                    _noise_cov_for_opt,
                    config,
                    **opt_kwargs,
                )
                if opt_result is not None:
                    used_config = opt_result.optimized_config
                    lml = opt_result.final_lml
                else:
                    used_config = config
                    lml = compute_log_marginal_likelihood(
                        delta_values,
                        log_k,
                        kernel_config=config,
                        noise_level=noise_level,
                        noise_cov=noise_cov,
                    )
            except Exception as exc:
                warnings.warn(
                    f"Gradient optimization failed for kernel '{name}' ({exc}). "
                    "Falling back to fixed-hyperparameter LML."
                )
                used_config = config
                lml = compute_log_marginal_likelihood(
                    delta_values,
                    log_k,
                    kernel_config=config,
                    noise_level=noise_level,
                    noise_cov=noise_cov,
                )
        else:
            used_config = config
            lml = compute_log_marginal_likelihood(
                delta_values,
                log_k,
                kernel_config=config,
                noise_level=noise_level,
                noise_cov=noise_cov,
            )

        results[name] = {
            "lml": lml,
            "config": used_config,
            "description": used_config.describe(),
            "opt_result": opt_result,
        }
        lmls[name] = lml
        console.print(f"  {name}: LML = {lml:.2f}")
        console.print(f"    {used_config.describe()}")
        if opt_result is not None:
            console.print(f"    (ΔLML from optimization: {opt_result.improvement():+.4f})")

    best_name = max(lmls.keys(), key=lambda k: lmls[k])
    best_lml = lmls[best_name]
    worst_lml = min(lmls.values())

    bayes_factors = {name: np.exp(lml - worst_lml) for name, lml in lmls.items()}

    console.print(f"\nBest kernel: {best_name} (LML = {best_lml:.2f})")
    console.print("\nBayes factors (relative to worst):")
    for name, bf in sorted(bayes_factors.items(), key=lambda x: -x[1]):
        console.print(f"  {name}: {bf:.2e}")
    console.print("=" * 60)

    return {
        "results": results,
        "best_kernel": best_name,
        "best_lml": best_lml,
        "bayes_factors": bayes_factors,
    }