API Reference: Hyperparameter Optimization

Hyperparameter optimization finds the kernel parameters (sigma2, ell, and optionally period) that maximize the log marginal likelihood of the training data.

\[\hat\theta = \arg\max_\theta \log p(\mathbf{y} \mid X, \theta)\]

fit

gp4c.fit(obs, kernel='rbf', bounds=None, method='differential_evolution', x0=None, **optimizer_kwargs)

Optimize GP hyperparameters by maximizing the log marginal likelihood.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| obs | Observations | | Training data. Supports function-value (f) and derivative (h) observations; integral (g) observations are not supported. |
| kernel | str | 'rbf' | Kernel type: 'rbf', 'matern52', 'matern32', 'periodic', 'locally_periodic' |
| bounds | dict \| None | None | Parameter bounds. Partial dicts are merged with defaults. |
| method | str | 'differential_evolution' | Optimization method (see below) |
| x0 | dict \| None | None | Initial parameter values. Required for local methods. |
| **optimizer_kwargs | | | Forwarded to the underlying scipy optimizer |

Default bounds

| Parameter | Default range |
| --- | --- |
| sigma2 | (1e-4, 100.0) |
| ell | (1e-4, 10.0) |
| period | (1e-2, 100.0) (periodic kernels only) |

Supply a partial bounds dict to override only the parameters you care about.
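
The merge behaviour can be pictured with the short sketch below. It assumes a plain per-parameter overlay on the documented defaults; `DEFAULT_BOUNDS`, `PERIODIC_KERNELS`, and `merge_bounds` are illustrative names, not part of gp4c's API, and the real implementation may validate differently.

```python
# Hypothetical sketch of partial-bounds merging; not gp4c's actual code.
DEFAULT_BOUNDS = {
    "sigma2": (1e-4, 100.0),
    "ell": (1e-4, 10.0),
    "period": (1e-2, 100.0),  # only relevant for periodic kernels
}

PERIODIC_KERNELS = {"periodic", "locally_periodic"}


def merge_bounds(user_bounds=None, kernel="rbf"):
    """Overlay user-supplied bounds on the documented defaults."""
    names = ["sigma2", "ell"]
    if kernel in PERIODIC_KERNELS:
        names.append("period")
    merged = {name: DEFAULT_BOUNDS[name] for name in names}

    for name, (lo, hi) in (user_bounds or {}).items():
        if name not in merged:
            raise ValueError(f"unknown parameter for kernel {kernel!r}: {name}")
        if not lo < hi:
            raise ValueError(f"lower bound must be below upper bound for {name}")
        merged[name] = (float(lo), float(hi))
    return merged


# Only the period range is overridden; sigma2 and ell keep their defaults.
print(merge_bounds({"period": (1.0, 6.0)}, kernel="periodic"))
```

In this sketch an unknown parameter name is rejected rather than silently ignored; that is a design choice of the example, not documented gp4c behaviour.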

Optimization methods

Global methods — no initial guess required, recommended for hyperparameter search:

| Method | Description |
| --- | --- |
| 'differential_evolution' | Genetic-algorithm-based global optimizer (default) |
| 'dual_annealing' | Simulated annealing with local search |
| 'shgo' | Simplicial homology global optimization |
| 'basinhopping' | Basin-hopping with embedded local minimizer |

Local methods — faster but require x0 and can get stuck in local optima:

| Method | Description |
| --- | --- |
| 'L-BFGS-B' | Quasi-Newton with box constraints |
| 'Nelder-Mead' | Derivative-free simplex |
| 'Powell' | Derivative-free direction set method |
| 'SLSQP' | Sequential Least Squares Programming |
| Any scipy.optimize.minimize method | Requires x0 |

Returns

OptimizationResult with fields:

| Field | Type | Description |
| --- | --- | --- |
| sigma2 | float | Optimized variance |
| ell | float | Optimized length scale |
| period | float \| None | Optimized period (None for non-periodic kernels) |
| log_likelihood | float | Log marginal likelihood at the optimum |
| success | bool | Convergence flag |
| scipy_result | Any | Raw scipy.optimize.OptimizeResult |

Raises

  • ValueError: unsupported kernel, g observations present, h observations with matern32, or x0 out of bounds
  • TypeError: obs is not an Observations instance
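
As an illustration of the "x0 out of bounds" condition, the following sketch shows one way such a check could be written. `validate_x0` is a hypothetical helper, not a gp4c function, and the library's actual error messages and checks may differ.

```python
# Hypothetical pre-flight check; gp4c's own validation is not shown in this page.
def validate_x0(x0, bounds):
    """Raise ValueError if any starting value is missing or outside its bounds."""
    for name, (lo, hi) in bounds.items():
        if name not in x0:
            raise ValueError(f"x0 is missing a value for {name!r}")
        if not lo <= x0[name] <= hi:
            raise ValueError(f"x0[{name!r}]={x0[name]} lies outside [{lo}, {hi}]")


bounds = {"sigma2": (0.1, 5.0), "ell": (0.1, 3.0)}

validate_x0({"sigma2": 1.0, "ell": 1.0}, bounds)  # passes silently

try:
    validate_x0({"sigma2": 1.0, "ell": 4.0}, bounds)  # ell exceeds its upper bound
except ValueError as exc:
    print(f"rejected: {exc}")
```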

compute_log_marginal_likelihood

gp4c.optimize.compute_log_marginal_likelihood(obs, sigma2, ell, period, kernel, ell_p=None)

Compute the log marginal likelihood directly.

\[\log p(\mathbf{y} \mid X,\theta) = -\tfrac{1}{2}\mathbf{y}^\top K^{-1}\mathbf{y} - \tfrac{1}{2}\log|K| - \tfrac{n}{2}\log 2\pi\]

This is a thin wrapper around the C implementation. For most use cases, prefer gp4c.log_marginal_likelihood (same function, exported from the top-level package).
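
For reference, the formula can be evaluated in a few lines of NumPy. The sketch below is not the C implementation: it assumes function-value observations only, an RBF kernel parameterized as sigma2 * exp(-0.5 * (d / ell)^2), and a noise variance added to the diagonal of K. The names `rbf_kernel`, `rbf_log_marginal_likelihood`, and `noise_var` are illustrative.

```python
import numpy as np


def rbf_kernel(x1, x2, sigma2, ell):
    """Squared-exponential kernel matrix (assumed parameterization)."""
    d = x1[:, None] - x2[None, :]
    return sigma2 * np.exp(-0.5 * (d / ell) ** 2)


def rbf_log_marginal_likelihood(x, y, sigma2, ell, noise_var):
    """Evaluate log p(y | X, theta) via a Cholesky factorization of K."""
    n = len(x)
    # Assumption: observation noise enters as a variance on the diagonal.
    K = rbf_kernel(x, x, sigma2, ell) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)  # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^{-1} y
    data_fit = -0.5 * y @ alpha
    complexity = -np.sum(np.log(np.diag(L)))  # equals -0.5 * log|K|
    constant = -0.5 * n * np.log(2.0 * np.pi)
    return data_fit + complexity + constant


x = np.linspace(0.0, 5.0, 20)
y = np.sin(x)
print(rbf_log_marginal_likelihood(x, y, sigma2=1.0, ell=1.0, noise_var=0.01))
```

The Cholesky route avoids forming K^{-1} explicitly and yields log|K| as twice the sum of the log-diagonal of L, which is the usual numerically stable way to evaluate this expression.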

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| obs | Observations | | Training data |
| sigma2 | float | | Kernel variance |
| ell | float | | SE decay length scale |
| period | float | | Period (required; ignored by non-periodic kernels) |
| kernel | str | | Kernel type |
| ell_p | float \| None | None | Periodic length scale; defaults to ell |

Returns

float: the log marginal likelihood. Higher (less negative) values indicate a better fit.


OptimizationResult

See Types Reference: OptimizationResult.


Examples

1. Basic usage — RBF kernel

import numpy as np
import gp4c

rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = np.sin(x) + rng.normal(0, 0.1, len(x))

obs = gp4c.Observations(x_f=x, y_f=y, noise_f=0.01)
result = gp4c.fit(obs, kernel='rbf')

print(f"sigma2 = {result.sigma2:.3f}")
print(f"ell    = {result.ell:.3f}")
print(f"log p(y|X,θ) = {result.log_likelihood:.3f}")
print(f"converged: {result.success}")

2. Local optimizer — L-BFGS-B

Local methods are faster when you already have a reasonable starting point.

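import numpy as np
import gp4c

# Same synthetic data as in Example 1
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = np.sin(x) + rng.normal(0, 0.1, len(x))

obs = gp4c.Observations(x_f=x, y_f=y, noise_f=0.01)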

result = gp4c.fit(
    obs,
    kernel='rbf',
    method='L-BFGS-B',
    x0={'sigma2': 1.0, 'ell': 1.0},
    bounds={'sigma2': (0.1, 5.0), 'ell': (0.1, 3.0)},
)

print(f"sigma2={result.sigma2:.3f}, ell={result.ell:.3f}")

3. Periodic kernel

Constrain the period search range via bounds to avoid aliasing.

import numpy as np
import gp4c

rng = np.random.default_rng(0)
x = np.linspace(0, 20, 100)
y = np.sin(2 * np.pi * x / 3.0) + rng.normal(0, 0.1, len(x))  # period = 3.0

obs = gp4c.Observations(x_f=x, y_f=y, noise_f=0.01)

result = gp4c.fit(
    obs,
    kernel='periodic',
    bounds={'period': (1.0, 6.0)},
    method='differential_evolution',
    maxiter=300,
    seed=42,
)

print(f"sigma2={result.sigma2:.3f}, ell={result.ell:.3f}, period={result.period:.3f}")
# Expect period close to 3.0

4. Derivative observations

Including derivative data can substantially sharpen the hyperparameter estimate.

import numpy as np
import gp4c

rng = np.random.default_rng(1)
x_train = np.linspace(0, 5, 20)

obs = gp4c.Observations(
    x_f=x_train,
    y_f=np.sin(x_train) + rng.normal(0, 0.05, len(x_train)),
    noise_f=0.01,
    x_h=x_train,
    y_h=np.cos(x_train) + rng.normal(0, 0.05, len(x_train)),
    noise_h=0.01,
)

result = gp4c.fit(obs, kernel='matern52')
print(f"sigma2={result.sigma2:.3f}, ell={result.ell:.3f}")

Note

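fit accepts derivative observations (x_h) for every kernel except 'matern32', which raises ValueError (see Raises). Integral observations (x_g) are not accepted by fit at all; more generally, they are not supported for 'periodic' and 'locally_periodic' kernels.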

5. Using optimized parameters for sampling

Once you have the optimized hyperparameters, feed them directly into sample_posterior.

import numpy as np
import gp4c

# Observed data
x_train = np.linspace(0, 5, 20)
obs = gp4c.Observations(x_f=x_train, y_f=np.sin(x_train), noise_f=0.01)

# Optimize
result = gp4c.fit(obs, kernel='rbf')

# Posterior at fine grid using optimized params
x_test = np.linspace(0, 7, 200)
spec = gp4c.SamplingSpec(x_f=x_test)

posterior = gp4c.sample_posterior(
    obs, spec,
    sigma2=result.sigma2,
    ell=result.ell,
    kernel='rbf',
    n_samples=100,
)

# posterior.f_mean and posterior.f_std give the uncertainty band
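
A band can be built from a mean and a standard deviation as sketched below. The arrays here are synthetic stand-ins so that the snippet runs without gp4c; it assumes posterior.f_mean and posterior.f_std are 1-D arrays aligned with x_test and that the marginals are Gaussian, so that mean ± 1.96 std covers roughly 95%. `credible_band` is an illustrative helper, not a gp4c function.

```python
import numpy as np


def credible_band(mean, std, z=1.96):
    """Return (lower, upper) = mean -/+ z * std for a pointwise band."""
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    return mean - z * std, mean + z * std


# Stand-in arrays; in practice use posterior.f_mean and posterior.f_std.
x_test = np.linspace(0, 7, 200)
f_mean = np.sin(x_test)
# Synthetic uncertainty that grows beyond the training range (x > 5).
f_std = 0.05 + 0.1 * np.maximum(x_test - 5.0, 0.0)

lower, upper = credible_band(f_mean, f_std)
print(f"band width at x=0: {upper[0] - lower[0]:.3f}")
print(f"band width at x=7: {upper[-1] - lower[-1]:.3f}")
```

The lower and upper arrays can then be passed to a plotting routine such as matplotlib's fill_between to draw the shaded region around the mean.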

Next Steps