API reference

Top-level

fusion

FUSION — score ice-sheet ensembles against observations; weight projections.

ConvergenceWarning

Bases: UserWarning

Emitted when post-sampling convergence checks look unhealthy.

Config

Bases: BaseModel

Top-level FUSION run configuration. Loaded from a YAML file via load_config.

Result dataclass

Outputs of a FUSION run.

Held together as a single object so callers can pass intermediate state between steps without juggling multiple variables.

load_config(path)

Load and validate a FUSION config from a YAML file.

Validation errors (and other read errors) are raised as ValueError with the path included in the message.
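The error-wrapping behaviour described above can be sketched as follows. This is only an illustration of the pattern, not the library's code: load_config_sketch is a hypothetical name, and the standard-library JSON parser stands in for the YAML parser and the Config model so that the example has no dependencies.

```python
import json
from pathlib import Path


def load_config_sketch(path):
    """Read a config file; re-raise read or validation failures as ValueError.

    Hypothetical stand-in: JSON replaces YAML, and a plain mapping check
    replaces validation against the Config model.
    """
    path = Path(path)
    try:
        raw = json.loads(path.read_text())
        if not isinstance(raw, dict):
            raise TypeError("top level of the config must be a mapping")
        return raw
    except (OSError, TypeError, ValueError) as exc:
        # Put the path in the message so the caller knows which file failed.
        raise ValueError(f"could not load config {path}: {exc}") from exc
```

A caller then needs only one except ValueError clause, whatever the underlying failure was.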

load_data(cfg)

Fetch observations and load + standardise the ensemble.

Returns {"obs": {...}, "ensemble": <Dataset>}. Raises NotImplementedError if the ensemble grid does not match the obs grid (regridding is a v1.1 feature).

prepare(cfg, data)

Flatten + mask the rate-of-change inputs the PyMC model consumes.

sample(cfg, prepared)

Run hierarchical PyMC inference and return the trace.

project(cfg, trace, data)

Apply posterior weights to per-member SLE-2100 values.

run(cfg)

One-call pipeline: load_data → prepare → sample → project.

save_weights(result, path)

Write the weights DataFrame to CSV.

save_metadata(result, path)

Write run_metadata.json with full reproducibility info.

Includes the resolved Config (so the validation harness can read off observations.version, inference.subsample.seed, the stream weights, etc.) alongside the fusion version and a minimal environment fingerprint. Anything in result.metadata is merged in last so callers can attach extra fields (e.g. file hashes).

plot_projection(result, path)

Render the SLE distribution with its median and 5–95% credible interval.

projection_summary(projection, *, lower_q=0.05, upper_q=0.95)

Summary statistics of the weighted SLE distribution.

Returns the median (the central estimate to report), mean, standard deviation, and a credible interval at [lower_q, upper_q] (default 5–95%). The median plus the credible interval is the reporting form of the projection: a single number with a credible range.
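To make the statistics concrete, here is a minimal sketch of a weighted summary using only the standard library. It assumes the projection can be reduced to per-member values with matching weights. The function names, the dictionary keys, and the quantile rule (smallest value whose cumulative weight reaches q) are illustrative assumptions, not the library's definitions.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalised weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight / total
        if cumulative >= q:
            return value
    return pairs[-1][0]  # guards against rounding just below 1.0


def summary_sketch(values, weights, lower_q=0.05, upper_q=0.95):
    """Weighted median, mean, standard deviation and credible interval."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return {
        "median": weighted_quantile(values, weights, 0.5),
        "mean": mean,
        "std": var ** 0.5,
        "interval": (
            weighted_quantile(values, weights, lower_q),
            weighted_quantile(values, weights, upper_q),
        ),
    }
```

In this sketch the reported form would be the "median" entry together with "interval".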

sampler_diagnostics(trace, var_names=_CONVERGENCE_VARS)

Summarize MCMC sampler health for the sampled parameters.

Returns the worst-case R-hat and the smallest bulk and tail effective sample sizes (ESS) across var_names, plus the number of divergent transitions. If max_rhat exceeds 1.01, ESS is low, or any transition diverged, do not trust the posterior (or the member weights derived from it) without rerunning with more tuning steps or draws. This is a different question from weight_stability_across_seeds, which probes subsample sensitivity.
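The decision rule above can be written as a small gate. This is a hedged sketch: convergence_problems and its argument names are hypothetical. The R-hat limit of 1.01 comes from the text, while the ESS floor of 400 is an assumed rule of thumb, since the source only says "low ESS".

```python
def convergence_problems(max_rhat, min_ess_bulk, min_ess_tail, n_divergent,
                         rhat_limit=1.01, ess_floor=400):
    """Return a list of red flags; an empty list means none were found.

    ess_floor is an assumption of this sketch, not a documented threshold.
    """
    problems = []
    if max_rhat > rhat_limit:
        problems.append(f"max R-hat {max_rhat:.3f} exceeds {rhat_limit}")
    worst_ess = min(min_ess_bulk, min_ess_tail)
    if worst_ess < ess_floor:
        problems.append(f"smallest ESS {worst_ess:.0f} is below {ess_floor}")
    if n_divergent > 0:
        problems.append(f"{n_divergent} divergent transition(s)")
    return problems
```

An empty list only says these three checks passed; it says nothing about sensitivity to the subsample seed.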

weight_stability_across_seeds(weights_list)

Per-member standard deviation across multiple seeded runs.

A common sanity check: rerun the pipeline with several inference.subsample.seed values and pass the resulting per-member weight vectors here. Large σ on a member means the weight is sensitive to which 20k pixels happened to be drawn.

Each input is a DataArray with a member coordinate (e.g. a column of the weights table). Stacking along a new seed dimension aligns runs by member label, so the result is correct even if two runs enumerate members in different orders.
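The label-alignment idea can be illustrated without xarray. In this sketch plain dictionaries keyed by member label stand in for the DataArray inputs, stability_sketch is a hypothetical name, and the population standard deviation is an assumption, because the source does not state which estimator is used.

```python
from statistics import pstdev


def stability_sketch(weights_list):
    """Per-member standard deviation across runs, aligned by member label.

    weights_list holds one {member_label: weight} dict per seeded run.
    """
    members = sorted(weights_list[0])
    for run in weights_list:
        if set(run) != set(members):
            raise ValueError("runs do not cover the same members")
    # Look weights up by label, so the order within each run is irrelevant.
    return {m: pstdev([run[m] for run in weights_list]) for m in members}
```

Two runs that list the same members in a different order give the same result as if they had been sorted first.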

Pipeline

fusion.pipeline

Top-level FUSION pipeline.

Each step is callable on its own; run composes them in the order load_data → prepare → sample → project. prepare replaces the earlier name score, because the v1 function flattens inputs for the PyMC model rather than producing scalar scores.
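The sketch below shows how intermediate values pass between the four steps, using the argument lists documented on this page. run_sketch is a hypothetical helper that takes the step functions as parameters so the example runs without the package installed; the real run presumably calls the module's own functions directly.

```python
def run_sketch(cfg, *, load_data, prepare, sample, project):
    """Compose the four steps in the documented order."""
    data = load_data(cfg)
    prepared = prepare(cfg, data)
    trace = sample(cfg, prepared)
    # project receives the original data, not the prepared arrays.
    return project(cfg, trace, data)


# Trivial stand-ins that only record the call order.
calls = []


def _step(name, value):
    def step(*args):
        calls.append(name)
        return value
    return step


result = run_sketch(
    {"seed": 0},
    load_data=_step("load_data", "data"),
    prepare=_step("prepare", "prepared"),
    sample=_step("sample", "trace"),
    project=_step("project", "result"),
)
```

Calling the steps one at a time in the same order allows an intermediate value, such as the trace, to be inspected or cached before projecting.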

LoadedData

Bases: TypedDict

Output of load_data: the obs streams plus the ensemble.

load_data(cfg)

Fetch observations and load + standardise the ensemble.

Returns {"obs": {...}, "ensemble": <Dataset>}. Raises NotImplementedError if the ensemble grid does not match the obs grid (regridding is a v1.1 feature).

prepare(cfg, data)

Flatten + mask the rate-of-change inputs the PyMC model consumes.

sample(cfg, prepared)

Run hierarchical PyMC inference and return the trace.

project(cfg, trace, data)

Apply posterior weights to per-member SLE-2100 values.

run(cfg)

One-call pipeline: load_data → prepare → sample → project.

plug_in_weights(prepared, trace)

Direct port of compute_model_weights from the prototype.

Returns (weights, loglik). weights is the N-normalised softmax (length M, sums to 1). loglik is the N-scaled per-member Gaussian log-likelihood (loglik / y_obs.size, length M), evaluated at the posterior means of sigma_base_* and beta_*. loglik matches the second return value of the prototype's compute_model_weights and the log_likelihood column of model_weights_table.csv. Both outputs are deterministic given the trace's posterior means and the prepared arrays, so the validation harness uses them as the canonical bit-exact comparison points.
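A rough sketch of the computation described above: a per-member Gaussian log-likelihood divided by the number of observations, followed by a softmax. It assumes a single fixed sigma for all pixels; the real function evaluates sigma from the posterior means of sigma_base_* and beta_*, and that form is not given here. All names in the block are hypothetical.

```python
import math


def gaussian_loglik(y_obs, y_model, sigma):
    """Sum of independent Gaussian log-densities with a shared sigma."""
    norm = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(norm - (yo - ym) ** 2 / (2.0 * sigma ** 2)
               for yo, ym in zip(y_obs, y_model))


def plug_in_weights_sketch(y_obs, members, sigma):
    """Return (weights, scaled_loglik), each of length M = len(members)."""
    n = len(y_obs)
    scaled = [gaussian_loglik(y_obs, m, sigma) / n for m in members]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps], scaled
```

Dividing by N before the softmax keeps the weights from collapsing onto a single member when the number of pixels is large.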

I/O

fusion.io

Persistence helpers for FUSION run outputs.

save_weights(result, path)

Write the weights DataFrame to CSV.

save_metadata(result, path)

Write run_metadata.json with full reproducibility info.

Includes the resolved Config (so the validation harness can read off observations.version, inference.subsample.seed, the stream weights, etc.) alongside the fusion version and a minimal environment fingerprint. Anything in result.metadata is merged in last so callers can attach extra fields (e.g. file hashes).
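The merge order can be sketched as below. The key names ("config", "fusion_version", "environment") and the function name are hypothetical, and the sketch assumes that "merged in last" means caller-supplied keys win if they collide with a built-in one.

```python
import json
import platform
import sys


def build_metadata_sketch(config, version, extra):
    """Assemble a metadata payload; caller-supplied extras are merged last."""
    payload = {
        "config": config,
        "fusion_version": version,
        "environment": {
            "python": platform.python_version(),
            "platform": sys.platform,
        },
    }
    payload.update(extra)  # later keys override earlier ones
    return payload


text = json.dumps(
    build_metadata_sketch(
        {"inference": {"subsample": {"seed": 7}}},
        "0.0.0",  # placeholder version string
        {"weights_sha256": "<file hash here>"},
    ),
    indent=2,
)
```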

plot_projection(result, path)

Render the SLE distribution with its median and 5–95% credible interval.

Result

fusion.result.Result dataclass

Outputs of a FUSION run.

Held together as a single object so callers can pass intermediate state between steps without juggling multiple variables.