API reference
Top-level
fusion
FUSION — score ice-sheet ensembles against observations; weight projections.
ConvergenceWarning
Bases: UserWarning
Emitted when post-sampling convergence checks look unhealthy.
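Because ConvergenceWarning derives from UserWarning, the standard library warnings filters apply to it. The sketch below shows two ways a caller might handle it: escalate it to an exception, or record it and carry on. The ConvergenceWarning class here is a stand-in with the same name and base class, and fake_sampling_step is a hypothetical placeholder for whatever call emits the warning; neither is taken from the fusion source.

```python
import warnings


class ConvergenceWarning(UserWarning):
    """Stand-in with the same name and base class as the documented warning."""


def fake_sampling_step():
    # Hypothetical placeholder for a call that warns after sampling.
    warnings.warn("max R-hat above threshold", ConvergenceWarning)
    return "trace"


def run_strict(step):
    """Run `step`, turning any ConvergenceWarning into an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", category=ConvergenceWarning)
        return step()


def run_recording(step):
    """Run `step` and return (result, list of ConvergenceWarning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = step()
    messages = [
        str(w.message)
        for w in caught
        if issubclass(w.category, ConvergenceWarning)
    ]
    return result, messages
```

The strict form suits batch jobs where an unhealthy posterior should stop the run; the recording form suits notebooks where the warning text is logged next to the results.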
Config
Result
dataclass
Outputs of a FUSION run.
Held together as a single object so callers can pass intermediate state between steps without juggling multiple variables.
load_config(path)
Load and validate a FUSION config from a YAML file.
Validation errors (and other read errors) are raised as ValueError
with the path included in the message.
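The error contract above (any read or validation failure surfaces as a ValueError that names the file) can be sketched as follows. This is an illustration of the pattern only, not the fusion implementation: load_with_path_context and its parse argument are hypothetical, and parse stands in for the YAML parsing and validation step so that the sketch needs nothing beyond the standard library.

```python
from pathlib import Path


def load_with_path_context(path, parse):
    """Read `path`, parse it with `parse`, and re-raise failures as ValueError.

    `parse` is a hypothetical stand-in for YAML parsing plus validation:
    it takes the file text and returns the parsed config object.
    """
    path = Path(path)
    try:
        text = path.read_text(encoding="utf-8")
        return parse(text)
    except Exception as exc:
        # Prefix the message with the path so the caller can tell
        # which file failed, and chain the original exception.
        raise ValueError(f"{path}: {exc}") from exc
```

Chaining with `from exc` keeps the original traceback available while giving callers a single exception type to catch.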
load_data(cfg)
Fetch observations and load + standardise the ensemble.
Returns {"obs": {...}, "ensemble": <Dataset>}. Raises
NotImplementedError if the ensemble grid does not match the
obs grid (regridding is a v1.1 feature).
prepare(cfg, data)
Flatten + mask the rate-of-change inputs the PyMC model consumes.
sample(cfg, prepared)
Run hierarchical PyMC inference and return the trace.
project(cfg, trace, data)
Apply posterior weights to per-member SLE-2100 values.
run(cfg)
One-call pipeline: load_data → prepare → sample → project.
save_weights(result, path)
Write the weights DataFrame to CSV.
save_metadata(result, path)
Write run_metadata.json with full reproducibility info.
Includes the resolved Config (so the validation harness can read
off observations.version, inference.subsample.seed, the
stream weights, etc.) alongside the fusion version and a minimal
environment fingerprint. Anything in result.metadata is merged
in last so callers can attach extra fields (e.g. file hashes).
plot_projection(result, path)
Render the SLE distribution with its median and 5–95% credible interval.
projection_summary(projection, *, lower_q=0.05, upper_q=0.95)
Summary statistics of the weighted SLE distribution.
Returns the median (the central estimate to report), mean, standard
deviation, and a credible interval at [lower_q, upper_q] (default
5–95%). The median plus the credible interval is the reporting form of
the projection: a single number with a credible range.
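To make the reported quantities concrete, here is a minimal sketch of weighted summary statistics in plain Python. It assumes the projection can be reduced to one SLE value and one weight per member, and it uses the inverse-CDF definition of a weighted quantile (no interpolation). The function names, the dictionary keys, and that quantile definition are assumptions for illustration; the actual projection_summary may compute these differently.

```python
import math


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalised weight reaches q."""
    total = sum(weights)
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight / total
        # Small tolerance guards against floating-point round-off.
        if cumulative >= q - 1e-12:
            return value
    return max(values)


def summarise(values, weights, lower_q=0.05, upper_q=0.95):
    """Weighted median, mean, standard deviation, and credible interval."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return {
        "median": weighted_quantile(values, weights, 0.5),
        "mean": mean,
        "std": math.sqrt(var),
        "interval": (
            weighted_quantile(values, weights, lower_q),
            weighted_quantile(values, weights, upper_q),
        ),
    }
```

With only a handful of members the interval endpoints land on member values, which is one reason to report the median together with the interval rather than the mean alone.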
sampler_diagnostics(trace, var_names=_CONVERGENCE_VARS)
Summarize MCMC sampler health for the sampled parameters.
Returns the worst-case R-hat and smallest bulk/tail effective sample
size across var_names, plus the number of divergent transitions.
A max_rhat above 1.01, a low ESS, or any divergent transition means the
posterior (and the member weights derived from it) should not be trusted
without rerunning with more tuning steps or draws. This is a different
question from weight_stability_across_seeds, which probes subsample sensitivity.
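The decision rule described above can be sketched as a small gate over the returned summary. Only max_rhat and the 1.01 threshold come from the text; the other dictionary keys (min_ess_bulk, min_ess_tail, n_divergences) and the ESS floor of 400 are assumptions chosen for illustration, since the reference says only "low ESS".

```python
def convergence_problems(diag, rhat_max=1.01, ess_min=400.0):
    """Return a list of human-readable problems; empty means no red flags.

    `diag` is assumed to be a mapping with the keys used below.
    """
    problems = []
    if diag["max_rhat"] > rhat_max:
        problems.append(f"max R-hat {diag['max_rhat']:.3f} exceeds {rhat_max}")
    for key in ("min_ess_bulk", "min_ess_tail"):
        if diag[key] < ess_min:
            problems.append(f"{key} {diag[key]:.0f} is below {ess_min:.0f}")
    if diag["n_divergences"] > 0:
        problems.append(f"{diag['n_divergences']} divergent transitions")
    return problems
```

A caller could refuse to write weights when the list is non-empty, or attach the list to the run metadata so the problem is visible downstream.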
weight_stability_across_seeds(weights_list)
Per-member standard deviation across multiple seeded runs.
A common sanity check: rerun the pipeline with several
inference.subsample.seed values and pass the resulting
per-member weight vectors here. Large σ on a member means the
weight is sensitive to which 20k pixels happened to be drawn.
Each input is a DataArray with a member coordinate (e.g. a column of the
weights table). Stacking along a new seed dim aligns runs by
member label, so the result is correct even if two runs enumerate
members in different orders.
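The label-alignment point is the part most easily got wrong, so here is a plain-Python sketch of the idea using dictionaries keyed by member label in place of DataArrays. It is not the library code: the function name is hypothetical, the population standard deviation (ddof = 0) is an assumption, and the sketch raises KeyError for a member missing from a run, which may differ from how the real function treats missing members.

```python
import statistics


def weight_std_by_member(runs):
    """Per-member standard deviation of weights across seeded runs.

    Each run maps member label -> weight. Values are looked up by label,
    so the order in which a run lists its members does not matter.
    """
    members = sorted(set().union(*runs))
    result = {}
    for member in members:
        column = [run[member] for run in runs]  # KeyError if a member is missing
        result[member] = statistics.pstdev(column)
    return result


# Two runs that enumerate the same members in different orders.
run_a = {"m1": 0.5, "m2": 0.3, "m3": 0.2}
run_b = {"m3": 0.2, "m1": 0.7, "m2": 0.1}
spread = weight_std_by_member([run_a, run_b])
```

Stacking the runs positionally would pair m1 in run_a with m3 in run_b and report a spurious spread; aligning by label avoids that.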
Pipeline
fusion.pipeline
Top-level FUSION pipeline.
Each step is callable on its own; run composes them. The pipeline
shape is load_data → prepare → sample → project. The prepare step
was previously named score; the name changed because the v1 function
flattens inputs for the PyMC model rather than producing scalar scores.
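The way the four steps hand values to one another follows from the signatures listed in this section, and can be sketched with the steps passed in as callables. This is an illustration of the data flow only: run_steps is a hypothetical name, and the real run may do more (for example, wrap the outputs in a Result).

```python
def run_steps(cfg, load_data, prepare, sample, project):
    """Chain the four pipeline steps using their documented argument order."""
    data = load_data(cfg)            # observations plus ensemble
    prepared = prepare(cfg, data)    # flattened, masked model inputs
    trace = sample(cfg, prepared)    # posterior trace
    # project takes the loaded data, not the prepared arrays.
    return project(cfg, trace, data)
```

Calling the steps individually in this way is useful when one stage, typically sampling, should be cached or rerun with different settings while the others stay fixed.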
LoadedData
load_data(cfg)
Fetch observations and load + standardise the ensemble.
Returns {"obs": {...}, "ensemble": <Dataset>}. Raises
NotImplementedError if the ensemble grid does not match the
obs grid (regridding is a v1.1 feature).
prepare(cfg, data)
Flatten + mask the rate-of-change inputs the PyMC model consumes.
sample(cfg, prepared)
Run hierarchical PyMC inference and return the trace.
project(cfg, trace, data)
Apply posterior weights to per-member SLE-2100 values.
run(cfg)
One-call pipeline: load_data → prepare → sample → project.
plug_in_weights(prepared, trace)
Direct port of compute_model_weights from the prototype.
Returns (weights, loglik) where weights is the
N-normalised softmax (length M, sums to 1) and loglik is the
N-scaled per-member Gaussian log-likelihood (loglik / y_obs.size,
length M), evaluated at the posterior-mean sigma_base_* /
beta_*. The second value matches the second return value of the prototype's
compute_model_weights and the log_likelihood
column of model_weights_table.csv. Both are deterministic given
the trace's posterior means and the prepared arrays, so the
validation harness uses them as the canonical bit-exact comparison
points.
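One reading of the description above is: compute each member's Gaussian log-likelihood against the observations, divide by the number of observations N, and take a softmax over members. The sketch below implements that reading in plain Python. It is a simplification under stated assumptions: a single scalar sigma replaces the noise model built from the posterior-mean sigma_base_* and beta_* parameters, and the function names are hypothetical rather than the library's.

```python
import math


def gaussian_loglik(y_obs, y_model, sigma):
    """Sum of independent Gaussian log-densities of y_obs around y_model."""
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(
        const - 0.5 * ((obs - mod) / sigma) ** 2
        for obs, mod in zip(y_obs, y_model)
    )


def plug_in_weights_sketch(y_obs, members, sigma):
    """Return (weights, loglik) with loglik scaled by N = len(y_obs).

    `members` is a list of M model vectors, each the same length as y_obs.
    `sigma` is a single scalar here (assumption for this sketch).
    """
    n = len(y_obs)
    loglik = [gaussian_loglik(y_obs, y_m, sigma) / n for y_m in members]
    shift = max(loglik)  # subtract the max for numerical stability
    unnorm = [math.exp(value - shift) for value in loglik]
    total = sum(unnorm)
    weights = [u / total for u in unnorm]
    return weights, loglik
```

Dividing by N before the softmax keeps the weights from collapsing onto a single member when N is large, since unscaled log-likelihood differences grow with the number of pixels.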
I/O
fusion.io
Persistence helpers for FUSION run outputs.
save_weights(result, path)
Write the weights DataFrame to CSV.
save_metadata(result, path)
Write run_metadata.json with full reproducibility info.
Includes the resolved Config (so the validation harness can read
off observations.version, inference.subsample.seed, the
stream weights, etc.) alongside the fusion version and a minimal
environment fingerprint. Anything in result.metadata is merged
in last so callers can attach extra fields (e.g. file hashes).
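The merge order matters because anything merged last can override earlier fields. A minimal sketch of that ordering, using only the standard library, is shown below. The function names and the field names (config, fusion_version, environment) are hypothetical; the real file layout may differ.

```python
import json
import platform
import sys


def build_run_metadata(config, fusion_version, extra=None):
    """Assemble the metadata payload, merging caller extras in last."""
    payload = {
        "config": config,
        "fusion_version": fusion_version,
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
    }
    # Merged last, so caller-supplied fields win on key collisions.
    payload.update(extra or {})
    return payload


def write_run_metadata(payload, path):
    """Write the payload as stable, human-readable JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2, sort_keys=True)
```

Sorting keys on write keeps diffs between two runs' metadata files small, which helps when comparing runs by hand.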
plot_projection(result, path)
Render the SLE distribution with its median and 5–95% credible interval.
Result
fusion.result.Result
dataclass
Outputs of a FUSION run.
Held together as a single object so callers can pass intermediate state between steps without juggling multiple variables.