Comparing inversion algorithms with the toy Gaussian model#
pyCIF ships four inversion algorithms that can all be exercised on the Toy Gaussian Model without any external data. This tutorial describes each algorithm, shows the key YAML options, and links to ready-to-run examples.
Before starting, run the first forward simulation
and prepare ref_obsvect as described in First inversion with the toy Gaussian model.
Overview#
The table below summarises the four algorithms and their main trade-offs.
Algorithm |
Mode plugin |
When to use |
Posterior uncertainties |
|---|---|---|---|
4D-VAR / M1QN3 |
|
Large control vectors, iterative gradient descent |
Monte Carlo ( |
4D-VAR / congrad |
|
Same as M1QN3; exact posterior variance from Lanczos vectors |
|
Analytical |
|
Small control vectors (bands or global); exact solution in one pass |
Exact posterior covariance matrix |
EnSRF |
|
Ensemble-based; no adjoint required |
Ensemble spread |
4D-VAR with M1QN3#
M1QN3 is a limited-memory quasi-Newton minimiser. It is the default choice for large-scale atmospheric inversions.
Key YAML options:
mode:
plugin:
name: 4dvar
version: std
minimizer:
plugin:
name: M1QN3
version: std
maxiter: 25 # maximum number of gradient evaluations
nsim: 25 # maximum number of forward/adjoint runs
epsg: 0.0002 # convergence criterion on gradient norm
df1: 0.5 # expected initial cost-function reduction
simulator:
plugin:
name: gausscost
version: std
reload_from_previous: true
save_out_netcdf: true
Ready-to-run YAML examples (all spatial resolutions):
Full pixel (recommended starting point)
4D-VAR with congrad#
The conjugate-gradient minimiser (congrad) solves the same variational problem as M1QN3 but uses the Lanczos recursion to build an approximation of the inverse Hessian. This enables an analytical estimate of the posterior uncertainty without any additional Monte Carlo runs.
Key YAML differences from M1QN3:
mode:
minimizer:
plugin:
name: congrad
version: std
save_uncertainties: true # enables Lanczos-based posterior variance
When save_uncertainties: true, the output NetCDF will contain pa_std
(posterior standard deviation) in addition to the optimised fluxes.
Ready-to-run YAML examples:
Analytical inversion#
The analytical algorithm computes the exact posterior in a single pass using the closed-form Bayesian solution:
This requires assembling and inverting the full error-covariance matrices, so it
is only tractable for small control vectors. Use the bands or global
resolution (see Spatial resolution of the control vector below) to keep the system
manageable.
Key YAML options:
mode:
plugin:
name: analytic
version: std
dump_nc_base_control: true # write posterior covariance to NetCDF
The posterior covariance matrix is stored in
controlvect/flux/CH4/controlvect_flux_CH4.nc (variable pa).
Ready-to-run YAML examples:
Full pixel (slow for large grids)
EnSRF (Ensemble Square-Root Filter)#
The Ensemble Square-Root Filter propagates an ensemble of control-vector perturbations through the forward model and uses the ensemble covariance to update the state. It does not require an adjoint and is well suited to non-Gaussian problems.
Key YAML options:
mode:
plugin:
name: EnSRF
version: std
nsample: 50 # number of ensemble members
batch_sampling: false
max_nsamples_per_run: 1 # one forward run per ensemble member
include_system_samples: false
Larger nsample reduces sampling noise but increases cost proportionally.
The ensemble spread of the posterior samples serves as the uncertainty estimate.
Ready-to-run YAML examples:
Spatial resolution of the control vector#
Independent of the inversion algorithm, pyCIF supports three aggregation
levels for the flux control vector, set via the hresol key in the data
vector:
|
Description |
Typical use |
|---|---|---|
|
Each model grid cell is a separate control variable (full resolution) |
Gradient-based methods (M1QN3, congrad) |
|
Grid cells are aggregated into user-defined rectangular bands |
Analytical or ensemble methods; also faster for variational methods |
|
A single scalar multiplier for the entire domain |
Sanity checks; very fast analytical solution |
Example datavect snippet for the bands resolution:
datavect:
components:
flux:
parameters:
CH4:
hresol: ibands
tresol: 4D
bands_i: [0, 3, 6, 9, 12]
bands_j: [0, 3, 6, 9, 12, 15, 18]
The YAML file names in Examples for dummy encode the resolution:
_full_, _bands_, or _global_.
Going further#
For a side-by-side benchmarking of all algorithms and resolutions — including publication-quality figures of flux increments and uncertainty reduction — see Article-quality inversion benchmark (CI jobs).
To run multiple inversions in sequence (restarting, extending the time window, or shortening it), see the pages under How to run further inversions from a first one.
All available Toy Gaussian YAML examples: Examples for dummy