.. _dummy-comparing-algorithms: ############################################################ Comparing inversion algorithms with the toy Gaussian model ############################################################ pyCIF ships four inversion algorithms that can all be exercised on the :doc:`Toy Gaussian Model` without any external data. This tutorial describes each algorithm, shows the key YAML options, and links to ready-to-run examples. Before starting, run the :doc:`first forward simulation<../first-simu/toy-gaussian>` and prepare ``ref_obsvect`` as described in :doc:`toy-gaussian`. .. contents:: :local: Overview ======== The table below summarises the four algorithms and their main trade-offs. .. list-table:: :header-rows: 1 :widths: 20 30 25 25 * - Algorithm - Mode plugin - When to use - Posterior uncertainties * - 4D-VAR / M1QN3 - ``4dvar`` + ``M1QN3`` - Large control vectors, iterative gradient descent - Monte Carlo (``montecarlo`` block) * - 4D-VAR / congrad - ``4dvar`` + ``congrad`` - Same as M1QN3; exact posterior variance from Lanczos vectors - ``save_uncertainties: true`` * - Analytical - ``analytic`` - Small control vectors (bands or global); exact solution in one pass - Exact posterior covariance matrix * - EnSRF - ``EnSRF`` - Ensemble-based; no adjoint required - Ensemble spread 4D-VAR with M1QN3 ================== M1QN3 is a limited-memory quasi-Newton minimiser. It is the default choice for large-scale atmospheric inversions. Key YAML options: .. code-block:: yaml mode: plugin: name: 4dvar version: std minimizer: plugin: name: M1QN3 version: std maxiter: 25 # maximum number of gradient evaluations nsim: 25 # maximum number of forward/adjoint runs epsg: 0.0002 # convergence criterion on gradient norm df1: 0.5 # expected initial cost-function reduction simulator: plugin: name: gausscost version: std reload_from_previous: true save_out_netcdf: true Ready-to-run YAML examples (all spatial resolutions): * :doc:`Full pixel ` (recommended starting point) * :doc:`Bands ` * :doc:`Global ` 4D-VAR with congrad ===================== The conjugate-gradient minimiser (congrad) solves the same variational problem as M1QN3 but uses the Lanczos recursion to build an approximation of the inverse Hessian. This enables an analytical estimate of the posterior uncertainty without any additional Monte Carlo runs. Key YAML differences from M1QN3: .. code-block:: yaml mode: minimizer: plugin: name: congrad version: std save_uncertainties: true # enables Lanczos-based posterior variance When ``save_uncertainties: true``, the output NetCDF will contain ``pa_std`` (posterior standard deviation) in addition to the optimised fluxes. Ready-to-run YAML examples: * :doc:`Full pixel ` * :doc:`Bands ` * :doc:`Global ` Analytical inversion ===================== The analytical algorithm computes the exact posterior in a single pass using the closed-form Bayesian solution: .. math:: x_a = x_b + B H^T (H B H^T + R)^{-1} (y - H x_b) This requires assembling and inverting the full error-covariance matrices, so it is only tractable for small control vectors. Use the ``bands`` or ``global`` resolution (see :ref:`spatial-resolution-variants` below) to keep the system manageable. Key YAML options: .. code-block:: yaml mode: plugin: name: analytic version: std dump_nc_base_control: true # write posterior covariance to NetCDF The posterior covariance matrix is stored in ``controlvect/flux/CH4/controlvect_flux_CH4.nc`` (variable ``pa``). Ready-to-run YAML examples: * :doc:`Bands ` * :doc:`Full pixel ` (slow for large grids) * :doc:`Global ` EnSRF (Ensemble Square-Root Filter) ===================================== The Ensemble Square-Root Filter propagates an ensemble of control-vector perturbations through the forward model and uses the ensemble covariance to update the state. It does not require an adjoint and is well suited to non-Gaussian problems. Key YAML options: .. code-block:: yaml mode: plugin: name: EnSRF version: std nsample: 50 # number of ensemble members batch_sampling: false max_nsamples_per_run: 1 # one forward run per ensemble member include_system_samples: false Larger ``nsample`` reduces sampling noise but increases cost proportionally. The ensemble spread of the posterior samples serves as the uncertainty estimate. Ready-to-run YAML examples: * :doc:`Full pixel ` * :doc:`Bands ` * :doc:`Global ` .. _spatial-resolution-variants: Spatial resolution of the control vector ========================================== Independent of the inversion algorithm, pyCIF supports three aggregation levels for the flux control vector, set via the ``hresol`` key in the data vector: .. list-table:: :header-rows: 1 :widths: 15 45 40 * - ``hresol`` - Description - Typical use * - ``hpixels`` - Each model grid cell is a separate control variable (full resolution) - Gradient-based methods (M1QN3, congrad) * - ``ibands`` - Grid cells are aggregated into user-defined rectangular bands - Analytical or ensemble methods; also faster for variational methods * - ``global`` - A single scalar multiplier for the entire domain - Sanity checks; very fast analytical solution Example ``datavect`` snippet for the ``bands`` resolution: .. code-block:: yaml datavect: components: flux: parameters: CH4: hresol: ibands tresol: 4D bands_i: [0, 3, 6, 9, 12] bands_j: [0, 3, 6, 9, 12, 15, 18] The YAML file names in :doc:`/yaml-examples/dummy/index` encode the resolution: ``_full_``, ``_bands_``, or ``_global_``. ============= Going further ============= * For a side-by-side benchmarking of all algorithms and resolutions — including publication-quality figures of flux increments and uncertainty reduction — see :doc:`../further-inversions/article-examples`. * To run multiple inversions in sequence (restarting, extending the time window, or shortening it), see the pages under :doc:`../further-inversions/index`. * All available Toy Gaussian YAML examples: :doc:`/yaml-examples/dummy/index`