.. _article-examples:

#############################################
Article-quality inversion benchmark (CI jobs)
#############################################

The pyCIF CI pipeline includes two jobs, ``article`` and
``article-uncertainties``, that benchmark all inversion algorithms on the
Toy Gaussian model and generate publication-quality figures.
This page explains what the jobs test, how to reproduce the runs locally, and
which outputs to expect.

.. contents::
   :local:

What the CI jobs test
======================

Both jobs use the :doc:`Toy Gaussian model </documentation/plugins/models/dummy>`
and require **no external data**.

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - CI job
     - What it runs
   * - ``article``
     - All inversion algorithms (4D-VAR/M1QN3, 4D-VAR/congrad, analytical, EnSRF)
       at ``bands`` resolution.  On the default branch, also runs ``full`` and
       ``global`` resolutions.
   * - ``article-uncertainties``
     - 4D-VAR/M1QN3 with Monte Carlo posterior uncertainties, and EnSRF with an
       extended ensemble, both at ``bands`` resolution (all resolutions on the
       default branch).

These are the same configurations described in
:doc:`../first-inversion/comparing-algorithms`; the CI runs all of them
systematically and collects the resulting figures.

Outputs produced
=================

The test code generates several artefacts in ``figures_artifact/``:

.. list-table::
   :header-rows: 1
   :widths: 45 55

   * - File
     - Contents
   * - ``map_dx_{algo}_{resol}_{niter}.pdf``
     - Map of physical flux increments (prior minus posterior, in physical units)
   * - ``map_dx_scale_{algo}_{resol}_{niter}.pdf``
     - Map of normalised flux increments (in chi space)
   * - ``map_dstd_{algo}_{resol}_{niter}.pdf``
     - Map of uncertainty reduction (only when ``pa_std`` is available)
   * - ``posterior_matrix_{algo}_{resol}_{niter}.pdf``
     - Posterior error covariance matrix (only when ``pa`` is available)
   * - ``prior_matrix_{algo}_{resol}_{niter}.pdf``
     - Prior error covariance matrix

In addition, each inversion writes ``varying_cost_function.txt`` to its
``workdir``.  The file records the cost function components
:math:`J_o` (observation misfit) and :math:`J_b` (background term)
at each iteration.

Running the benchmark locally
===============================

Because the Toy Gaussian model requires no external data, you can reproduce
the full benchmark with a single ``pytest`` command:

.. code-block:: bash

    # Equivalent to the CI "article" job (bands resolution, all algorithms)
    pytest -m "dummy and article and inversion and not adjtltest and not uncertainties and bands"

    # Equivalent to the CI "article-uncertainties" job
    pytest -m "dummy and article and inversion and not adjtltest and uncertainties and bands"

    # Full benchmark (all resolutions) — equivalent to the default-branch jobs
    pytest -m "dummy and article and inversion and not adjtltest and not uncertainties"
    pytest -m "dummy and article and inversion and not adjtltest and uncertainties"

The figures are written to ``figures_artifact/`` relative to the repository root.
Intermediate YAML files are dumped to ``examples_artifact/dummy/``.
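
The four marker expressions above differ only in whether ``uncertainties`` is
negated and whether a resolution marker is appended.  The following minimal
sketch builds them programmatically.  The helper names ``marker_expression``
and ``pytest_argv`` are hypothetical and not part of pyCIF; the marker names
are copied from the commands above.

.. code-block:: python

    import subprocess

    # Markers shared by every command shown above.
    BASE_MARKERS = ("dummy", "article", "inversion", "not adjtltest")


    def marker_expression(uncertainties, resolution=None):
        """Build the pytest ``-m`` expression for one benchmark variant."""
        parts = list(BASE_MARKERS)
        parts.append("uncertainties" if uncertainties else "not uncertainties")
        if resolution is not None:
            # e.g. "bands"; omit to select all resolutions
            parts.append(resolution)
        return " and ".join(parts)


    def pytest_argv(uncertainties, resolution=None):
        """Argument list suitable for ``subprocess.run``."""
        return ["pytest", "-m", marker_expression(uncertainties, resolution)]


    if __name__ == "__main__":
        # Equivalent to the CI "article" job at bands resolution.
        subprocess.run(pytest_argv(False, "bands"), check=False)

Passing the expression as a single list element avoids shell quoting issues
with the spaces it contains.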

Running a single algorithm
---------------------------

To run just one algorithm, use the YAML examples directly as described in
:doc:`../first-inversion/comparing-algorithms`.  Prepare ``ref_obsvect`` from
a forward run, then:

.. code-block:: bash

    python -m pycif path/to/config_inversion_long_bands_4dvar_M1QN3.yml

Posterior uncertainty estimation
==================================

``article`` (no uncertainties)
---------------------------------

The standard ``article`` runs use the default uncertainty settings for each
algorithm:

* **4D-VAR/congrad**: a Lanczos-based estimate of the posterior variance is
  available at no extra cost; set ``save_uncertainties: true`` in the
  minimiser block to enable it.
* **4D-VAR/M1QN3**: no built-in uncertainty propagation.
* **Analytical**: the full posterior covariance matrix
  :math:`P_a = (B^{-1} + H^T R^{-1} H)^{-1}` is computed and stored.
* **EnSRF**: the posterior ensemble spread provides the posterior uncertainty.
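
The analytical formula above can be checked on a toy problem.  The sketch
below uses NumPy with made-up matrices (three control variables, two
observations); it is not pyCIF code.  The uncertainty reduction is computed
as :math:`1 - \sigma_a / \sigma_b`, a common convention that is assumed here
rather than taken from the pyCIF source.

.. code-block:: python

    import numpy as np

    # Hypothetical toy problem, chosen only for illustration.
    B = np.diag([1.0, 4.0, 9.0])          # prior error covariance
    R = np.diag([0.25, 0.25])             # observation error covariance
    H = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 1.0]])       # linear observation operator

    # Information form, as written above: P_a = (B^-1 + H^T R^-1 H)^-1
    P_a = np.linalg.inv(np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H)

    # Algebraically equivalent gain form: P_a = B - K H B,
    # with K = B H^T (H B H^T + R)^-1
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    P_a_gain = B - K @ H @ B

    prior_std = np.sqrt(np.diag(B))
    post_std = np.sqrt(np.diag(P_a))

    # Assumed convention: fraction of the prior standard deviation removed.
    uncertainty_reduction = 1.0 - post_std / prior_std

    if __name__ == "__main__":
        print("forms agree:", np.allclose(P_a, P_a_gain))
        print("uncertainty reduction:", uncertainty_reduction)

The gain form inverts a matrix the size of the observation space, which is
cheaper when there are fewer observations than control variables.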

``article-uncertainties``
---------------------------

The ``uncertainties`` variants add Monte Carlo estimation for 4D-VAR/M1QN3
and an extended ensemble for EnSRF:

**4D-VAR/M1QN3 with Monte Carlo:**

.. code-block:: yaml

    mode:
      plugin:
        name: 4dvar
        version: std
      minimizer:
        plugin:
          name: M1QN3
          version: std
        # ... other minimiser options ...
      montecarlo:
        nsample: 10          # number of perturbed inversions
        perturb_x: true      # perturb prior
        perturb_y: false     # do not perturb observations
        aggregate_results: true

Each Monte Carlo member reruns the full inversion on a perturbed prior.
The spread across members estimates the posterior uncertainty.
Results are aggregated into the main output NetCDF as ``pa_std``.
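
The idea can be sketched on a linear-Gaussian toy problem.  The example below
is not pyCIF's implementation: it replaces the iterative M1QN3 minimisation
with the closed-form analysis, uses made-up matrices, and only mirrors the
``nsample``, ``perturb_x`` and ``perturb_y`` flags from the YAML above.  The
variable ``pa_std`` is named after the output field for readability.

.. code-block:: python

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical toy problem, chosen only for illustration.
    B = np.diag([1.0, 4.0, 9.0])          # prior error covariance
    R = np.diag([0.25, 0.25])             # observation error covariance
    H = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 1.0]])       # linear observation operator
    x_b = np.zeros(3)                     # unperturbed prior
    y = np.array([0.5, -1.0])             # unperturbed observations

    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)


    def solve(prior, obs):
        """Closed-form stand-in for one full inversion."""
        return prior + K @ (obs - H @ prior)


    nsample = 10
    perturb_x, perturb_y = True, False    # same choices as the YAML above
    L_B = np.linalg.cholesky(B)
    L_R = np.linalg.cholesky(R)

    members = []
    for _ in range(nsample):
        # Draw perturbations consistent with the prescribed covariances.
        prior = x_b + L_B @ rng.standard_normal(3) if perturb_x else x_b
        obs = y + L_R @ rng.standard_normal(2) if perturb_y else y
        members.append(solve(prior, obs))

    members = np.array(members)           # shape (nsample, n_control)
    pa_std = members.std(axis=0, ddof=1)  # spread across members

    if __name__ == "__main__":
        print("member spread:", pa_std)

Which perturbations are switched on determines what the spread measures, so
the flags should be chosen with the intended uncertainty estimate in mind.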

**EnSRF with larger ensemble:**

Simply increase ``nsample`` in the EnSRF mode block.  More members reduce
sampling noise at the cost of proportionally more forward runs.
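
The trade-off can be illustrated with generic sampling statistics.  The sketch
below does not run an EnSRF; it only measures how much a sample standard
deviation estimated from ``nsample`` Gaussian draws scatters around its true
value, and compares that scatter with the large-sample approximation
:math:`1 / \sqrt{2 (N - 1)}`.  The function name is hypothetical.

.. code-block:: python

    import numpy as np

    rng = np.random.default_rng(42)
    n_repeats = 2000  # independent ensembles per ensemble size


    def std_estimate_scatter(nsample):
        """Scatter of the sample std over many independent ensembles."""
        draws = rng.standard_normal((n_repeats, nsample))
        return draws.std(axis=1, ddof=1).std()


    # Each step multiplies the ensemble size (and the forward runs) by four.
    scatter = {n: std_estimate_scatter(n) for n in (10, 40, 160)}

    if __name__ == "__main__":
        for n, measured in scatter.items():
            expected = 1.0 / np.sqrt(2 * (n - 1))
            print(f"nsample={n:4d}  measured={measured:.3f}  approx={expected:.3f}")

Because the noise shrinks roughly with the square root of the ensemble size
while the cost grows linearly, quadrupling ``nsample`` only about halves the
sampling noise.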

Key YAML parameters at a glance
==================================

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - YAML key
     - Effect
   * - ``mode/montecarlo/nsample``
     - Number of Monte Carlo members for posterior uncertainty (4D-VAR)
   * - ``mode/minimizer/save_uncertainties``
     - Enable Lanczos posterior variance (congrad only)
   * - ``mode/nsample``
     - Number of ensemble members (EnSRF)
   * - ``datavect/.../hresol``
     - Control-vector resolution: ``hpixels``, ``ibands``, or ``global``
   * - ``mode/minimizer/maxiter``
     - Maximum number of gradient evaluations (4D-VAR)
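
The slash-separated keys in this table denote nesting in the YAML file.  As a
minimal sketch, assuming the configuration has already been parsed into
nested dictionaries (for example by a YAML loader), a key can be overridden
as follows.  The helper ``set_by_path`` is hypothetical and not part of
pyCIF, and the dictionary reproduces only part of the Monte Carlo example
above.

.. code-block:: python

    import copy


    def set_by_path(config, path, value):
        """Return a copy of ``config`` with the slash-separated ``path`` set."""
        result = copy.deepcopy(config)
        node = result
        *parents, leaf = path.split("/")
        for key in parents:
            # Create intermediate levels if they do not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = value
        return result


    config = {
        "mode": {
            "plugin": {"name": "4dvar", "version": "std"},
            "minimizer": {"plugin": {"name": "M1QN3", "version": "std"}},
            "montecarlo": {"nsample": 10, "perturb_x": True, "perturb_y": False},
        }
    }

    # Double the number of Monte Carlo members without touching the original.
    updated = set_by_path(config, "mode/montecarlo/nsample", 20)

Working on a deep copy keeps the reference configuration unchanged, which is
convenient when generating several variants from one base file.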

Going further
=============

* Full description of all algorithms: :doc:`../first-inversion/comparing-algorithms`
* All Toy Gaussian YAML examples: :doc:`/yaml-examples/dummy/index`
* Extending, shortening, or resubmitting an inversion:
  :doc:`extending`, :doc:`shortening`, :doc:`resubmitting`