.. _article-examples:

#############################################
Article-quality inversion benchmark (CI jobs)
#############################################

The pyCIF CI pipeline includes two jobs, ``article`` and ``article-uncertainties``, that run a comprehensive benchmark of all inversion algorithms on the Toy Gaussian model and generate publication-quality figures. This page explains what they test, how to reproduce the runs locally, and what outputs to expect.

.. contents::
   :local:

What the CI jobs test
=====================

Both jobs use the :doc:`Toy Gaussian Model` with **no external data**.

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - CI job
     - What it runs
   * - ``article``
     - All inversion algorithms (4D-VAR/M1QN3, 4D-VAR/congrad, analytical, EnSRF) at ``bands`` resolution. On the default branch, also runs ``full`` and ``global`` resolutions.
   * - ``article-uncertainties``
     - 4D-VAR/M1QN3 with Monte Carlo posterior uncertainties, and EnSRF with an extended ensemble, both at ``bands`` resolution (all resolutions on the default branch).

These are the same configurations described in :doc:`../first-inversion/comparing-algorithms`; the CI just runs them all systematically and collects figures.

Outputs produced
================

The test code generates several artefacts in ``figures_artifact/``:

.. list-table::
   :header-rows: 1
   :widths: 45 55

   * - File
     - Contents
   * - ``map_dx_{algo}_{resol}_{niter}.pdf``
     - Map of physical flux increments (prior minus posterior, in physical units)
   * - ``map_dx_scale_{algo}_{resol}_{niter}.pdf``
     - Map of normalised flux increments (in chi space)
   * - ``map_dstd_{algo}_{resol}_{niter}.pdf``
     - Map of uncertainty reduction (only when ``pa_std`` is available)
   * - ``posterior_matrix_{algo}_{resol}_{niter}.pdf``
     - Posterior error covariance matrix (only when ``pa`` is available)
   * - ``prior_matrix_{algo}_{resol}_{niter}.pdf``
     - Prior error covariance matrix

In addition, each inversion writes ``varying_cost_function.txt`` in its ``workdir``, containing the value of the cost function components :math:`J_o` (observation misfit) and :math:`J_b` (background term) as a function of the number of iterations.

Running the benchmark locally
=============================

Because the Toy Gaussian model requires no external data, you can reproduce the full benchmark with a single ``pytest`` command:

.. code-block:: bash

   # Equivalent to the CI "article" job (bands resolution, all algorithms)
   pytest -m "dummy and article and inversion and not adjtltest and not uncertainties and bands"

   # Equivalent to the CI "article-uncertainties" job
   pytest -m "dummy and article and inversion and not adjtltest and uncertainties and bands"

   # Full benchmark (all resolutions), equivalent to the default-branch jobs
   pytest -m "dummy and article and inversion and not adjtltest and not uncertainties"
   pytest -m "dummy and article and inversion and not adjtltest and uncertainties"

The figures are written to ``figures_artifact/`` relative to the repository root. Intermediate YAML files are dumped to ``examples_artifact/dummy/``.

Running a single algorithm
--------------------------

To run just one algorithm, use the YAML examples directly as described in :doc:`../first-inversion/comparing-algorithms`. Prepare ``ref_obsvect`` from a forward run, then:

.. code-block:: bash

   python -m pycif path/to/config_inversion_long_bands_4dvar_M1QN3.yml

Posterior uncertainty estimation
================================

``article`` (no uncertainties)
------------------------------

The standard ``article`` runs use the default uncertainty settings for each algorithm:

* **4D-VAR/congrad**: set ``save_uncertainties: true`` in the minimiser block to enable Lanczos-based posterior variance at no extra cost.
* **4D-VAR/M1QN3**: no built-in uncertainty propagation.
* **Analytical**: the full posterior covariance matrix :math:`P_a = (B^{-1} + H^T R^{-1} H)^{-1}` is computed and stored.
* **EnSRF**: the ensemble spread is the posterior uncertainty.

``article-uncertainties``
-------------------------

The ``uncertainties`` variants add Monte Carlo estimation for the variational methods and an extended ensemble for EnSRF:

**4D-VAR/M1QN3 with Monte Carlo:**

.. code-block:: yaml

   mode:
     plugin:
       name: 4dvar
       version: std
     minimizer:
       plugin:
         name: M1QN3
         version: std
       # ... other minimiser options ...
     montecarlo:
       nsample: 10              # number of perturbed inversions
       perturb_x: true          # perturb prior
       perturb_y: false         # do not perturb observations
       aggregate_results: true

Each Monte Carlo member reruns the full inversion on a perturbed prior. The spread across members estimates the posterior uncertainty. Results are aggregated into the main output NetCDF as ``pa_std``.

**EnSRF with larger ensemble:**

Simply increase ``nsample`` in the EnSRF mode block. More members reduce sampling noise at the cost of proportionally more forward runs.

Key YAML parameters at a glance
===============================

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - YAML key
     - Effect
   * - ``mode/montecarlo/nsample``
     - Number of Monte Carlo members for posterior uncertainty (4D-VAR)
   * - ``mode/minimizer/save_uncertainties``
     - Enable Lanczos posterior variance (congrad only)
   * - ``mode/nsample``
     - Number of ensemble members (EnSRF)
   * - ``datavect/.../hresol``
     - Control-vector resolution: ``hpixels``, ``ibands``, or ``global``
   * - ``mode/minimizer/maxiter``
     - Maximum number of gradient evaluations (4D-VAR)

Going further
=============

* Full description of all algorithms: :doc:`../first-inversion/comparing-algorithms`
* All Toy Gaussian YAML examples: :doc:`/yaml-examples/dummy/index`
* Extending, shortening, or resubmitting an inversion: :doc:`extending`, :doc:`shortening`, :doc:`resubmitting`
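
As a small worked illustration of the analytical posterior covariance formula quoted under "Posterior uncertainty estimation", the NumPy sketch below evaluates :math:`P_a = (B^{-1} + H^T R^{-1} H)^{-1}` on a toy problem. It does not use pyCIF: the matrices ``B``, ``R`` and ``H`` are made-up values chosen for illustration, and the uncertainty reduction is taken as :math:`1 - \sigma_a / \sigma_b`, which is one common definition and is only assumed here (the ``map_dstd`` figures may use a different convention).

.. code-block:: python

   import numpy as np

   # Hypothetical toy problem: 3 control variables, 2 observations.
   B = np.diag([1.0, 4.0, 9.0])        # prior error covariance (assumed values)
   R = np.diag([0.25, 0.25])           # observation error covariance (assumed values)
   H = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 1.0]])     # linear observation operator (assumed values)

   # Analytical posterior covariance: P_a = (B^-1 + H^T R^-1 H)^-1
   Pa = np.linalg.inv(np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H)

   # Standard deviations before and after the inversion
   sigma_b = np.sqrt(np.diag(B))
   sigma_a = np.sqrt(np.diag(Pa))

   # Assumed definition of uncertainty reduction, between 0 and 1
   reduction = 1.0 - sigma_a / sigma_b
   print(reduction)

The first control variable is observed on its own, so its variance drops sharply; the other two are only seen through their sum, which leaves a negative posterior correlation between them in ``Pa``.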