Prepare the YAML configuration file#

Prepare the control vector#

The response-functions mode runs one simulation per element of the control vector. For each tracer, the number of elements, and therefore of simulations, is set by its horizontal, vertical, and temporal resolutions, defined respectively by the hresol, vresol, and tresol arguments described in the control vector plugin documentation.

The control vector must therefore be small enough for the total computation time to remain manageable. A typical variational inversion control vector at pixel-level horizontal resolution is likely far too large for the response-functions mode.
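To get a feel for how quickly the number of simulations grows, the following minimal sketch counts control vector elements for a made-up configuration. It assumes, for illustration only, that each tracer contributes horizontal × vertical × temporal elements; the tracer names, the counts, and the count_simulations helper are hypothetical and not part of pyCIF.

```python
from math import prod

# Hypothetical layout: for each tracer, the number of horizontal regions,
# vertical levels, and time periods (illustrative values only).
control_vector_layout = {
    "CH4": {"horizontal": 10, "vertical": 1, "temporal": 12},
    "CO2": {"horizontal": 5, "vertical": 1, "temporal": 4},
}


def count_simulations(layout):
    """Return the total number of response-function simulations.

    Assumption: each tracer contributes
    horizontal x vertical x temporal control vector elements.
    """
    return sum(prod(dims.values()) for dims in layout.values())


n_simulations = count_simulations(control_vector_layout)
print(n_simulations)  # 10*1*12 + 5*1*4 = 140
```

Under the same assumption, replacing the 10 regions with a 100 × 100 pixel grid would give 120,000 simulations for that tracer alone, which illustrates why pixel-level resolution is usually impractical here.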

See "Doing a dry run" to determine the size of the control vector with the response-functions mode.

Note

It is recommended to set the observation operator plugin autoflush argument (and optionally force-full-flush) to limit disk space usage during the simulations.

For response functions, a per-region horizontal resolution is typically used. Here is a very simple example:

Common options for the response-functions mode#

Run mode#

Response functions can be run in either forward (fwd) or tangent-linear (tl) mode. The mode is selected with the run_mode input argument.

Note

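Running response functions in forward mode is generally not meaningful when the model is non-linear, because the responses to individual control vector elements can then no longer be summed to reproduce the full simulation.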
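The effect of non-linearity can be seen in a toy example. The sketch below compares the sum of individual responses with the response to all inputs together, for one linear and one non-linear function. Both functions are hypothetical stand-ins and have nothing to do with the models interfaced with pyCIF.

```python
def linear_model(x1, x2):
    """Toy linear model: superposition holds."""
    return 2.0 * x1 + 3.0 * x2


def nonlinear_model(x1, x2):
    """Toy non-linear model: superposition does not hold."""
    return (x1 + x2) ** 2


def sum_of_responses(model):
    """Run the model once per input element and add up the responses."""
    return model(1.0, 0.0) + model(0.0, 1.0)


# Difference between the full run and the sum of single-element runs.
linear_gap = linear_model(1.0, 1.0) - sum_of_responses(linear_model)
nonlinear_gap = nonlinear_model(1.0, 1.0) - sum_of_responses(nonlinear_model)
print(linear_gap, nonlinear_gap)  # 0.0 2.0
```

The tangent-linear mode avoids this mismatch by construction, since it propagates perturbations through a linearized version of the model.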

Period length#

By default, the simulation window for each response function equals the time period of its corresponding control vector element.

Since a control vector element can affect simulated values at observation points beyond its own time period, the simulation window can be extended using the spin_down argument.

Note

Response functions for multiple independent species with different lifetimes can be run in separate pyCIF simulations with different spin_down values. The results can then be combined afterwards.

Alternatively, all response function simulation windows can be set to the full main simulation window (defined by datei and datef in the YAML configuration file) using the full_period argument.
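The three cases (default, spin_down, and full_period) can be sketched as follows. This is only an illustration of the logic described above: the simulation_window helper is hypothetical, spin_down is represented as a timedelta for convenience (the format expected in the YAML file may differ), and capping the window at datef is an assumption.

```python
from datetime import datetime, timedelta


def simulation_window(elem_start, elem_end, datei, datef,
                      spin_down=timedelta(0), full_period=False):
    """Return (start, end) of the window for one response function.

    Illustrative only. Assumes spin_down extends the end of the window
    and that the window never runs past datef.
    """
    if full_period:
        # Every response function covers the whole main simulation window.
        return datei, datef
    return elem_start, min(elem_end + spin_down, datef)


datei, datef = datetime(2020, 1, 1), datetime(2021, 1, 1)
january = (datetime(2020, 1, 1), datetime(2020, 2, 1))

default_window = simulation_window(*january, datei, datef)
extended_window = simulation_window(*january, datei, datef,
                                    spin_down=timedelta(days=60))
full_window = simulation_window(*january, datei, datef, full_period=True)

print(default_window[1])   # 2020-02-01 00:00:00
print(extended_window[1])  # 2020-04-01 00:00:00
print(full_window[1])      # 2021-01-01 00:00:00
```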

Job submission#

Response function simulations are independent pyCIF jobs.

Warning

Depending on the platform plugin used, these jobs may be submitted automatically to the platform scheduler. Use the job_batch_size argument to avoid submitting hundreds or thousands of jobs simultaneously.
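The idea behind limiting the batch size can be sketched in a few lines. The example below simply splits a list of job names into chunks; it is not pyCIF's submission logic, and the batched helper and job names are hypothetical.

```python
from itertools import islice


def batched(jobs, batch_size):
    """Yield successive lists of at most batch_size jobs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(jobs)
    # Keep pulling chunks until the iterator is exhausted.
    while batch := list(islice(iterator, batch_size)):
        yield batch


# Ten hypothetical response-function jobs, submitted four at a time.
jobs = [f"response_function_{i:04d}" for i in range(10)]
batches = list(batched(jobs, 4))
print([len(batch) for batch in batches])  # [4, 4, 2]
```

A scheduler-friendly workflow would typically wait for one batch to finish before submitting the next; how pyCIF handles this depends on the platform plugin.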

Batch sampling#

With compatible models, response functions that share the same simulation window can be grouped using batch sampling. This can significantly reduce the computation time and resources required. Set the use_batch_sampling argument to enable this behavior.
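As an illustration of the grouping step only, the sketch below collects response functions that share a simulation window. The names, dates, and group_by_window helper are hypothetical; how grouped simulations are actually run depends on the model plugin.

```python
from collections import defaultdict

# Hypothetical response functions: (name, window_start, window_end).
response_functions = [
    ("region_A_jan", "2020-01-01", "2020-02-01"),
    ("region_B_jan", "2020-01-01", "2020-02-01"),
    ("region_A_feb", "2020-02-01", "2020-03-01"),
]


def group_by_window(functions):
    """Map each (start, end) window to the response functions sharing it."""
    groups = defaultdict(list)
    for name, start, end in functions:
        groups[(start, end)].append(name)
    return dict(groups)


groups = group_by_window(response_functions)
print(len(groups))  # 2 model runs instead of 3
```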

Select outputs#

By default, the response-functions mode only saves the observation vector and the \(\mathbf{H}\) matrix. Additional outputs can be saved by setting the following arguments to true:

  • dump_obsvect_decompostion: dumps the \(\mathbf{H}\) matrix decomposition per control vector parameter, to get the contribution of each control vector element to each observation vector element.

  • analytical_inversion: in addition to performing an analytical inversion, dumps the \(\mathbf{B}\) and \(\mathbf{R}\) matrices, and the \(\mathbf{x}^b\) and \(\mathbf{y}\) vectors.
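To show how these outputs relate to one another, here is a small NumPy sketch on toy data. It uses the standard analytical solution \(\mathbf{x}^a = \mathbf{x}^b + \mathbf{B}\mathbf{H}^T(\mathbf{H}\mathbf{B}\mathbf{H}^T + \mathbf{R})^{-1}(\mathbf{y} - \mathbf{H}\mathbf{x}^b)\); whether pyCIF computes it in exactly this form, and how the dumped decomposition is laid out on disk, are not stated here. All values are made up.

```python
import numpy as np

# Toy problem (illustrative): 3 observations, 2 control vector elements.
H = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 2.0]])       # one column per response function
x_b = np.array([1.0, 2.0])       # prior control vector
y = np.array([1.2, 1.4, 4.4])    # observations
B = np.diag([0.25, 0.25])        # prior error covariance
R = np.diag([0.01, 0.01, 0.01])  # observation error covariance

# Contribution of each control vector element to each simulated observation:
# column j of H scaled by x_b[j]. Summing over columns gives H @ x_b.
contributions = H * x_b

# Analytical solution: x_a = x_b + B H^T (H B H^T + R)^-1 (y - H x_b).
innovation = y - H @ x_b
x_a = x_b + B @ H.T @ np.linalg.solve(H @ B @ H.T + R, innovation)
print(x_a)
```

Using np.linalg.solve avoids forming the inverse of \(\mathbf{H}\mathbf{B}\mathbf{H}^T + \mathbf{R}\) explicitly, which is generally more stable numerically.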

When dump_sparse_arrays is set to true, matrices are converted to sparse format before saving, which can save significant disk space.

Note

Sparse matrices contained in NetCDF files can be converted back to dense matrices with the function pycif.utils.sparse_array.to_dense_dataset.

Example:

import xarray as xr
from pycif.utils.sparse_array import to_dense_dataset

# Open the NetCDF file holding the sparse matrix and convert it to dense form.
with xr.open_dataset("/.../h_matrix.nc") as sparse_ds:
    dense_ds = to_dense_dataset(sparse_ds)

Advanced options for the response-functions mode#

Using an approximated model operator#

Some models, such as LMDZ, offer options to approximate the model operator. To use an approximated model operator with the response-functions mode, set the use_model_approximation argument to true.

Multiple species with different lifetimes#

  • ignore_tracers option

  • reload H option