Response functions `response-functions/std`#

Description#

Tutorial: How to run response functions

Computes response functions based on a given observation operator, control vector and observation vector.

It explicitly computes the observation operator $\mathcal{H}(\mathbf{x})$, which is assumed to be linear, by running so-called base functions or response functions.

To do so, it computes $\mathbf{y}_i = \mathcal{H}(\mathbf{x}_i)$, $\forall\, 1 \leq i \leq \mathrm{dim}(\mathbf{x})$, where $\mathbf{x}_i$ is the control vector with all elements set to zero except the $i^\mathrm{th}$ one.

Response functions are computed as individual pyCIF simulations stored in $workdir/base_functions/
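The procedure described above can be illustrated with a minimal, self-contained sketch. It does not use pyCIF: `toy_obs_operator` and `response_functions` are hypothetical names, and the operator is a small hand-written linear map chosen only for illustration. Each response function is obtained by keeping one element of the control vector and zeroing the others; for a linear operator, their sum equals the operator applied to the full control vector.

```python
def toy_obs_operator(x):
    """Hypothetical linear operator: 3 control values -> 2 observations."""
    weights = [
        [1.0, 2.0, 0.0],
        [0.5, 0.0, 3.0],
    ]
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def response_functions(obs_operator, x):
    """Return y_i = H(x_i) for each i, where x_i keeps only the i-th element of x."""
    responses = []
    for i in range(len(x)):
        x_i = [0.0] * len(x)
        x_i[i] = x[i]  # all other elements stay at zero
        responses.append(obs_operator(x_i))
    return responses


x = [2.0, 1.0, 4.0]
responses = response_functions(toy_obs_operator, x)

# Sum the individual responses element-wise to rebuild the full observation vector.
y = [sum(values) for values in zip(*responses)]
```

In this toy case, `y` matches `toy_obs_operator(x)` because the operator is linear; with a non-linear operator the two would generally differ.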

Note

The pyCIF process can be restarted if it stops because one or more response functions crash or do not produce the desired output. It will then rerun the response functions that did not produce the desired output.

See the obsoperator plugin autorestart input argument for further details about restarting pyCIF simulations.

Warning

As this mode requires one simulation per dimension of the control vector, please first check the dimension of your control vector and the time required for each simulation. You can check this by using the dryrun argument (see below).
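As a rough aid for this check, the sketch below estimates the total wall-clock time from the number of response functions (for instance as reported by a dryrun), the duration of one simulation, and the job_batch_size argument. The function name `estimate_wall_time` is hypothetical, and the estimate rests on strong assumptions: all jobs of a batch run fully in parallel, every simulation takes the same time, and queueing time is ignored.

```python
import math


def estimate_wall_time(n_control, hours_per_simulation, job_batch_size=20):
    """Rough wall-clock estimate in hours (assumes perfectly parallel batches)."""
    if n_control < 0 or hours_per_simulation < 0:
        raise ValueError("n_control and hours_per_simulation must be non-negative")
    if job_batch_size <= 0:
        # Zero or negative batch size: all jobs are submitted at the same time.
        n_batches = 1 if n_control else 0
    else:
        # Batches run one after another; the last one may be incomplete.
        n_batches = math.ceil(n_control / job_batch_size)
    return n_batches * hours_per_simulation


# 100 response functions of 2 hours each, in batches of 20 jobs.
total_hours = estimate_wall_time(100, 2.0)
```

On a shared cluster the real duration is usually longer, since jobs of the same batch may not all start at once.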

Outputs#

Observation vector#

The full observation vector is obtained with $\mathbf{y} = \sum_i \mathbf{y}_i$ and is stored in $workdir/obsvect/

The observation vector dump column corresponding to the run_mode argument is filled with $\mathbf{y}$.

If the run_mode argument is set to 'tl' (default) and a reference forward simulation is run, the observation vector dump 'sim' column is filled with the observation vector from the reference forward simulation and the 'sim_tl' column is filled with $\mathbf{y}$.

$\mathbf{H}$ matrix#

The $\mathbf{H}$ matrix is obtained with $\mathbf{H} = \left(\mathbf{y}_1^\mathrm{T}, \, \dots, \, \mathbf{y}_N^\mathrm{T} \right)$ and is stored in $workdir/h_matrix.nc

The $\mathbf{H}$ matrix decomposition per control vector parameter is obtained by picking the $\mathbf{H}$ matrix rows corresponding to each parameter and reshaping the resulting sub-matrix with the parameter dimensions. The decompositions are stored in $workdir/base_functions/decomposition/
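The decomposition step can be pictured with the following sketch, which is not pyCIF code. It assumes, purely for illustration, that the rows of each parameter are contiguous, that parameters appear in the order given, and that each parameter has two dimensions (time periods, then cells). The names `decompose_h_matrix`, `flux` and `inicond` are hypothetical.

```python
def decompose_h_matrix(h_rows, parameter_shapes):
    """Split H rows per parameter and reshape them to (n_periods, n_cells, n_obs)."""
    decomposition = {}
    start = 0
    for name, (n_periods, n_cells) in parameter_shapes.items():
        stop = start + n_periods * n_cells
        block = h_rows[start:stop]  # rows belonging to this parameter
        # Regroup the flat list of rows into one sub-list per time period.
        decomposition[name] = [
            block[t * n_cells:(t + 1) * n_cells] for t in range(n_periods)
        ]
        start = stop
    if start != len(h_rows):
        raise ValueError("parameter shapes do not cover all H rows")
    return decomposition


# Toy H matrix: 6 control elements (rows) and 2 observations (columns).
h_rows = [
    [1.0, 0.0], [0.5, 0.2],  # flux, period 0
    [0.0, 0.3], [0.1, 0.1],  # flux, period 1
    [2.0, 4.0], [3.0, 5.0],  # inicond, single period
]
parts = decompose_h_matrix(h_rows, {"flux": (2, 2), "inicond": (1, 2)})
```

Here `parts["flux"][1][0]` holds the sensitivities of both observations to the first flux cell during the second period.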

YAML arguments#

The following arguments are used to configure the plugin. pyCIF will raise an exception at initialization if mandatory arguments are not specified, or if any argument does not match the accepted values or type:

dryrun : bool, optional, default False: Create all response function input files, then stop. This option can be used to find out the number of response functions.

run_mode : “fwd” or “tl”, optional, default “tl”: Run mode of the response functions. If "tl" (tangent linear) is chosen and use_model_approximation is set to true, a reference forward simulation will be run.

autoflush : bool, optional, default False: Flush temporary files that are not already flushed by the model plugin flushrun method.

reload_results : bool, optional, default True: Reload response function results from previous simulations. If set to true, already computed simulations will not be run again. Affects both the reference forward simulation (if any) and the response function simulations.

reload_h_matrix : “str or list of str”, optional: Reload the H matrix from previous simulations. If this argument is used, the computation of the response functions will be skipped and the H matrix will be read from the provided path(s). If multiple paths are provided, the H matrices will be summed.

clamp_h_matrix_to_zero : bool, optional, default True: Ensure all H matrix elements are non-negative by clamping negative values to zero.

analytical_inversion : bool, optional, default False: Do an analytical inversion with the H matrix built from the response function results.

use_woodbury_identity : “bool or ‘auto’”, optional, default “auto”: Use the Woodbury matrix identity to compute the inverse of $\left( \mathbf{R} + \mathbf{H}\mathbf{B}\mathbf{H}^T \right)$. This significantly decreases computation time when $\mathrm{dim}(\mathbf{R}) \gg \mathrm{dim}(\mathbf{B})$ and significantly increases it otherwise. When this option is set to “auto”, the method is chosen according to the dimensions of $\mathbf{R}$ and $\mathbf{B}$.

full_period : bool, optional, default False: Run the response functions over the whole simulation window. This argument cannot be set to true if the ‘spin_down’ argument is used.

spin_down : str, optional: Spin-down period of the response functions. Should be a valid pandas period alias (1D, 1M, …). The spin-down value can be set here globally for all response functions, or individually by setting a control vector option for specific tracers in the data vector (the latter takes priority when both are used).

first_period_only : bool, optional, default False: Only run the response functions that correspond to the first time period in the control vector. This option can be used to measure the computing time required to run the response functions over one period, or to get the ‘relaxation’ time of one period. This option cannot be used with full_period = true.

inicond_component : str, optional, default “inicond”: Initial conditions datavect component name

ignore_tracers : list, optional: List of datavect (component, parameter) pairs to ignore. The response functions corresponding to ignored parameters will not be run and their outputs will be filled with zeros.

job_batch_size : int, optional, default 20: Size of the job batches to submit; pyCIF waits for one batch to finish before submitting the next one. If this option is set to zero or a negative number, all jobs will be submitted at the same time (not recommended). When running jobs in a subprocess, setting this option to 1 will trigger the cleaning of the temporary files after every job.

pseudo_parallel_job : bool, optional, default False

Run the job batches (of size job_batch_size) in “pseudo parallel” mode, i.e. with a job file of the following format:

python -m pycif config_a.yaml &
python -m pycif config_b.yaml &
python -m pycif config_c.yaml &
wait

use_batch_sampling : bool, optional, default False: Group response functions per time period and run them with the observation operator ‘batch_computation’ mode.

batch_sampling_size : int, optional: Maximum size for the batch sampling batches

separate_parameters : bool, optional, default False: Separate response functions by observation parameters. This option can only be used with use_batch_sampling = True

independant_parameters : bool, optional, default False: If true, parameters (species) are considered “independent”, i.e. one response function will only affect the parameter of its control vector tracer and/or the parameters resulting from the control vector transformations (transform_pipe) that take the response function control vector tracer as input. This option can help reduce the number of samples actually present in batch sampling response function simulations. This option can only be used with use_batch_sampling = True.

use_model_approximation : bool, optional, default False: Use the approximation of the model tangent operator for response functions. If use_model_approximation is set to true, run_mode must be set to "tl".

run_reference_forward : bool, optional, default False: Run a reference forward run and fill the observation vector y ('sim') field with the results. This option cannot be used with run_mode = 'fwd'.

dump_sparse_arrays : bool, optional, default False: Use COOrdinates sparse arrays in dumped NetCDF files. The pycif.utils.sparse_array.to_dense_dataset function can be used to convert the sparse NetCDF files data to dense arrays

dump_obsvect_decompostion : bool, optional, default False: Dump observation vector decomposition by response function
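For reference, the Woodbury matrix identity mentioned under use_woodbury_identity can be written as follows, assuming $\mathbf{R}$ and $\mathbf{B}$ are invertible. This is the standard textbook form of the identity, not necessarily the exact expression evaluated by pyCIF:

$$
\left( \mathbf{R} + \mathbf{H}\mathbf{B}\mathbf{H}^T \right)^{-1}
= \mathbf{R}^{-1}
- \mathbf{R}^{-1}\mathbf{H}
\left( \mathbf{B}^{-1} + \mathbf{H}^T\mathbf{R}^{-1}\mathbf{H} \right)^{-1}
\mathbf{H}^T\mathbf{R}^{-1}
$$

The matrix inverted inside the right-hand side has the dimension of $\mathbf{B}$ rather than that of $\mathbf{R}$. This is consistent with the gain reported when $\mathrm{dim}(\mathbf{R}) \gg \mathrm{dim}(\mathbf{B})$, provided $\mathbf{R}^{-1}$ itself is cheap to apply (for example when $\mathbf{R}$ is diagonal, which is an assumption here).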

Requirements#

The current plugin requires the following plugins to run properly:

Requirement name	Requirement type	Explicit definition	Any valid	Default name	Default version
platform	Platform	True	True	None	None
model	Model	False	True	None	None
obsoperator	ObsOperator	True	True	standard	std
obsvect	ObsVect	False	True	standard	std
controlvect	ControlVect	True	True	standard	std
datavect	DataVect	True	True	standard	std

YAML template#

Please find below a template for a YAML configuration:

mode:
  plugin:
    name: response-functions
    version: std
    type: mode

  # Optional arguments
  dryrun: XXXXX  # bool
  run_mode: XXXXX  # fwd|tl
  autoflush: XXXXX  # bool
  reload_results: XXXXX  # bool
  reload_h_matrix: XXXXX  # str or list of str
  clamp_h_matrix_to_zero: XXXXX  # bool
  analytical_inversion: XXXXX  # bool
  use_woodbury_identity: XXXXX  # bool or 'auto'
  full_period: XXXXX  # bool
  spin_down: XXXXX  # str
  first_period_only: XXXXX  # bool
  inicond_component: XXXXX  # str
  ignore_tracers: XXXXX  # list
  job_batch_size: XXXXX  # int
  pseudo_parallel_job: XXXXX  # bool
  use_batch_sampling: XXXXX  # bool
  batch_sampling_size: XXXXX  # int
  separate_parameters: XXXXX  # bool
  independant_parameters: XXXXX  # bool
  use_model_approximation: XXXXX  # bool
  run_reference_forward: XXXXX  # bool
  dump_sparse_arrays: XXXXX  # bool
  dump_obsvect_decompostion: XXXXX  # bool
