Response functions response-functions/std#
Description#
Tutorial: How to run response functions
Computes response functions based on a given observation operator, control vector and observation vector.
It explicitly computes the observation operator \(\mathcal{H}\), which is assumed to be linear, by running so-called base functions or response functions.
To do so, it computes \(\mathbf{y}_i = \mathcal{H}(\mathbf{x}_i)\), \(\forall\, 1 \leq i \leq \mathrm{dim}(\mathbf{x})\), where \(\mathbf{x}_i\) is the control vector with all elements set to zero except the \(i^\mathrm{th}\) one.
Response functions are computed as individual pyCIF simulations stored in $workdir/base_functions/.
Note
The pyCIF process can be restarted if it stops because one or more response functions crashed or did not produce the desired output. It will then rerun the response functions that did not produce the desired output.
See the obsoperator plugin autorestart input argument for further details about restarting pyCIF simulations.
Warning
As this mode needs one simulation per dimension of the control vector, please first check the dimension of your control vector and the time required for each simulation.
You can check this by using the dryrun argument (see below).
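As a minimal sketch of such a check, the mode section below only enables dryrun, so that the input files of all response functions are created and pyCIF stops before running them. The plugin block follows the YAML template of this page; the rest of the configuration (platform, obsoperator, controlvect, datavect, …) is assumed to be defined elsewhere in the same file and is not shown.

```yaml
mode:
  plugin:
    name: response-functions
    version: std
    type: mode

  # Create all response function input files, then stop
  dryrun: true
```

The number of response functions to run can then be read from what was created in $workdir/base_functions/.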
Outputs#
Observation vector#
The full observation vector is obtained with \(\mathbf{y} = \sum_i \mathbf{y}_i\) and is stored in $workdir/obsvect/.
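The sum follows directly from the linearity assumed for \(\mathcal{H}\). The short derivation below is only a reminder of that reasoning; it takes \(N = \mathrm{dim}(\mathbf{x})\) and uses the fact that each \(\mathbf{x}_i\) keeps only the \(i^\mathrm{th}\) element of the control vector, so that the \(\mathbf{x}_i\) add up to \(\mathbf{x}\).

```latex
\[
\mathbf{x} = \sum_{i=1}^{N} \mathbf{x}_i
\quad\Longrightarrow\quad
\mathcal{H}(\mathbf{x})
= \mathcal{H}\!\left(\sum_{i=1}^{N} \mathbf{x}_i\right)
= \sum_{i=1}^{N} \mathcal{H}(\mathbf{x}_i)
= \sum_{i=1}^{N} \mathbf{y}_i
= \mathbf{y}
\]
```

If the observation operator is not linear, the second equality no longer holds and the summed response functions only approximate a full forward simulation.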
The observation vector dump column corresponding to the run_mode argument is filled with \(\mathbf{y}\).
If the run_mode argument is set to 'tl' (default) and a reference forward simulation is run, the observation vector dump 'sim' column is filled with the observation vector from the reference forward simulation and the 'sim_tl' column is filled with \(\mathbf{y}\).
\(\mathbf{H}\) matrix#
The \(\mathbf{H}\) matrix is obtained with
\(\mathbf{H} = \left(\mathbf{y}_1^\mathrm{T}, \, \dots, \, \mathbf{y}_N^\mathrm{T} \right)\)
and is stored in $workdir/h_matrix.nc.
The \(\mathbf{H}\) matrix decomposition per control vector parameter is obtained by picking the \(\mathbf{H}\) matrix rows corresponding to each parameter and reshaping the resulting sub-matrix with the parameter dimensions.
The decompositions are stored in $workdir/base_functions/decomposition/.
YAML arguments#
The following arguments are used to configure the plugin. pyCIF raises an exception at initialization if mandatory arguments are not specified, or if any argument does not match the accepted values or type:
- dryrun : bool, optional, default False
Create all response function input files, then stop. This option can be used to find out the number of response functions.
- run_mode : “fwd” or “tl”, optional, default “tl”
Run mode of the response functions. If "tl" (tangent linear) is chosen and use_model_approximation is set to true, a reference forward simulation will be run.
- autoflush : bool, optional, default False
Flush temporary files that are not already flushed by the model plugin flushrun method.
- reload_results : bool, optional, default True
Reload response function results from previous simulations. If set to true, simulations that have already been computed will not be run again. This affects both the reference forward simulation, if any, and the response function simulations.
- reload_h_matrix : “str or list of str”, optional
Reload the H matrix from previous simulations. If this argument is used, the computation of the response functions is skipped and the H matrix is read from the provided path(s). If multiple paths are provided, the H matrices are summed.
- clamp_h_matrix_to_zero : bool, optional, default True
Ensure that no H matrix element is negative by clamping negative values to zero.
- analytical_inversion : bool, optional, default False
Perform an analytical inversion with the H matrix built from the response function results.
- use_woodbury_identity : “bool or ‘auto’”, optional, default “auto”
Use the Woodbury matrix identity to compute the inverse of \(\left( \mathbf{R} + \mathbf{H}\mathbf{B}\mathbf{H}^T \right)\). This decreases the computation time significantly when \(\mathrm{dim}(\mathbf{R}) \gg \mathrm{dim}(\mathbf{B})\), and increases it significantly otherwise. When this option is set to “auto”, the method is chosen according to the dimensions of \(\mathbf{R}\) and \(\mathbf{B}\).
- full_period : bool, optional, default False
Run the response functions over the whole simulation window. This argument cannot be set to true if the spin_down argument is used.
- spin_down : str, optional
Spin-down period of the response functions. Should be a valid pandas period alias (1D, 1M, …). The spin-down value can be set here globally for all response functions, or individually by setting a control vector option for tracers in the data vector (the latter takes priority when both are used).
- first_period_only : bool, optional, default False
Only run the response functions that correspond to the first time period in the control vector. This option can be used to measure the computing time required to run the response functions over one period, or to get the ‘relaxation’ time of one period. This option cannot be used with full_period = true.
- inicond_component : str, optional, default “inicond”
Initial conditions datavect component name
- ignore_tracers : list, optional
List of datavect (component, parameter) couples to ignore. The response functions corresponding to ignored parameters will not be run and their outputs will be filled with zeros.
- job_batch_size : int, optional, default 20
Size of the job batches to submit; pyCIF waits for one batch to finish before submitting the next one. If this option is set to zero or a negative number, all jobs are submitted at the same time (not recommended). When running jobs in a subprocess, setting this option to 1 triggers the cleaning of the temporary files after every job.
- pseudo_parallel_job : bool, optional, default False
Run the job batches (of size job_batch_size) in “pseudo parallel” mode, i.e. with a job file of the following format:
    python -m pycif config_a.yaml &
    python -m pycif config_b.yaml &
    python -m pycif config_c.yaml &
    wait
- use_batch_sampling : bool, optional, default False
Group response functions by time period and run them with the observation operator ‘batch_computation’ mode.
- batch_sampling_size : int, optional
Maximum size of the batch sampling batches.
- separate_parameters : bool, optional, default False
Separate response functions by observation parameters. This option can only be used with use_batch_sampling = True.
- independant_parameters : bool, optional, default False
If true, parameters (species) are considered “independent”, i.e. a response function only affects the parameter of its control vector tracer and/or the parameters resulting from the control vector transformations (transform_pipe) that take the response function control vector tracer as input. This option can help reduce the number of samples actually present in batch sampling response function simulations. This option can only be used with use_batch_sampling = True.
- use_model_approximation : bool, optional, default False
Use the approximation of the model tangent operator for response functions. If use_model_approximation is set to true, run_mode must be set to "tl".
- run_reference_forward : bool, optional, default False
Run a reference forward simulation and fill the observation vector y ('sim') field with the results. This option cannot be used with run_mode = 'fwd'.
- dump_sparse_arrays : bool, optional, default False
Use COOrdinate (COO) sparse arrays in dumped NetCDF files. The pycif.utils.sparse_array.to_dense_dataset function can be used to convert the sparse NetCDF file data to dense arrays.
- dump_obsvect_decompostion : bool, optional, default False
Dump the observation vector decomposition by response function.
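For reference, the Woodbury matrix identity mentioned for use_woodbury_identity can be written as follows in the notation of this page. This is the textbook form of the identity; whether the plugin evaluates exactly this expression is an assumption, as the source does not say.

```latex
\[
\left( \mathbf{R} + \mathbf{H}\mathbf{B}\mathbf{H}^T \right)^{-1}
= \mathbf{R}^{-1}
- \mathbf{R}^{-1}\mathbf{H}
  \left( \mathbf{B}^{-1} + \mathbf{H}^T\mathbf{R}^{-1}\mathbf{H} \right)^{-1}
  \mathbf{H}^T\mathbf{R}^{-1}
\]
```

The matrix inverted inside the right-hand side has the dimension of \(\mathbf{B}\), while the left-hand side requires inverting a matrix with the dimension of \(\mathbf{R}\). The right-hand side is therefore cheaper when \(\mathrm{dim}(\mathbf{R}) \gg \mathrm{dim}(\mathbf{B})\), provided \(\mathbf{R}^{-1}\) and \(\mathbf{B}^{-1}\) are themselves inexpensive to obtain (for example if \(\mathbf{R}\) is diagonal, which is an assumption here).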
Requirements#
This plugin requires the following plugins to run properly:
Requirement name | Requirement type | Explicit definition | Any valid | Default name | Default version |
---|---|---|---|---|---|
platform | | True | True | None | None |
model | | False | True | None | None |
obsoperator | | True | True | standard | std |
obsvect | | False | True | standard | std |
controlvect | | True | True | standard | std |
datavect | | True | True | standard | std |
YAML template#
Please find below a template for a YAML configuration:
mode:
  plugin:
    name: response-functions
    version: std
    type: mode

  # Optional arguments
  dryrun: XXXXX  # bool
  run_mode: XXXXX  # fwd|tl
  autoflush: XXXXX  # bool
  reload_results: XXXXX  # bool
  reload_h_matrix: XXXXX  # str or list of str
  clamp_h_matrix_to_zero: XXXXX  # bool
  analytical_inversion: XXXXX  # bool
  use_woodbury_identity: XXXXX  # bool or 'auto'
  full_period: XXXXX  # bool
  spin_down: XXXXX  # str
  first_period_only: XXXXX  # bool
  inicond_component: XXXXX  # str
  ignore_tracers: XXXXX  # list
  job_batch_size: XXXXX  # int
  pseudo_parallel_job: XXXXX  # bool
  use_batch_sampling: XXXXX  # bool
  batch_sampling_size: XXXXX  # int
  separate_parameters: XXXXX  # bool
  independant_parameters: XXXXX  # bool
  use_model_approximation: XXXXX  # bool
  run_reference_forward: XXXXX  # bool
  dump_sparse_arrays: XXXXX  # bool
  dump_obsvect_decompostion: XXXXX  # bool
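As an illustration of how the template might be filled in, the sketch below configures a tangent-linear run with a spin-down period, batched job submission and an analytical inversion. All values are illustrative assumptions chosen to respect the constraints listed in the arguments section (full_period is left at its default because spin_down is used); they are not recommended settings, and the plugins listed under Requirements are assumed to be configured elsewhere in the same file.

```yaml
mode:
  plugin:
    name: response-functions
    version: std
    type: mode

  run_mode: tl                 # default; response functions run in tangent-linear mode
  reload_results: true         # skip simulations already computed in a previous run
  spin_down: "1M"              # illustrative pandas period alias; not compatible with full_period: true
  job_batch_size: 20           # default; submit jobs in batches of 20
  analytical_inversion: true   # invert with the H matrix built from the response functions
  use_woodbury_identity: auto  # let pyCIF choose according to the dimensions of R and B
```

With reload_results left at true, relaunching pyCIF with the same configuration after a crash should only rerun the simulations that did not produce their output.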