Modifying a model to use with EnSRF

In the following tutorial, one will learn how to modify a model plugin to make it work in batch mode with the EnSRF running mode.

Set-up a working case with independent sampling

The EnSRF methods rely on generating a random ensemble to approximately solve the inversion equations. There are two possible methods to:

  • batch_sampling: run all samples altogether into a single job; to do so, the control vector is perturbed using virtual species that are propagating all the way to the observation equivalents in the observation operator;

    Basically, for a case when CH4 emissions are optimized, the batch_sampling method will generate an ensemble (flux, CH4_1), …, (flux, CH4_N) of perturbed inputs. Then, perturbations will be propagated through all pyCIF operations, including the CTM. The outputs will be (concs, CH4_1), …, (concs, CH4_N). This will be re-composed into separated monitor.nc files for each samples.

  • parallel_sampling: the ensemble is generated by the master job; then slave jobs are run independently for each perturb control vector. And corresponding monitor.nc files are generated and processed by the master job

The parallel_sampling mode do not need any modification to work. One can test it directly with a working inversion test case.

Examples of inversion configuration files can be found :doc`here </yaml_examples/>`. Therein, one can look for EnSRF cases for the dummy model. General inversion configurations can be found in all sub-directories as adjtltest examples.

Implementing the batch_sampling for a specific model

Defining a correspondence between perturbed outputs and inputs

For a given configuration to be computed in batch mode, pyCIF perturbs backwards transforms along the transform pipe (see details here) from the observation vector backwards to the control vector and more generally to inputs.

For any parameter in the observation vector, e.g., (‘concs’, ‘CH4’), pyCIF will create virtual equivalents to be computed in parallel: (‘concs’, ‘CH4__sample#000’), (‘concs’, ‘CH4__sample#001’), …, (‘concs’, ‘CH4__sample#00N’). The mapper of every transform in the transform pipe is thus perturbed according to the perturbed inputs of the transform successors. To perturb inputs depending on perturbed outputs, there are two possibilities:

  1. inputs and outputs have the same name, thus perturbations are directly propagated

  2. if input names differ from output names, pyCIF needs extra information to now which inputs should propagate the perturbations. This is done by defining the variable outputs2inputs in the transform mapper, defined in the function ini_mapper. This is valid for any transform, and any model.

    The variable outputs2inputs is defined as a dictionary for which each entry is an unperturbed output, e.g., (‘concs’, ‘CH4’), and the values for each entry are the corresponding inputs to be perturbed.

    For instance, CHIMERE runs with the output (‘concs’, ‘CH4’) and the inputs (‘flux’, ‘CH4’), (‘inicond’, ‘CH4’), (‘latcond’, ‘CH4’), (‘topcond’, ‘CH4’), (‘meteo’, ‘’).

    To produce perturbed outputs, the model will need perturbed inputs for (‘flux’, ‘CH4’), (‘inicond’, ‘CH4’), (‘latcond’, ‘CH4’), (‘topcond’, ‘CH4’), but not for (‘meteo’, ‘’). Therefore, the variable outputs2inputs will look like:

    mapper["outputs2inputs"] = {
        ("concs", s): [("flux", s)]
        for s in model.chemistry.acspecies.attributes
    }
    

    Warning

    The variable outputs2inputs can have a complex behaviour in the case of complex chemical schemes, or when models are flexible in terms of needed inputs to transport a given species. Defining this variable thus need thorough though before starting. Do not hesitate to contact the core team to help you

Perturbing the behaviour of the model

To allow for the batch_sampling to work, pyCIF should be able to modify the behaviour of the model. A dedicated function perturb_model should be added to the model plugin to do so.

Usually, the perturb_model function should modify the underlying chemistry plugin to update, e.g., acspecies, emis_species, etc.

One can find an example of such a function here:

pycif.plugins.models.chimere.perturb_model(self, nsamples, transf_mapper)[source]