################################### Modifying a model to use with EnSRF ################################### .. role:: bash(code) :language: bash In the following tutorial, one will learn how to modify a model plugin to make it work in batch mode with the :doc:`EnSRF running mode `. Set-up a working case with independent sampling ================================================ The EnSRF methods rely on generating a random ensemble to approximately solve the inversion equations. There are two possible methods to: - `batch_sampling`: run all samples altogether into a single job; to do so, the control vector is perturbed using virtual species that are propagating all the way to the observation equivalents in the observation operator; Basically, for a case when CH4 emissions are optimized, the `batch_sampling` method will generate an ensemble `(flux, CH4_1)`, ..., `(flux, CH4_N)` of perturbed inputs. Then, perturbations will be propagated through all pyCIF operations, including the CTM. The outputs will be `(concs, CH4_1)`, ..., `(concs, CH4_N)`. This will be re-composed into separated `monitor.nc` files for each samples. - `parallel_sampling`: the ensemble is generated by the master job; then slave jobs are run independently for each perturb control vector. And corresponding `monitor.nc` files are generated and processed by the master job The `parallel_sampling` mode do not need any modification to work. One can test it directly with a working inversion test case. Examples of inversion configuration files can be found :doc:`here `. Therein, one can look for EnSRF cases for the `dummy` model. General inversion configurations can be found in all sub-directories as `adjtltest` examples. Implementing the `batch_sampling` for a specific model ====================================================== Defining a correspondence between perturbed outputs and inputs -------------------------------------------------------------- For a given configuration to be computed in batch mode, pyCIF perturbs backwards transforms along the transform pipe (see details :doc:`here `) from the observation vector backwards to the control vector and more generally to inputs. For any parameter in the observation vector, e.g., `('concs', 'CH4')`, pyCIF will create virtual equivalents to be computed in parallel: `('concs', 'CH4__sample#000')`, `('concs', 'CH4__sample#001')`, ..., `('concs', 'CH4__sample#00N')`. The mapper of every transform in the transform pipe is thus perturbed according to the perturbed inputs of the transform successors. To perturb inputs depending on perturbed outputs, there are two possibilities: 1. inputs and outputs have the same name, thus perturbations are directly propagated 2. if input names differ from output names, pyCIF needs extra information to know which inputs should propagate the perturbations. This is done by defining the variable `outputs2inputs` in the transform mapper, defined in the function `ini_mapper`. This is valid for any transform, and any model. The variable `outputs2inputs` is defined as a dictionary for which each entry is an unperturbed output, e.g., `('concs', 'CH4')`, and the values for each entry are the corresponding inputs to be perturbed. For instance, CHIMERE runs with the output `('concs', 'CH4')` and the inputs `('flux', 'CH4')`, `('inicond', 'CH4')`, `('latcond', 'CH4')`, `('topcond', 'CH4')`, `('meteo', '')`. To produce perturbed outputs, the model will need perturbed inputs for `('flux', 'CH4')`, `('inicond', 'CH4')`, `('latcond', 'CH4')`, `('topcond', 'CH4')`, but not for `('meteo', '')`. Therefore, the variable `outputs2inputs` will look like: .. code-block:: python mapper["outputs2inputs"] = { ("concs", s): [("flux", s)] for s in model.chemistry.acspecies.attributes } .. warning:: The variable `outputs2inputs` can have a complex behaviour in the case of complex chemical schemes, or when models are flexible in terms of needed inputs to transport a given species. Defining this variable thus need thorough though before starting. Do not hesitate to contact the core team to help you Perturbing the behaviour of the model ------------------------------------- To allow for the `batch_sampling` to work, pyCIF should be able to modify the behaviour of the model. A dedicated function `perturb_model` should be added to the model plugin to do so. Usually, the `perturb_model` function should modify the underlying `chemistry` plugin to update, e.g., `acspecies`, `emis_species`, etc. One can find an example of such a function here: .. module:: pycif.plugins.models.chimere :noindex: .. autofunction:: perturb_model :noindex: Creating a restart_inicond component -------------------------------------------------------------- At the beginning of a simulation, pyCIF typically reads a prescribed file with initial conditions for the transported species (`inicond`). With the EnSRF mode, the first cycle starts from these prescribed initial conditions. However, the subsequent cycles must restart from the posterior concentrations of the previous cycle. To this end, when starting the next cycles, pyCIF adds a component `restart_inicond` to the Yaml configuration file controlling the cycle. In addition, it adds a parameter called `ensrf_restart_file` providing the path to the restart file to use and it also removes any occurence of the `inicond` component. This `restart_inicond` component must be accepted by the model and processed the same way the restart files are processed for forward simulations. Some model routines must be slightly modified to ensure this: 1. In the function `ini_mapper`, the `restart_inicond` component must be added: .. code-block:: python if getattr(model, "ensrf_restart_file", False): for spec in model.chemistry.acspecies.attributes: mapper["inputs"][('restart_inicond', spec)] = { "input_dates": {model.datei: np.array([[model.datei, model.datei]])}, "domain": domain, "force_dump": True, "sparse_data": False, "break_fwd_onlyinit_pipe": False, "break_adj_onlyinit_pipe": False } else: for spec in model.chemistry.acspecies.attributes: mapper["inputs"][('inicond', spec)] = { "input_dates": {model.datei: np.array([[model.datei, model.datei]])}, "domain": domain, "force_dump": True, "sparse_data": False, "break_fwd_onlyinit_pipe": False, "break_adj_onlyinit_pipe": False } 2. In the function `native2inputs`, the `restart_inicond` component type must be properly handled. Note that the function to write is dependent on the model. Below, an example for the model ICON-ART is given: .. code-block:: python # Initial conditions from a restart file elif input_type == 'restart_inicond': for trid in datastore: dirorig = datastore[trid]["dirorig"] fileorig = datastore[trid]["fileorig"] source_file = os.path.join(dirorig, fileorig) target_file = "{}/restart_atm_DOM01.nc".format(runsubdir) path.link(source_file, target_file)