.. _chimere-design-yaml-ready-made:

Elaborate the yaml for the CIF, using ready-made files
======================================================


.. important::

   **How to use the** :doc:`cheat-sheet</documentation/dependencies>` **for plugins**

   In the following, plugins must be selected and their arguments specified.
   The arguments are described in the documentation of each plugin.
   To make access to the plugins easier, the cheat-sheet shows them sorted by **type**:
   the various types are the left-most entries (e.g. chemistry, controlvect, fields).
   For each type, the available plugins are listed with their **name and version**.
   Note that stating the name and version of a plugin is mandatory, whereas stating its type is not always necessary.

.. contents::
    :local:


Section for PyCIF parameters:
-----------------------------

This section must contain the five arguments shown in the example:

  - :bash:`verbose` gives the degree of verbosity of the CIF, with 1 for basic information and 2 for debugging.
  - :bash:`workdir` is the working directory, which will be created by the CIF and used for executing and storing all the relevant inputs and outputs. Choose a location with enough disk space.
  - :bash:`logfile` is the name of the file where the logs of the CIF are written. This file is saved in the :bash:`workdir`.
  - :bash:`datei` and :bash:`datef` are the initial and final dates of the period to simulate. Use one of the following formats for the dates: YYYY-mm-dd or YYYY-mm-dd HH:mm:ss.


.. container:: toggle

  .. container:: header

      Show/Hide Code

  .. yml-block:: /yaml_examples/chimere/config_fwd_ref_chimere.yml
      :keys: verbose, logfile, workdir, datei, datef
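
The two date formats accepted for :bash:`datei` and :bash:`datef` can be checked before launching the CIF.
The sketch below is not part of pyCIF: :bash:`parse_cif_date` is a hypothetical helper relying only on the Python standard library,
and it assumes that the two formats quoted above correspond to the ``strptime`` patterns ``%Y-%m-%d`` and ``%Y-%m-%d %H:%M:%S``.

.. code-block:: python

    from datetime import datetime

    # Assumed strptime equivalents of the two formats quoted in the text
    DATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


    def parse_cif_date(text):
        """Return a datetime if text matches one of the two formats, else raise ValueError."""
        for fmt in DATE_FORMATS:
            try:
                return datetime.strptime(text, fmt)
            except ValueError:
                continue
        raise ValueError(f"Unsupported date format: {text!r}")


    datei = parse_cif_date("2011-02-01")
    datef = parse_cif_date("2011-02-03 00:00:00")

    # The simulated period must not be empty
    assert datei < datef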


In this section of the yaml, it is possible to define :doc:`anchors</documentation/paths>` to be used in the rest of the file.


.. include:: ../../first-simu/common/mode.rst

.. _obsoperator_chimere_tuto:
.. include:: ../../first-simu/common/obsoperator.rst

.. _controlvect_chimere_tuto:
.. include:: ../../first-simu/common/controlvect.rst

.. _yml_section_model_chimere_fwd:

Model (:bash:`model`)
---------------------

Here, the model is the :doc:`plugin for CHIMERE</documentation/plugins/models/chimere>`.
The usual user choices for running a CHIMERE simulation (see :doc:`CHIMERE documentation</documentation/doc-models/chimere/index>`)
are set either through the mandatory arguments or through the optional arguments, for which default values are specified.
Be sure to check all the mandatory **AND OPTIONAL** arguments to fully set up the simulation as wanted.
The model set-up must be consistent with e.g. the chemistry (see sections :ref:`step_2_chimere_readymade` and :ref:`chimere_readymade_chemistry`)
and the domain (see sections :ref:`step_2_chimere_readymade` and :ref:`chimere_readymade_domain`).

.. container:: toggle

  .. container:: header

    Show/Hide Code

  .. yml-block:: /yaml_examples/chimere/config_fwd_ref_chimere.yml
    :keys: model


The requirements for CHIMERE are :bash:`domain`, :bash:`chemistry` and a set of components
(corresponding to the inputs of CHIMERE itself) to be detailed in
:bash:`datavect`: :bash:`meteo`, :bash:`flux`,
:bash:`bioflux`, :bash:`latcond`, :bash:`topcond` and :bash:`inicond`.
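
As a quick self-check, the components listed above can be compared with the keys actually present under :bash:`components` in the yaml.
The following sketch is independent of pyCIF: a plain Python dictionary stands in for the parsed yaml and :bash:`missing_components` is a hypothetical helper.
It assumes, as explained below for :bash:`bioflux`, that this component is only needed when :bash:`useemisb` is True.

.. code-block:: python

    # Components that the CHIMERE model plugin requires in datavect (from the text above)
    REQUIRED = {"meteo", "flux", "bioflux", "latcond", "topcond", "inicond"}


    def missing_components(components, useemisb=True):
        """Return the sorted list of required components absent from the configuration."""
        required = set(REQUIRED)
        if not useemisb:
            # bioflux may be omitted when no sub-hourly interpolated fluxes are used
            required.discard("bioflux")
        return sorted(required - set(components))


    # Stand-in for the 'components' mapping of the parsed yaml
    components = {"meteo": {}, "flux": {}, "inicond": {}, "latcond": {}, "topcond": {}}

    print(missing_components(components, useemisb=True))   # ['bioflux']
    print(missing_components(components, useemisb=False))  # []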


Observation Vector (:bash:`obsvect`)
-------------------------------------

To avoid useless runs, the CIF only runs
a simulation up to the time for which observations are available.
The standard :doc:`obsvect</documentation/plugins/obsvects/index>`
must therefore be initialized. See section :ref:`chimere_readymade_dummy_obs` for how to quickly provide
dummy observation data.

.. container:: toggle

  .. container:: header

    Show/Hide Code

  .. yml-block:: /yaml_examples/chimere/config_fwd_ref_chimere.yml
    :keys: obsvect

Its requirements are :bash:`datavect` and the :bash:`model`.


.. include:: ../../first-simu/common/platform.rst

.. _chimere_readymade_domain:

Domain (:bash:`domain`)
-------------------------------------

Specify :doc:`a domain for CHIMERE</documentation/plugins/domains/chimere>`
(see also the :doc:`cheat-sheet</documentation/dependencies>`) **consistently**
with the pre-computed input files (see step 2).
The files defining the domain can be stored directly in directory :bash:`repgrid` or symbolic links can be used.

.. container:: toggle

  .. container:: header

      Show/Hide Code

  .. code-block:: yaml

    domain :
      plugin:
        name    : CHIMERE
        version : std
      repgrid: a_path_for_CHIMERE_COORD_definition_files/
      domid : MYDOMAIN
      nlev: 20
      p1: 997
      pmax: 200
      pressure_unit: hPa

.. _chimere_readymade_chemistry:

Chemistry (:bash:`chemistry`)
-------------------------------------

The only available type of chemical schemes so far is for :doc:`photolysis with tabulated Js</documentation/dependencies>`,
the chemical scheme being pre-computed (see step 2).

.. container:: toggle

  .. container:: header

      Show/Hide Code

  .. code-block:: yaml

    chemistry :
      plugin:
        name: CHIMERE
        version: gasJtab
      schemeid: name.chemistry
      dir_precomp: the_path_for_the_directory_of_which_chemical_scheme_named_above_is_a_subdir/


Data vector (:bash:`datavect`)
-------------------------------------

The :doc:`data vector</documentation/dependencies>` contains ingredients, which list the input data for the model (e.g. emission fluxes)
and for the comparison to observations (e.g. concentration data).
These are used by :bash:`controlvect`, :bash:`obsoperator` and :bash:`obsvect` to build the set-up to run.

So far, there is only the standard :bash:`datavect`.

.. container:: toggle

  .. container:: header

      Show/Hide Code

  .. code-block:: yaml

    datavect :
      plugin:
        name: standard
        version: std

For the first forward simulation, its components are the :doc:`requirements of the model's plugin</documentation/dependencies>` which are not already taken care of in the previous sections of the yaml (i.e. excluding :bash:`domain` and :bash:`chemistry`) plus a component for the (dummy) observation data.

.. _chimere-datavect-infos:

CHIMERE usual inputs
....................

The components used to provide the model with its inputs are to be chosen among the available :doc:`datastreams</documentation/plugins/datastreams/index>`
which are recognized by the model's plugin,
so that it is able to pre-process or simply fetch its inputs.
For example, for CHIMERE, the plugin expects :bash:`flux` to provide the information on the input emissions
which are not to be interpolated within the hour, i.e. the emissions to put into the AEMISSIONS file.
For each of its datastream components, :bash:`datavect` expects a minimum of three pieces of information:

  i) a mandatory :bash:`dir`, for the directory where the data relevant to this component is available
  ii) a mandatory :bash:`file` which gives either a fixed file name or a general format for a set of files
      (with generic year, month, day, hour, etc).
  iii) a mandatory (except for initial conditions) :bash:`file_freq`, which gives the time period covered by each data file. Use the pandas format for these durations, e.g. 1D, 120H, etc.

To distinguish the various boundary conditions (lateral, top, initial), the :bash:`comp_type` must also be specified.

Note that these arguments are linked to datavect and not to a given plugin
(see also optional general arguments for datavect :doc:`here </documentation/plugins/datavects/standard>`).
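
To see how :bash:`file` and :bash:`file_freq` work together, the sketch below expands a file-name template over a simulation period, using the Python standard library only.
It is an illustration and not the way pyCIF resolves file names internally: the helper :bash:`expected_files`, the template ``METEO.%Y%m%d%H.24.nc`` and the 24-hour frequency are assumptions chosen for the example.

.. code-block:: python

    from datetime import datetime, timedelta


    def expected_files(template, datei, datef, freq_hours):
        """List the file names obtained by stepping from datei to datef (excluded)."""
        names = []
        current = datei
        while current < datef:
            names.append(current.strftime(template))
            current += timedelta(hours=freq_hours)
        return names


    files = expected_files(
        "METEO.%Y%m%d%H.24.nc",  # hypothetical template, one file per 24 hours
        datetime(2011, 2, 1),
        datetime(2011, 2, 3),
        freq_hours=24,           # plays the role of file_freq: 24H
    )
    print(files)
    # ['METEO.2011020100.24.nc', 'METEO.2011020200.24.nc']

Listing the expected names in this way makes it easy to compare them with the actual content of :bash:`dir` before running the CIF.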

When a plugin is used by a component to deal with the specified files,
its **name, version AND type** must be specified, as well as its own arguments.
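
The rules above (minimum information, exception for initial conditions, plugin identification) can be turned into a small check run on one component at a time.
This is only a sketch working on a plain dictionary standing in for the parsed yaml: :bash:`check_component` is a hypothetical helper, not a pyCIF function, and the directory and file names are placeholders.

.. code-block:: python

    def check_component(name, spec):
        """Return a list of human-readable problems found in one datavect component."""
        problems = []
        for key in ("dir", "file"):
            if key not in spec:
                problems.append(f"{name}: missing '{key}'")
        # file_freq is mandatory except for initial conditions
        if spec.get("comp_type") != "inicond" and "file_freq" not in spec:
            problems.append(f"{name}: missing 'file_freq'")
        # When a plugin is given, its name, version and type must all be stated
        plugin = spec.get("plugin")
        if plugin is not None:
            for key in ("name", "version", "type"):
                if key not in plugin:
                    problems.append(f"{name}: plugin lacks '{key}'")
        return problems


    flux = {
        "dir": "some_directory",
        "file": "AEMISSIONS.%Y%m%d%H.24.nc",
        "plugin": {"name": "CHIMERE", "version": "AEMISSIONS"},  # 'type' forgotten
    }
    print(check_component("flux", flux))
    # ["flux: missing 'file_freq'", "flux: plugin lacks 'type'"]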

The datastream components dealing with the required inputs of the model direct to plugins which are able to deal with
:doc:`the inputs required by CHIMERE</documentation/doc-models/chimere/generalinfo>`:
:doc:`meteo</documentation/doc-models/chimere/meteo>` (I) for the meteorological inputs,
:doc:`emission fluxes</documentation/doc-models/chimere/emissions>` (II and III),
:doc:`boundary conditions</documentation/doc-models/chimere/bouncond>` (IV)
and :doc:`initial conditions</documentation/doc-models/chimere/inicond>` (IV).

  I. With pre-computed
     :doc:`METEO.nc files</documentation/doc-models/chimere/meteo>`, the specifications
     for the :bash:`meteo` component are very simple:

       i) the minimum information :bash:`dir` and :bash:`file`, which point to the ready-made METEO.nc files, as well as the matching :bash:`file_freq`.
       ii) the :doc:`plugin for CHIMERE's meteo ready-made files</documentation/plugins/datastreams/meteos/chimere_meteo>`
           (see also :doc:`cheat-sheet</documentation/dependencies>`) which deals with these files


     .. container:: toggle

       .. container:: header

           Show/Hide Code

       .. yml-block:: /yaml_examples/chimere/config_fwd_ref_chimere.yml
           :keys: datavect/components/meteo

     In this case, the names of the :doc:`METEO.nc files match the format used by
     CHIMERE</documentation/doc-models/chimere/meteo>`.
     It is also possible to vary the template, e.g. :bash:`file: METEO_some_etiket.%Y%m%d%H.X.nc`.


  II. With pre-computed files also available for fluxes, biofluxes, inicond, latcond and topcond,
      they can be specified in the same simple way.

      .. note::

        The pre-computed files must be consistent with each other, with the domain,
        with the PyCIF parameters of the simulation and with the choices made for the model,
        particularly with the optional argument :bash:`periods` (default 1D = 24 hours).

      In the same manner as for the meteorology, it is simple to specify the use of pre-computed
      :doc:`AEMISSIONS.nc files</documentation/doc-models/chimere/emissions>`
      with ready-made :bash:`dir` and :bash:`file` dealt with by the
      :doc:`plugin for CHIMERE's fluxes</documentation/plugins/fluxes/chimere>`, with its type,
      name and version taken from the :doc:`cheat-sheet</documentation/dependencies>`:

      .. container:: toggle

        .. container:: header

            Show/Hide Code

        .. code-block:: yaml

          datavect:
            plugin:
              name: standard
              version: std
            components:
              meteo:
                dir: directory_containing_METEO.YYYYMMDDHH.*.nc_files
                file: METEO.%Y%m%d%H.X.nc
                plugin:
                  name: CHIMERE
                  version: std
                  type: meteo
                file_freq: XH
              flux:
                dir: directory_containing_AEMISSIONS.YYYYMMDDHH.*.nc_files
                file: AEMISSIONS.%Y%m%d%H.X.nc
                plugin:
                  name: CHIMERE
                  version: AEMISSIONS
                  type: flux
                file_freq: XH

      In this case, the CIF expects to find
      :doc:`AEMISSIONS.nc files formatted exactly as CHIMERE uses them</documentation/doc-models/chimere/emissions>`
      and containing all the species listed in :doc:`ANTHROPIC</documentation/doc-models/chimere/chemistry>`.

      It is also possible to combine different ready-made files for the various emitted species
      which do not require a sub-hourly interpolation.
      These species must be listed as :bash:`parameters` of :bash:`flux`
      and their names must match the names in :doc:`ANTHROPIC</documentation/doc-models/chimere/chemistry>`.
      For each parameter, it is possible to individualize everything,
      as shown in :doc:`the tutorial for more elaborated inputs</usertutos/elaborated-inputs/chimere/index>`.


  III. The :bash:`bioflux` specifications follow the same principles as :bash:`flux`.
       If no fluxes with a sub-hourly interpolation are required (:bash:`useemisb` is False in :bash:`model`),
       this component can be omitted. If it is omitted while :bash:`useemisb` is True,
       an error is raised.

       In the same manner as for AEMISSIONS, it is simple to specify the use of
       pre-computed :doc:`BEMISSIONS.nc files</documentation/doc-models/chimere/emissions>`,
       making use of the :doc:`plugin for CHIMERE's fluxes</documentation/plugins/fluxes/chimere>`:

       .. container:: toggle

         .. container:: header

             Show/Hide Code

         .. code-block:: yaml
          
            datavect:
              plugin:
                name: standard
                version: std
              components:
                meteo:
                  dir: directory_containing_METEO.YYYYMMDDHH.*.nc_files
                  file: METEO.%Y%m%d%H.X.nc
                  plugin:
                    name: CHIMERE
                    version: std
                    type: meteo
                  file_freq: XH
                flux:
                  dir: directory_containing_AEMISSIONS.YYYYMMDDHH.*.nc_files
                  file: AEMISSIONS.%Y%m%d%H.X.nc
                  plugin:
                    name: CHIMERE
                    version: AEMISSIONS
                    type: flux
                  file_freq: XH
                bioflux:
                  dir: directory_containing_BEMISSIONS.YYYYMMDDHH.*.nc_files
                  file: BEMISSIONS.%Y%m%d%H.X.nc
                  plugin:
                    name: CHIMERE
                    version: AEMISSIONS
                    type: flux
                  emis_type: bio
                  file_freq: XH

       Note that the :bash:`emis_type` must here be explicit so that BEMISSIONS files are fetched (and not AEMISSIONS files). If combining various files for different parameters, their names must match :doc:`BIOGENIC</documentation/doc-models/chimere/chemistry>`.


  IV. The :bash:`inicond`, :bash:`latcond` and :bash:`topcond` components are characterized
      by their :bash:`comp_type`; otherwise, their specifications follow the same principles as :bash:`flux`.

      In the same manner as for AEMISSIONS, it is simple to specify the use of pre-computed
      :doc:`INI_CONCS.nc files</documentation/doc-models/chimere/inicond>` and
      :doc:`BOUN_CONCS.nc files</documentation/doc-models/chimere/bouncond>`, using the
      :doc:`plugin for CHIMERE's fields</documentation/plugins/datastreams/fields/chimere_icbc>`:

      .. container:: toggle

        .. container:: header

            Show/Hide Code

        .. code-block:: yaml

          datavect:
            plugin:
              name: standard
              version: std
            components:
              meteo:
                dir: directory_containing_METEO.YYYYMMDDHH.*.nc_files
                file: METEO.%Y%m%d%H.X.nc
                plugin:
                  name: CHIMERE
                  version: std
                  type: meteo
                file_freq: XH
              flux:
                dir: directory_containing_AEMISSIONS.YYYYMMDDHH.*.nc_files
                file: AEMISSIONS.%Y%m%d%H.X.nc
                plugin:
                  name: CHIMERE
                  version: AEMISSIONS
                  type: flux
                file_freq: XH
              inicond:
                dir: directory_containing_IC.YYYYMMDDHH.*.nc_files
                file: INI_CONCS.0.nc  # to be confirmed: is this file name mandatory?
                plugin:
                  name: CHIMERE
                  version: icbc
                  type: field
                comp_type: inicond
              latcond:
                dir: directory_containing_BC.YYYYMMDDHH.*.nc_files
                file: BOUN_CONCS.%Y%m%d%H.X.nc
                plugin:
                  name: CHIMERE
                  version: icbc
                  type: field
                file_freq: XH
                comp_type: latcond
              topcond:
                dir: directory_containing_BC.YYYYMMDDHH.*.nc_files
                file: BOUN_CONCS.%Y%m%d%H.X.nc
                plugin:
                  name: CHIMERE
                  version: icbc
                  type: field
                file_freq: XH
                comp_type: topcond


      In this case, all species listed in :doc:`ACTIVE_SPECIES</documentation/doc-models/chimere/chemistry>`
      are fetched in the same input files. To combine various pre-computed files,
      see :doc:`the tutorial for more elaborated inputs</usertutos/elaborated-inputs/chimere/index>`.

.. _chimere_readymade_dummy_obs:

Component for observations
..........................

There are three options to compute CHIMERE outputs.

1. Force the computation of the observation operator without observations. This option is still under construction (possibly with :bash:`force_full_operator`).

2. Generate random observations: the yaml provides the information needed to generate random surface measurements of a set of parameters with the :doc:`plugin measurements</documentation/plugins/measurements/random_perparam>`.

For example, for one measured species only:

.. container:: toggle

  .. container:: header
  
      Show/Hide Code


  .. code-block:: yaml

    concs:
      parameters:
        S1:
          plugin:
            name: random
            type: measurements
            version: param
          frequency: '1H'
          nstations: 5
          duration: '1H'
          random_subperiod_shift: True
          zmax: 100
          seed: True


3. Make your own observation file

The component named :bash:`concs` is used for surface data; other types are also available, as described in the :doc:`standard data vector</documentation/plugins/datavects/standard>`.
Here, an example is given for surface observations, with the matching yaml and a Python script that generates a monitor.nc file with one observation.

.. container:: toggle

  .. container:: header

      Show/Hide Code


  .. code-block:: yaml

    concs:
      parameters:
        S1:
          dir: /tmp/PYCIF_DATA_TEST/CHIMERE/ACADOK
          file: dummy_monitor.nc

The parameter names must be listed in the ACTIVE_SPECIES file.

To simply run a forward simulation, the dummy monitor file can be filled in with one observation dated
at the final hour of the period.
This can be done with the following short Python script:

.. container:: toggle

  .. container:: header

      Show/Hide Code


  .. code-block:: python

    import datetime

    import pandas as pd

    # Put here the elements of datef
    yearf = 2011
    monthf = 2
    dayf = 3
    hourf = 0
    minutef = 0
    secondf = 0

    # Put here the coordinates of any point in the domain (see repgrid in domain, file HCOORD)
    lat0 = 1.2
    lon0 = 48.3

    datef = datetime.datetime(
        year=yearf, month=monthf, day=dayf, hour=hourf, minute=minutef, second=secondf
    )

    # Build a one-row table: a single dummy observation dated at datef
    data = pd.DataFrame(
        {
            "date": [datef],
            "duration": [1.0],
            "station": ["dummy"],
            "network": ["none"],
            "parameter": ["NONE"],
            "lon": [lon0],
            "lat": [lat0],
            "obs": [500.0],
            "obserror": [5.0],
            "alt": [1.0],
        }
    )

    # Convert to an xarray Dataset and write the monitor file;
    # dir and file given for the parameter in concs must point to this file
    data = data.to_xarray()
    data.to_netcdf("monitor_obs_for_simu_ID.nc")


  .. See also :doc:`First simulation with comparison to observations</usertutos/first-comp-to-obs/chimere>`.

.. warning::

   When no observation is available, no error is raised: the CIF reports that the forward mode has been successfully executed even though CHIMERE did not actually run.
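
Given this silent behaviour, it may help to verify that at least one observation falls inside the simulated period before launching the CIF.
The sketch below does this with pandas on an in-memory table: :bash:`has_obs_in_window` is a hypothetical helper, not a pyCIF function, and treating both bounds as inclusive is an assumption.

.. code-block:: python

    import pandas as pd


    def has_obs_in_window(obs, datei, datef):
        """True if at least one observation date lies in [datei, datef]."""
        dates = pd.to_datetime(obs["date"])
        inside = (dates >= pd.Timestamp(datei)) & (dates <= pd.Timestamp(datef))
        return bool(inside.any())


    # One dummy observation dated at the final hour of the period
    obs = pd.DataFrame({"date": [pd.Timestamp("2011-02-03 00:00:00")], "obs": [500.0]})

    print(has_obs_in_window(obs, "2011-02-01", "2011-02-03"))  # True
    print(has_obs_in_window(obs, "2011-02-04", "2011-02-05"))  # False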