How to add a new type of data for boundary conditions to be processed by the CIF into a model’s inputs

  1. Have a yaml file ready with a simulation that works with known plugins.

    obsoperator:
        plugin:
            name: standard
            version: std
        onlyinit: True
    
  2. In the directory plugins/fields, copy the directory containing the template for a BC plugin, bc_plugin_template, to a new directory for your plugin.

    Following the instructions for adding and registering a new plugin, register your new plugin by replacing the template’s name (and version) in __init__.py with the ones you have chosen.

    from .get_domain import get_domain
    from .fetch import fetch
    from .read import read
    from .write import write
    
    
    _name = "new_plugin_s_name"
    _version = "new_plugin_s_version"
    
  3. Modify the yaml file to use the new plugin: replace the known plugin’s name, type and version with yours, keeping and adapting the mandatory arguments:
    • comp_type: the_same

    • dir: the_dir_where_the_new_data_is

    • file: name_of_the_files_with_the_new_data

    • file_freq: frequency_of_the_new_files

latcond:
 parameters:
    S1:
      plugin:
        name: BCs
        version: template
        type: fields
      comp_type: latcond
      dir: dir_with_original_files/
      file: file_with_new_fields_to_use_as_BCs
      file_freq: 1M # case of monthly files
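
Before running pycif, it can help to confirm that the four mandatory arguments listed above are all present in the entry. The helper below is a minimal sketch and is not part of pyCIF: `missing_arguments` and `MANDATORY_KEYS` are hypothetical names, and the entry is written as the plain dictionary a yaml parser would be expected to produce.

```python
# Hypothetical helper, not part of pyCIF: it only checks for key presence.
MANDATORY_KEYS = ("comp_type", "dir", "file", "file_freq")


def missing_arguments(parameter_cfg):
    """Return the mandatory keys that are absent from one parameter entry."""
    return [key for key in MANDATORY_KEYS if key not in parameter_cfg]


# Placeholder values, mirroring the bullet list of mandatory arguments
s1 = {
    "plugin": {"name": "BCs", "version": "template", "type": "fields"},
    "comp_type": "latcond",
    "dir": "the_dir_where_the_new_data_is",
    "file": "name_of_the_files_with_the_new_data",
    "file_freq": "1M",
}

print(missing_arguments(s1))
```

An empty list means that every mandatory argument is provided; it says nothing about whether the values themselves are valid.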
  4. Run pycif with this yaml: the new plugin will simply do what the template does, i.e. print instructions on what you have to do and where. The following code must be developed in the places matching the instructions, and checked. To check that each new piece of code works as intended, run the CIF, with print statements, with the yaml using the new plugin and with the same yaml using a known plugin. The scripts have to be developed in this order:

    1. fetch.py to match the required files to the time intervals to cover.

      pycif.plugins.datastreams.fields.bc_plugin_template.fetch(ref_dir, ref_file, input_dates, target_dir, tracer=None, component=None)

      Retrieves the required files according to the simulation and the data files available

      Args

      • ref_dir: directory where the original files are found

      • ref_file: (template) name of the original files

      • input_dates: list of two dates: the beginning and end of the simulation

      • target_dir: directory where the links to the original files are created

      Returns

      • list_dates: a dictionary in which each key leads to a list of intervals [date_beginning, date_end] so that each interval is covered by one value taken from the matching file stored in list_files.

      • list_files: dictionary in which each key leads to a list of files so that the list of intervals are covered by the values provided in these files.

      Choosing the keys for both dictionaries: the most efficient options are to use either i) the dates at which the data files begin or ii) dates matching the typical use of this data. Example: if the data is typically used for generating BCs per day, use the dates of the days to simulate as keys. The idea is to avoid listing the same file under several keys, because the read routine is called once for each key.

      Examples for a simulation from 01-01-2001 00H00 to 01-02-2001 00H00 for which input BC files cover 24 hours at an hourly resolution:

      • data = annual data for 2001:
        • list_dates = { ‘01-01-2001’: [[01-01-2001 00H00, 31-12-2001 23H59]] }

        • list_files = { ‘01-01-2001’: [[yearly_data_file]] }

      • data = hourly data in daily files:
        • list_dates = { ‘01-01-2001 00H00’: [ [01-01-2001 00H00, 01-01-2001 01H00 ], [01-01-2001 01H00, 01-01-2001 02H00 ], [01-01-2001 02H00, 01-01-2001 03H00 ], … [01-01-2001 23H00, 02-01-2001 00H00 ]] }

        • list_files = { ‘01-01-2001 00H00’: [ daily_data_file_for_01/01/2001, daily_data_file_for_01/01/2001, daily_data_file_for_01/01/2001, … ] }
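
The "hourly data in daily files" case can be sketched in a few lines of Python. This is only an illustration under assumptions: the keys and interval bounds are taken to be `datetime` objects, `hourly_intervals` is a hypothetical helper, and `daily_data_file_20010101` is a made-up file name.

```python
from datetime import datetime, timedelta


def hourly_intervals(day_start, n_hours=24):
    """Return [[start, end], ...]: n_hours consecutive one-hour intervals."""
    step = timedelta(hours=1)
    return [
        [day_start + i * step, day_start + (i + 1) * step]
        for i in range(n_hours)
    ]


day = datetime(2001, 1, 1)

# One key per daily file, so that the file is listed under a single key
list_dates = {day: hourly_intervals(day)}

# Every hourly interval of that day is read from the same (hypothetical) file
list_files = {day: ["daily_data_file_20010101"] * len(list_dates[day])}

print(list_dates[day][0])
print(list_dates[day][-1])
```

Both dictionaries share the same key and their lists have the same length, so that the i-th interval is matched with the i-th file.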

      Notes

      • the information file_freq, provided in the yaml file and accessible through tracer.file_freq, is used here and only here.

      • the intervals listed in list_dates are used to perform the time interpolation. They must therefore be the smallest intervals over which the values are constant. Example: if time profiles are applied (see XX for option apply_profile and how to provide the profile data) to yearly data, the intervals must be the intervals obtained after applying the profiles (e.g. monthly, hourly) and not the whole year.

      • the decumulation of fields is taken care of in read
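
To make the description above more concrete, here is a minimal sketch of a fetch function for monthly files, using the argument names from the signature above. Several points are assumptions and not confirmed pyCIF behaviour: ref_file is treated as a `strftime` template (e.g. `bc_%Y%m.nc`, a hypothetical name), the monthly frequency is hard-coded whereas a real plugin would derive it from tracer.file_freq, values are taken to be constant over each month, missing files are silently skipped, and the two dictionaries are returned in the order `list_files, list_dates`. `next_month` is a hypothetical helper.

```python
import os
from datetime import datetime


def next_month(date):
    """Return the first day of the month following `date`."""
    year = date.year + date.month // 12
    month = date.month % 12 + 1
    return datetime(year, month, 1)


def fetch(ref_dir, ref_file, input_dates, target_dir,
          tracer=None, component=None):
    """Link monthly files into target_dir and describe what they cover."""
    date_beg, date_end = input_dates
    list_files, list_dates = {}, {}

    # Walk through the months overlapping the simulation period
    current = datetime(date_beg.year, date_beg.month, 1)
    while current < date_end:
        file_name = current.strftime(ref_file)  # assumption: strftime template
        source = os.path.join(ref_dir, file_name)
        if os.path.isfile(source):
            target = os.path.join(target_dir, file_name)
            if not os.path.lexists(target):
                os.symlink(source, target)
            # One key per file: the date at which the file begins
            list_dates[current] = [[current, next_month(current)]]
            list_files[current] = [target]
        # A real plugin would probably warn or raise when a file is missing
        current = next_month(current)

    return list_files, list_dates
```

With this layout each file appears under exactly one key, so the read routine is called once per monthly file.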

    2. get_domain.py to get the information on the original domain. WARNING: the only case in which you do not need a get_domain.py script is when the available files are provided on a domain that can be, and actually is, specified in the yaml file.

    3. read.py to actually read the data.

    4. if your plugin is to be the default plugin used by a CTM, write.py to write the data in the right format for the model. If your plugin is used to read data which must be processed before being used by any CTM, no writing is required: it is done by calling the write function of the default plugin in the last step of the chain of transformations (see native2inputs.py and the scripts it calls in the models’ plugins).

  5. Document the new plugin:

    1. write all relevant information on the plugin in the documentation section, at the top of __init__.py:

      """
      Write here the README about the plugin.
      Example of relevant information: type of files treated, including format of names and shape of data, time resolution, and any specific treatment that prevents the plugin from working with another type of files.
      
      Use rst syntax since this README will be automatically displayed in the documentation
      
      """
      
      from .get_domain import get_domain
      from .fetch import fetch
      from .read import read
      from .write import write
      
      _name = "new_plugin_s_name"
      _version = "new_plugin_s_version"
      
      input_arguments = {
      
        "dummy_arg": {
          "doc": "document here the argument",
          "default": "let's say it's not mandatory",
          "accepted": str
        },
      }
      

      If relevant, do the same in fetch, get_domain, read, write.
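
Since the “doc” entries end up in the generated documentation, a quick sanity check of the dictionary before building the docs can save a round trip. The sketch below is not part of pyCIF: `check_input_arguments` is a hypothetical helper, and the rule it applies (a non-empty "doc" and an "accepted" entry for every argument) is an assumption based only on the example above.

```python
def check_input_arguments(input_arguments):
    """Return a list of human-readable problems found in the dictionary."""
    problems = []
    for name, spec in input_arguments.items():
        # An empty or missing description would show up blank in the docs
        if not str(spec.get("doc", "")).strip():
            problems.append(f"{name}: missing or empty 'doc'")
        if "accepted" not in spec:
            problems.append(f"{name}: missing 'accepted'")
    return problems


# Same content as the template's example
input_arguments = {
    "dummy_arg": {
        "doc": "document here the argument",
        "default": "let's say it's not mandatory",
        "accepted": str,
    },
}

print(check_input_arguments(input_arguments))
```

An empty list means that no problem was found under these assumed rules.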

    2. create the rst file that contains the automatic documentation in docs/source/documentation/plugins/fields/. Please give it a self-explanatory name. Example for the template: the file bc_template.rst reads

    .. role:: bash(code)
       :language: bash
    
    ########################
    Template plugin for BCs
    ########################
    
    .. automodule:: pycif.plugins.datastreams.fields.bc_plugin_template
    
    3. add the reference to the rst file in docs/source/documentation/plugins/fields/index.rst:

    ####################
    Fields
    ####################
    
    .. role:: bash(code)
       :language: bash
    
    
    Available Fields
    =========================
    
    The following :bash:`fields` are implemented in pyCIF:
    
    .. toctree::
    
         bc_template
         chimere_icbc
    
    4. build the documentation (make html in docs/) and check that the link to the new plugin appears in the documentation at file:///your_path/cif/docs/build/html/documentation/plugins/index.html and that the section “doc” of the input arguments is correctly displayed at file:///your_path/cif/docs/build/html/documentation/plugins/fields/the_new_plugin.html