Models (model)

Available Models (model)

The following models are implemented in pyCIF so far:

Description

The model class runs chemistry-transport models, processes their outputs and generates their inputs. Please note that models are often written in high-performance languages such as Fortran or C. In these cases, the sources are included in the directory model_sources provided alongside pyCIF.

Required parameters, dependencies and functions

The following attributes, dependencies and functions should be defined for any model, as they are called by other plugins. They can be parameters to define at the set-up step, functions to implement in the corresponding module, or dependencies to be attached to the model class.

Parameters and attributes

Initialization parameters

The following attributes are defined once and for all at the initialization of the model. They inform pyCIF about the temporal resolution of the model. All of them are filled with datetime.datetime objects. To make the handling of lists easier, pyCIF requires lists to be implemented as numpy.array objects.

subsimu_dates

the list of simulation periods if the model simulation window is split into shorter sub-periods

tstep_dates

the time steps at which the model carries out its numerical computations; this attribute is used by pyCIF to determine which observation to compare with which model time step. It is a dictionary whose keys are the dates in subsimu_dates and whose values are the lists of time steps corresponding to each sub-period.

tstep_all

the same as tstep_dates; the difference is that tstep_all is a list containing all time steps of all simulation sub-periods instead of a dictionary split into sub-periods

input_dates

dates at which the model expects some inputs; has the same shape as tstep_dates
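As a rough illustration of how tstep_dates can be used to match an observation to a model time step, the sketch below locates the sub-period containing a given observation date, together with the time-step index on the local (per sub-period) and global (whole simulation) time scales. The helper locate_observation is hypothetical and is not part of pyCIF, whose own matching logic may differ. Indexes are 0-based, plain lists stand in for the numpy arrays, and observations falling after the last time step are not handled.

```python
import bisect
import datetime


def locate_observation(obs_date, subsimu_dates, tstep_dates):
    """Hypothetical helper (not part of pyCIF).

    Returns the start date of the sub-period containing obs_date, the
    0-based index of the time step within that sub-period (local time
    scale) and within the whole simulation (global time scale).
    """
    starts = list(subsimu_dates)
    if obs_date < starts[0]:
        raise ValueError("observation is before the simulation window")

    # Last sub-period starting at or before the observation
    i_sub = bisect.bisect_right(starts, obs_date) - 1
    period = starts[i_sub]

    # Last time step of that sub-period starting at or before the observation
    local = bisect.bisect_right(list(tstep_dates[period]), obs_date) - 1

    # Offset by the number of time steps in all earlier sub-periods
    offset = sum(len(tstep_dates[d]) for d in starts[:i_sub])
    return period, local, offset + local


# Two sub-periods of six hourly time steps each
t0 = datetime.datetime(2010, 1, 1)
hour = datetime.timedelta(hours=1)
subsimu_dates = [t0, t0 + 6 * hour]
tstep_dates = {d: [d + i * hour for i in range(6)] for d in subsimu_dates}

period, local, glob = locate_observation(
    datetime.datetime(2010, 1, 1, 7, 30), subsimu_dates, tstep_dates)
print(period, local, glob)
```

In this set-up, an observation at 07:30 falls in the second time step of the second sub-period, which is the eighth time step of the whole simulation.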

Please find below an illustration of the different time steps:

                      1st sub-period            2nd sub-period
Global time scale     1   2   3   4   5   6     7   8   9   10  11  12
Local time scale      1   2   3   4   5   6     1   2   3   4   5   6
Observation 1         | sampling period |
Observation 2                                                | sampling period |

In the example, the model is run from January 1st, 2010 to February 28th, 2010. Computations are carried out every hour and inputs are expected every 3 hours. In that case, the temporal variables are:

import datetime
import numpy as np

subsimu_dates = np.array([datetime.datetime(2010, 1, 1), datetime.datetime(2010, 2, 1)])

tstep_dates = {
    datetime.datetime(2010, 1, 1): np.array(
        [datetime.datetime(2010, 1, 1, 0), datetime.datetime(2010, 1, 1, 1),
         ..., datetime.datetime(2010, 1, 31, 23)]),
    datetime.datetime(2010, 2, 1): np.array(
        [datetime.datetime(2010, 2, 1, 0), datetime.datetime(2010, 2, 1, 1),
         ..., datetime.datetime(2010, 2, 28, 23)]),
}

tstep_all = np.array([
    datetime.datetime(2010, 1, 1, 0), datetime.datetime(2010, 1, 1, 1),
    ..., datetime.datetime(2010, 2, 28, 23)
])

input_dates = {
    datetime.datetime(2010, 1, 1): np.array(
        [datetime.datetime(2010, 1, 1, 0), datetime.datetime(2010, 1, 1, 3),
         ..., datetime.datetime(2010, 1, 31, 21)]),
    datetime.datetime(2010, 2, 1): np.array(
        [datetime.datetime(2010, 2, 1, 0), datetime.datetime(2010, 2, 1, 3),
         ..., datetime.datetime(2010, 2, 28, 21)]),
}

Online parameters

The following variables are defined online during the computation of the model.

chain

for a given model simulation, files from previous sub-periods that are necessary to run the following sub-periods are stored in current_sim_directory/chain. The chain variable stores the date of the previous sub-period that was computed. The variable is automatically updated by the obsoperator, but the files should be moved by the run function of the model.

adj_refdir

this is the directory where the forward simulations corresponding to the adjoint being run are stored; the variable should be updated by the run function when running a forward simulation.

Dependencies

Some other classes in pyCIF expect the model class to have a domain class attached to it, describing the model domain. This way, model.domain can be called.

Functions

The following functions need to be implemented in any model so that it can interact with other classes. They must be imported at the root level of the corresponding Python package, i.e. in the __init__.py file:

from XXXXX import ini_periods
from XXXXX import run
from XXXXX import native2inputs
from XXXXX import native2inputs_adj
from XXXXX import outputs2native
from XXXXX import outputs2native_adj
from XXXXX import compile
from XXXXX import ini_mapper

It is recommended to include each function in a separate file to avoid very long scripts.

ini_periods (optional)

The function ini_periods is optional but strongly recommended. It is used to define the temporal variables subsimu_dates, input_dates, tstep_dates and tstep_all. If available, the function is automatically called at the initialization of the model class. Otherwise, the temporal variables should be defined manually in the ini_data function (not recommended).

ini_periods is a method that applies to the model plugin itself. Therefore, the only expected argument is self.

def ini_periods(self, **kwargs):

    self.subsimu_dates = XXXX
    self.tstep_dates = XXXXX
    self.input_dates = XXXXX
    self.tstep_all = XXXXX
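For illustration, the template above could be filled in as follows for the January to February 2010 example, with monthly sub-periods, hourly time steps and 3-hourly inputs. The hard-coded dates and steps are assumptions made for this sketch; an actual plugin would presumably derive them from its configuration. types.SimpleNamespace only stands in for the model plugin.

```python
import datetime
from types import SimpleNamespace

import numpy as np


def ini_periods(self, **kwargs):
    # Assumed set-up: monthly sub-periods, hourly computations and
    # inputs every 3 hours; the end date is exclusive
    starts = [datetime.datetime(2010, 1, 1), datetime.datetime(2010, 2, 1)]
    end = datetime.datetime(2010, 3, 1)
    tstep = datetime.timedelta(hours=1)
    input_step = datetime.timedelta(hours=3)

    def date_range(d0, d1, step):
        # All dates d0, d0 + step, ... strictly before d1
        dates = []
        date = d0
        while date < d1:
            dates.append(date)
            date += step
        return np.array(dates)

    # Each sub-period runs from its start date to the next start date
    bounds = starts + [end]
    self.subsimu_dates = np.array(starts)
    self.tstep_dates = {
        d0: date_range(d0, d1, tstep)
        for d0, d1 in zip(bounds[:-1], bounds[1:])}
    self.input_dates = {
        d0: date_range(d0, d1, input_step)
        for d0, d1 in zip(bounds[:-1], bounds[1:])}
    self.tstep_all = np.concatenate([self.tstep_dates[d] for d in starts])


model = SimpleNamespace()  # stand-in for the model plugin
ini_periods(model)
print(len(model.tstep_all), model.tstep_all[-1])
```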

For an example, see the ini_periods function of the model CHIMERE:

pycif.plugins.models.chimere.ini_periods()

run

The function run executes the model itself. As models are often computationally expensive to run, they are usually not written in Python. Therefore, the run function calls an external executable compiled previously.

There are several ways to call system executables in Python. We recommend using subprocess.Popen for that purpose. It gives flexibility in logging and can capture errors raised during the execution of the external executable.

Other tasks carried out by the run function are:

  • update the variable self.adj_refdir for later adjoint simulations;

  • update the variable self.chain for later sub-periods and move necessary files to that directory; these files include, for instance, concentration fields at the last time step of the period, to be used as initial conditions for the next period.
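The transfer of files to the chain directory might look like the sketch below. The helper move_to_chain and the file name end_concentrations.nc are hypothetical, and the chain directory is assumed to sit one level above runsubdir.

```python
import os
import shutil
import tempfile


def move_to_chain(runsubdir, filenames):
    """Hypothetical helper: move the files needed by the next sub-period
    from runsubdir to the chain directory one level above it."""
    chain_dir = os.path.normpath(os.path.join(runsubdir, os.pardir, "chain"))
    os.makedirs(chain_dir, exist_ok=True)
    for name in filenames:
        target = os.path.join(chain_dir, name)
        # Drop any file left over from a previous sub-period
        if os.path.exists(target):
            os.remove(target)
        shutil.move(os.path.join(runsubdir, name), target)
    return chain_dir


# Usage on a throw-away directory tree
workdir = tempfile.mkdtemp()
runsubdir = os.path.join(workdir, "2010-01-01")
os.makedirs(runsubdir)
with open(os.path.join(runsubdir, "end_concentrations.nc"), "w") as f:
    f.write("placeholder")

chain_dir = move_to_chain(runsubdir, ["end_concentrations.nc"])
print(os.listdir(chain_dir))
```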

Arguments are:

self

the model itself

runsubdir

the sub-directory where the sub-period needs to be run

mode

the running mode; one of fwd, tl or adj for forward, tangent-linear and adjoint respectively

workdir

the root working directory of the present CIF computation

do_simu

whether to carry out the simulation; pyCIF can read previously computed simulations and skip the execution of the code, in which case do_simu should be set to False

The function returns nothing.

Example of code:

import os
import subprocess

def run(self, runsubdir, mode, workdir, do_simu=True, **kwargs):

    # Simulation already available: only point to the existing outputs
    if not do_simu:
        if mode in ["fwd", "tl"]:
            self.adj_refdir = "{}/../".format(runsubdir)
        return

    # stdout is written to a log file; stderr is captured as text
    with open("{}/log.std".format(runsubdir), "w") as log:
        process = subprocess.Popen(
            "run my executable",  # placeholder for the actual command line
            shell=True,  # needed because the command is given as a string
            cwd=runsubdir,
            stdout=log,
            stderr=subprocess.PIPE,
            universal_newlines=True
        )
        _, stderr = process.communicate()

    if process.returncode != 0 or stderr != "":
        print("Deal with exception in executable")

    if mode in ["fwd", "tl"]:
        # adj_refdir is not the runsubdir itself (which corresponds to a
        # sub-simulation), but the level above, which is the full chained
        # simulation
        self.adj_refdir = "{}/../".format(runsubdir)

    # Now move necessary files to the chain directory
    os.system("mv -f {runsubdir}/XXXXX {runsubdir}/../chain/XXX"
              .format(runsubdir=runsubdir))
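The example above only prints a message when the executable reports a problem. One possible way to turn failures into Python exceptions is sketched below. The helper call_executable is hypothetical, and the Python interpreter stands in for the model executable so that the snippet can be run anywhere.

```python
import os
import subprocess
import sys
import tempfile


def call_executable(command, runsubdir):
    """Hypothetical helper: run command (a list of arguments) in
    runsubdir, log stdout to log.std and raise if the process fails."""
    with open(os.path.join(runsubdir, "log.std"), "w") as log:
        process = subprocess.Popen(
            command,
            cwd=runsubdir,
            stdout=log,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode stderr to str
        )
        _, stderr = process.communicate()

    # A non-zero return code is the most reliable sign of failure;
    # some programs write warnings to stderr even when they succeed
    if process.returncode != 0:
        raise RuntimeError(
            "Executable failed with code {}:\n{}".format(
                process.returncode, stderr))
    return stderr


runsubdir = tempfile.mkdtemp()

# Successful call: stdout ends up in log.std
call_executable([sys.executable, "-c", "print('done')"], runsubdir)

# Failing call: the exit code and the error message are reported
try:
    call_executable(
        [sys.executable, "-c", "import sys; sys.exit('crash')"], runsubdir)
except RuntimeError as err:
    print(err)
```

Passing the command as a list of arguments avoids going through a shell, which sidesteps quoting issues with paths containing spaces.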

For a full example, see the run function of the model CHIMERE:

pycif.plugins.models.chimere.run()

native2inputs and native2inputs_adj

The functions native2inputs and native2inputs_adj generate inputs for the model executable and read the sensitivity to the inputs as computed by the adjoint, respectively.

outputs2native and outputs2native_adj

The functions outputs2native and outputs2native_adj read outputs and generate sensitivity to the outputs respectively.

ini_mapper

compile (optional)

flushrun (optional)

The function flushrun is called at the end of a simulation. It removes all temporary files that take up disk space and are no longer necessary.

Arguments are:

self

the model itself

rundir

the run directory (with all the sub-period simulations)

mode

the running mode; one of fwd, tl or adj.

The function returns nothing.
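A minimal flushrun could look like the sketch below. The file pattern *.tmp is purely hypothetical, since which files are temporary depends on the model, and the sub-period directories are assumed to sit directly under rundir. The mode argument is not used here, although a real implementation might keep different files for forward and adjoint runs.

```python
import glob
import os
import tempfile


def flushrun(self, rundir, mode):
    # Hypothetical clean-up: delete *.tmp files in every sub-period
    # directory directly under rundir; the pattern is model-specific
    for path in glob.glob(os.path.join(rundir, "*", "*.tmp")):
        os.remove(path)


# Usage on a throw-away run directory
rundir = tempfile.mkdtemp()
subdir = os.path.join(rundir, "2010-01-01")
os.makedirs(subdir)
for name in ["scratch.tmp", "outputs.nc"]:
    open(os.path.join(subdir, name), "w").close()

flushrun(None, rundir, "fwd")
print(os.listdir(subdir))
```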

For a full example, see the flushrun function of the model CHIMERE:

pycif.plugins.models.chimere.flushrun()