3. Elaborate the yaml for the CIF, using ready-made files#
Important
How to use the cheat-sheet for plugins
In the following, plugins have to be used and their specifications provided. The arguments can be found in the documentation of each plugin. To make access to the plugins easier, the cheat-sheet shows them sorted by type: the various types are listed in the left-most column (e.g. chemistry, controlvect, fields). For each type, the available plugins are listed with their name and version. Note that stating the name and version of a plugin is mandatory, whereas stating its type is not always necessary.
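As an illustration, a plugin specification in the yaml follows the generic pattern sketched below; the class, name, version and type values are placeholders to be replaced by the entries found in the cheat-sheet:
some_class:
  plugin:
    name: a_plugin_name        # mandatory
    version: a_plugin_version  # mandatory
    type: a_plugin_type        # only required in some cases, e.g. within datavect components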
3.1. Section for PyCIF parameters:#
This section must contain the five arguments shown in the example:
verbose
gives the degree of verbosity of the CIF, with 1 for basic information and 2 for debugging
workdir
is the working directory, which will be created by the CIF and used for executing and storing all the relevant inputs and outputs. Choose somewhere with enough disk space.
logfile
is the name of the file where the logs of the CIF are written. This file is saved in the workdir.
datei and datef
are the initial and final dates of the period to simulate. Use one of the following formats for the dates: YYYY-mm-dd or YYYY-mm-dd HH:mm:ss
#####################
# pyCIF config file #
#####################
# Define here all parameters for pyCIF following YAML syntax
# For details on YAML syntax, please see:
# http://docs.ansible.com/ansible/latest/YAMLSyntax.html
###############################################################################
# pyCIF parameters
verbose: 2
logfile: pycif.logtest
workdir: /tmp/CIF//.tox/py38/tmp/fwd_ref_chimere
datei: 2011-03-22 00:00:00
datef: 2011-03-22 09:00:00
In this section of the yaml, it is possible to define anchors to be used in the rest of the file.
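For example, a value used several times in the file (such as a data directory) can be anchored here and referenced later with an alias; the key name and path below are purely illustrative:
# define an anchor on a reusable value (illustrative key and path)
datadir: &datadir /tmp/PYCIF_DATA_TEST/CHIMERE
# ... later in the file, re-use the anchored value with an alias
dir: *datadir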
3.2. Mode (mode)#
Here, a forward simulation is the chosen mode for running the model.
At the keyword for the class (mode), the various available plugins are listed in the cheat-sheet.
For the chosen plugin, here the one for running a forward simulation, the name and version of the plugin are provided and the requirements are listed.
The full description of the class mode gives access to its arguments.
For forward, there is no mandatory argument to specify, but a few optional arguments can be used; the template yaml at the end of the page provides a full list of them.
In our example below, only reload_results is used, so as not to have to recompute the whole simulation in case of an interruption.
#####################
# pyCIF config file #
#####################
# Define here all parameters for pyCIF following YAML syntax
# For details on YAML syntax, please see:
# http://docs.ansible.com/ansible/latest/YAMLSyntax.html
###############################################################################
# pyCIF parameters
###############################################################################
# http://community-inversion.eu/documentation/plugins/modes/forward.html
mode:
plugin:
name: forward
version: std
reload_results: true
The requirements for our forward mode are Observation operator (obsoperator) and Control vector (controlvect). They are to be specified in the next sections of the yaml file.
3.3. Observation operator (obsoperator)#
Our chemistry-transport model works from the flux space to the concentration space, which corresponds
to the standard choice of obsoperator.
For this standard obsoperator,
there is no mandatory argument to specify but a few optional arguments can be used,
as shown in the full template yaml.
In our example, autorestart is used.
obsoperator:
plugin:
name: standard
version: std
autorestart: True
The requirements for our standard obsoperator are controlvect, datavect, model, obsvect and platform.
3.4. Control vector (controlvect)#
So far, there is only the standard possibility for controlvect. For this standard controlvect, there is no mandatory argument to specify but a few optional arguments can be used. In our example, no optional argument is activated (the default values will apply).
controlvect:
plugin:
name: standard
version: std
The requirements for the standard controlvect are datavect, domain, model and obsvect.
3.5. Model (model)#
Here, it is the plugin for CHIMERE. The usual user's choices for running a CHIMERE simulation (see the CHIMERE documentation) appear either in the mandatory arguments or in the optional arguments, for which default values are specified. Be sure to check all the mandatory AND OPTIONAL arguments to fully set up the simulation as wanted. The set-up must be consistent with e.g. the chemistry (see sections Locate the input files provided directly for CHIMERE and Chemistry (chemistry)) and the domain (see sections Locate the input files provided directly for CHIMERE and Domain (domain)).
#####################
# pyCIF config file #
#####################
# Define here all parameters for pyCIF following YAML syntax
# For details on YAML syntax, please see:
# http://docs.ansible.com/ansible/latest/YAMLSyntax.html
###############################################################################
# pyCIF parameters
###############################################################################
# http://community-inversion.eu/documentation/plugins/models/chimere.html
model:
plugin:
name: CHIMERE
version: std
auto-recompile: true
dir_sources: /tmp/CIF//model_sources/chimereGES
direxec: /tmp/PYCIF_DATA_TEST/CHIMERE/CHIMERE_executables
ichemstep: 1
ideepconv: 0
nivout: 17
nlevemis: 17
nmdoms: 1
nphour_ref: 6
nzdoms: 1
periods: 3H
usechemistry: 1
usedepos: 1
usewetdepos: 1
The requirements for CHIMERE are domain, chemistry and a set of components (corresponding to the inputs of CHIMERE itself) to be detailed in datavect: meteo, flux, bioflux, latcond, topcond and inicond.
3.6. Observation Vector (obsvect)#
To avoid useless runs, the CIF only runs a simulation up to the time where observations are available. The standard obsvect must therefore be initialized. See section Component for observations for how to provide quick-dummy observation data.
#####################
# pyCIF config file #
#####################
# Define here all parameters for pyCIF following YAML syntax
# For details on YAML syntax, please see:
# http://docs.ansible.com/ansible/latest/YAMLSyntax.html
###############################################################################
# pyCIF parameters
###############################################################################
# http://community-inversion.eu/documentation/plugins/obsvects/standard.html
obsvect:
plugin:
name: standard
version: std
dump: true
Its requirements are datavect and the model.
3.7. Platform (platform)#
This section specifies the computing platform on which to run, so that the CIF can choose the right configuration and perform targeted operations, e.g. module load the relevant modules.
Here, the example is set at LSCE, on the obelix cluster.
platform:
plugin:
name: LSCE
version: obelix
The only requirement is the model.
3.8. Domain (domain)#
Specify a domain for CHIMERE
(see also the cheat-sheet) consistently
with the pre-computed input files (see step 2).
The files defining the domain can be stored directly in the directory repgrid, or symbolic links can be used.
domain:
plugin:
name: CHIMERE
version: std
repgrid: a_path_for_CHIMERE_COORD_definition_files/
domid: MYDOMAIN
nlev: 20
p1: 997
pmax: 200
pressure_unit: hPa
3.9. Chemistry (chemistry)#
The only type of chemical scheme available so far is photolysis with tabulated Js, the chemical scheme being pre-computed (see step 2).
chemistry:
plugin:
name: CHIMERE
version: gasJtab
schemeid: name.chemistry
dir_precomp: the_path_for_the_directory_of_which_chemical_scheme_named_above_is_a_subdir/
3.10. Data vector (datavect)#
The data vector contains ingredients, which list the input data for the model (e.g. emission fluxes) and for the comparison to observations (e.g. concentration data), which controlvect, obsoperator and obsvect will use to build the set-up to run.
So far, there is only the standard datavect.
datavect:
plugin:
name: standard
version: std
For the first forward simulation, its components are the requirements of the model's plugin which are not already taken care of in the previous sections of the yaml (i.e. excluding domain and chemistry), plus a component for the (dummy) observation data.
3.10.1. CHIMERE usual inputs#
The components which are used to provide the model with its inputs are to be chosen among the available datastreams, which are recognized by the model's plugin so that it is able to pre-process or simply fetch its inputs.
For example, for CHIMERE, the plugin expects flux to provide the information on the input emissions that are not to be interpolated within the hour, i.e. the emissions to put into the AEMISSIONS file.
For each of its datastream components, datavect expects a minimum of three pieces of information:
- a mandatory dir, for the directory where the data relevant to this component is available
- a mandatory file, which gives either a fixed file name or a general format for a set of files (with generic year, month, day, hour, etc.)
- a mandatory (except for initial conditions) file_freq, which gives the time period covered by each data file. Use the pandas format for these durations, e.g. 1D, 120H, etc.
To distinguish the various boundary conditions (lateral, top, initial), the comp_type must also be specified.
Note that these arguments are linked to datavect and not to a given plugin (see also optional general arguments for datavect here).
When a plugin is used by a component to deal with the specified files, its name, version AND type must be specified, as well as its own arguments (see the generic sketch below).
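Putting these pieces together, a generic datastream component thus looks like the following sketch; the directory, file pattern, frequency and plugin identifiers are placeholders, and comp_type only applies to initial and boundary conditions:
a_component:
  dir: /path/to/the/input/files/
  file: INPUT.%Y%m%d%H.nc   # fixed name or generic format with date place-holders
  file_freq: 1D             # pandas-style duration covered by each file
  comp_type: latcond        # only for initial/lateral/top conditions
  plugin:
    name: a_plugin_name
    version: a_plugin_version
    type: a_plugin_type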
The datastream components dealing with the required inputs of the model direct to plugins which are able to deal with the inputs required by CHIMERE: meteo (I) for the meteorological inputs, emission fluxes (II and III), boundary conditions (IV) and initial conditions (IV).
With pre-computed METEO.nc files, the specifications for the meteo component are very simple:
- the minimum information dir and file, which point to the ready-made METEO.nc files, as well as the matching file_freq
- the plugin for CHIMERE's ready-made meteo files (see also the cheat-sheet), which deals with these files
#####################
# pyCIF config file #
#####################
# Define here all parameters for pyCIF following YAML syntax
# For details on YAML syntax, please see:
# http://docs.ansible.com/ansible/latest/YAMLSyntax.html
###############################################################################
# pyCIF parameters
###############################################################################
datavect:
  components:
    meteo:
      plugin:
        name: CHIMERE
        version: std
      dir: /tmp/PYCIF_DATA_TEST/CHIMERE/ACADOK
      file: 'METEO.%Y%m%d%H.3.nc'
      file_freq: 3H
In this case, the names of the METEO.nc files match the format used by CHIMERE. It is also possible to vary the template e.g.
file: METEO_some_etiket.%Y%m%d%H.X.nc
With pre-computed files also available for fluxes, biofluxes, inicond, latcond and topcond, they can be specified in the same simple way.
Note
The pre-computed files must be consistent together and with the domain, the PyCIF parameters of the simulation and the choices made for the model, particularly with the optional argument periods (default 1D = 24 hours).
In the same manner as for the meteorology, it is simple to specify the use of pre-computed AEMISSIONS.nc files with ready-made dir and file dealt with by the plugin for CHIMERE's fluxes, with its type, name and version taken from the cheat-sheet:
datavect:
  plugin:
    name: standard
    version: std
  components:
    meteo:
      dir: directory_containing_METEO.YYYYMMDDHH.*.nc_files
      file: METEO.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: std
        type: meteo
      file_freq: XH
    flux:
      dir: directory_containing_AEMISSIONS.YYYYMMDDHH.*.nc_files
      file: AEMISSIONS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: AEMISSIONS
        type: flux
      file_freq: XH
In this case, the CIF expects to find AEMISSIONS.nc files formatted exactly as CHIMERE uses them and containing all the species listed in ANTHROPIC.
It is also possible to combine different ready-made files for the various emitted species which do not require a sub-hourly interpolation. These species must be listed as parameters of flux and their names must match the names in ANTHROPIC. For each parameter, it is possible to individualize everything, as shown in the tutorial for more elaborated inputs.
The bioflux specifications follow the same principles as flux. If no fluxes with a sub-hourly interpolation are required (useemisb is False in model), this component can be omitted. If it is omitted while useemisb is True, an error is raised.
In the same manner as for AEMISSIONS, it is simple to specify the use of pre-computed BEMISSIONS.nc files, making use of the plugin for CHIMERE's fluxes:
datavect:
  plugin:
    name: standard
    version: std
  components:
    meteo:
      dir: directory_containing_METEO.YYYYMMDDHH.*.nc_files
      file: METEO.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: std
        type: meteo
      file_freq: XH
    flux:
      dir: directory_containing_AEMISSIONS.YYYYMMDDHH.*.nc_files
      file: AEMISSIONS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: AEMISSIONS
        type: flux
      file_freq: XH
    bioflux:
      dir: directory_containing_BEMISSIONS.YYYYMMDDHH.*.nc_files
      file: BEMISSIONS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: AEMISSIONS
        type: flux
      emis_type: bio
      file_freq: XH
Note that the emis_type must here be explicit so that BEMISSIONS files are fetched (and not AEMISSIONS files). If combining various files for different parameters, their names must match BIOGENIC.
The inicond, latcond and topcond components are characterized by their comp_type; otherwise, their specifications follow the same principles as flux.
In the same manner as for AEMISSIONS, it is simple to specify the use of pre-computed INI_CONCS.nc files and BOUN_CONCS.nc files, using the plugin for CHIMERE's fields:
datavect:
  plugin:
    name: standard
    version: std
  components:
    meteo:
      dir: directory_containing_METEO.YYYYMMDDHH.*.nc_files
      file: METEO.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: std
        type: meteo
      file_freq: XH
    flux:
      dir: directory_containing_AEMISSIONS.YYYYMMDDHH.*.nc_files
      file: AEMISSIONS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: AEMISSIONS
        type: flux
      file_freq: XH
    inicond:
      dir: directory_containing_IC.YYYYMMDDHH.*.nc_files
      file: INI_CONCS.0.nc  # XX mandatory name???XXXX
      plugin:
        name: CHIMERE
        version: icbc
        type: field
      comp_type: inicond
    latcond:
      dir: directory_containing_BC.YYYYMMDDHH.*.nc_files
      file: BOUN_CONCS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: icbc
        type: field
      file_freq: XH
      comp_type: latcond
    topcond:
      dir: directory_containing_BC.YYYYMMDDHH.*.nc_files
      file: BOUN_CONCS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: icbc
        type: field
      file_freq: XH
      comp_type: topcond
In this case, all species listed in ACTIVE_SPECIES are fetched in the same input files. To combine various pre-computed files, see the tutorial for more elaborated inputs.
3.10.2. Component for observations#
There are three options to compute CHIMERE outputs.
1. Force the computation of the observation operator without observations: XXX under construction, with force_full_operator? XXX
2. Generate random observations: this is done by specifying information in the yaml to generate random surface measurements of a set of parameters with the plugin measurements. For example, for one measured species only:
concs:
parameters:
S1:
plugin:
name: random
type: measurements
version: param
frequency: '1h'
nstations: 5
duration: '1h'
random_subperiod_shift: True
zmax: 100
seed: True
3. Make your own observation file
The component named concs is used for surface data; other types are also available, as described in the standard data vector.
Here, an example is given for surface observations, with the matching yaml and a Python script to generate a monitor.nc file with one observation.
concs:
parameters:
S1:
dir: /tmp/PYCIF_DATA_TEST/CHIMERE/ACADOK
file: dummy_monitor.nc
The parameters' names must be in the ACTIVE_SPECIES file.
To simply run a forward simulation, the dummy monitor file can be filled in with one observation dated at the final hour of the period. This can be done with the following short Python script:
import datetime

import pandas as pd

# Put here the elements of datef
yearf = 2011
monthf = 2
dayf = 3
hourf = 0
minutef = 0
secondf = 0

# Put here the coordinates of any point in the domain (see repgrid in domain, file HCOORD)
lat0 = 1.2
lon0 = 48.3

# Columns expected in a monitor file
list_basic_cols = ['date', 'duration', 'station', 'network', 'parameter',
                   'lon', 'lat', 'obs', 'obserror', 'alt']

datef = datetime.datetime(year=yearf, month=monthf, day=dayf,
                          hour=hourf, minute=minutef, second=secondf)

# Build a one-row dataframe containing a single dummy observation at datef
data = pd.DataFrame(columns=list_basic_cols)
data['date'] = [datef]
data = data.assign(duration=1.)
data = data.assign(station='dummy')
data = data.assign(network='none')
data = data.assign(parameter='NONE')
data = data.assign(lon=lon0)
data = data.assign(lat=lat0)
data = data.assign(obs=500.)
data = data.assign(obserror=5.)
data = data.assign(alt=1.)

# Convert to xarray and write the netCDF monitor file
data = data.to_xarray()
data.to_netcdf('monitor_obs_for_simu_ID.nc')
Warning
When no observation is available, no error is raised: the CIF indicates that the forward mode has been successfully executed, even though CHIMERE did not actually run.