3. Elaborate the YAML for the CIF, using ready-made files
Important
How to use the cheat-sheet for plugins
The following sections require using plugins and providing their specifications. Arguments for each plugin are documented on its individual documentation page. To make finding plugins easier, the cheat-sheet organizes them by type (the leftmost column, e.g. chemistry, controlvect, fields). For each type, available plugins are listed with their name and version. Specifying a plugin’s name and version is mandatory; specifying its type is not always necessary.
3.1. Section for pyCIF parameters
This section must contain the five arguments shown in the example:
verbose controls the verbosity level: 1 for basic information, 2 for debugging.
workdir is the working directory. The CIF will create it and use it to run the simulation and store all inputs and outputs. Choose a location with sufficient disk space.
logfile is the name of the log file written by the CIF, saved in workdir.
datei and datef are the start and end dates of the simulation period. Accepted formats: YYYY-mm-dd or YYYY-mm-dd HH:mm:ss.
#####################
# pyCIF config file #
#####################
# Define here all parameters for pyCIF following YAML syntax
# For details on YAML syntax, please see:
# http://docs.ansible.com/ansible/latest/YAMLSyntax.html
###############################################################################
# pyCIF parameters
verbose: 2
logfile: pycif.logtest
workdir: /tmp/CIF//.tox/py38/tmp/fwd_ref_chimere
datei: 2011-03-22 00:00:00
datef: 2011-03-22 09:00:00
This section of the YAML can also define anchors for use elsewhere in the file.
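The five arguments and the two accepted date formats can be checked with a few lines of Python before launching the CIF. This is only an illustrative sketch: the helpers parse_cif_date and check_pycif_parameters are hypothetical (they are not part of pyCIF), the dictionary stands in for the parsed YAML with dates kept as strings, and the rule that datef must be later than datei is an assumption.

```python
from datetime import datetime

# The five arguments required in this section of the YAML
REQUIRED_KEYS = ("verbose", "logfile", "workdir", "datei", "datef")

# The two accepted date formats, written as strptime patterns
DATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_cif_date(value):
    """Parse a date written as YYYY-mm-dd or YYYY-mm-dd HH:mm:ss."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unsupported date format: {value!r}")


def check_pycif_parameters(params):
    """Hypothetical helper: check that the five arguments are present
    and that both dates can be parsed."""
    missing = [key for key in REQUIRED_KEYS if key not in params]
    if missing:
        raise KeyError(f"Missing pyCIF parameters: {missing}")
    datei = parse_cif_date(params["datei"])
    datef = parse_cif_date(params["datef"])
    # Assumption: the end date must be strictly later than the start date
    if datef <= datei:
        raise ValueError("datef must be later than datei")
    return datei, datef


# Values copied from the example above, with dates kept as plain strings
params = {
    "verbose": 2,
    "logfile": "pycif.logtest",
    "workdir": "/tmp/CIF//.tox/py38/tmp/fwd_ref_chimere",
    "datei": "2011-03-22 00:00:00",
    "datef": "2011-03-22 09:00:00",
}
datei, datef = check_pycif_parameters(params)
print(f"Simulation period: {datei} to {datef}")
```

Note that a YAML loader may already convert such dates to datetime objects, in which case the parsing step is unnecessary.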
3.2. Mode (mode)
Here, a forward simulation is the chosen mode for running the model.
Available plugins for the mode class are listed in the cheat-sheet.
For the chosen plugin (forward simulation), provide its name and version, and note its requirements.
The full description of the mode class lists all available arguments.
For forward, no mandatory argument is required, but several optional arguments are available; the template YAML at the end of that page provides a complete list.
In the example below, only reload_results is used, to avoid recomputing the whole simulation after an interruption.
###############################################################################
# http://community-inversion.eu/documentation/plugins/modes/forward.html
mode:
  plugin:
    name: forward
    version: std
  reload_results: true
The requirements for our forward mode are Observation operator (obsoperator) and Control vector (controlvect). They are to be specified in the next sections of the YAML file.
3.3. Observation operator (obsoperator)
Our chemistry-transport model maps from flux space to concentration space, which corresponds
to the standard choice of obsoperator.
For this standard obsoperator,
no mandatory argument is required, but several optional arguments are available,
as shown in the full template YAML.
In our example, autorestart is used.
obsoperator:
  plugin:
    name: standard
    version: std
  autorestart: True
The requirements for the standard obsoperator are controlvect,
datavect, model, obsvect, and platform.
3.4. Control vector (controlvect)
Currently, only the standard plugin is available for controlvect. For this standard controlvect, no mandatory argument is required, though several optional arguments are available. In this example, no optional argument is set, so default values apply.
controlvect:
  plugin:
    name: standard
    version: std
The requirements for the standard controlvect are datavect,
domain, model and obsvect.
3.5. Model (model)
Use the CHIMERE model plugin. The typical configuration choices for a CHIMERE simulation (see CHIMERE documentation) are split between mandatory and optional arguments; default values are provided for optional ones. Check both the mandatory and optional arguments to configure the simulation as intended. The model settings must be consistent with the chemistry (see sections Locate the input files for CHIMERE and Chemistry (chemistry)) and the domain (see sections Locate the input files for CHIMERE and Domain (domain)).
###############################################################################
# http://community-inversion.eu/documentation/plugins/models/chimere.html
model:
  plugin:
    name: CHIMERE
    version: std
  auto-recompile: true
  dir_sources: /tmp/CIF//model_sources/chimereGES
  direxec: /tmp/PYCIF_DATA_TEST/CHIMERE/CHIMERE_executables
  ichemstep: 1
  ideepconv: 0
  nivout: 17
  nlevemis: 17
  nmdoms: 1
  nphour_ref: 6
  nzdoms: 1
  periods: 3H
  usechemistry: 1
  usedepos: 1
  usewetdepos: 1
The requirements for CHIMERE are domain, chemistry and a set of components
(corresponding to the inputs of CHIMERE itself) to be detailed in
datavect: meteo, flux,
bioflux, latcond, topcond and inicond.
3.6. Observation vector (obsvect)
To avoid unnecessary computations, the CIF only runs the simulation up to the time covered by available observations. The standard obsvect must therefore be initialized. See section Component for observations for how to provide minimal dummy observation data.
###############################################################################
# http://community-inversion.eu/documentation/plugins/obsvects/standard.html
obsvect:
  plugin:
    name: standard
    version: std
  dump: true
Its requirements are datavect and the model.
3.7. Platform (platform)
Specify the computing platform on which the simulation will run,
so that the CIF can choose the right configuration and perform platform-specific operations
such as module load for the relevant modules.
In this example, the platform is set to LSCE on the obelix cluster.
platform:
  plugin:
    name: LSCE
    version: obelix
The only requirement is the model.
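The requirements listed in sections 3.2 to 3.7 form a small dependency graph, and walking it shows which sections the YAML must contain for a forward run. The sketch below is only an illustration of that reasoning: the dictionary is transcribed from the text above, the function sections_needed is hypothetical, and the CIF resolves requirements with its own logic.

```python
# Requirements as stated in sections 3.2 to 3.7. The datavect components
# required by CHIMERE (meteo, flux, etc.) are left out for brevity.
REQUIREMENTS = {
    "mode": ["obsoperator", "controlvect"],
    "obsoperator": ["controlvect", "datavect", "model", "obsvect", "platform"],
    "controlvect": ["datavect", "domain", "model", "obsvect"],
    "model": ["domain", "chemistry"],
    "obsvect": ["datavect", "model"],
    "platform": ["model"],
}


def sections_needed(start):
    """Collect every section reachable from `start` by following requirements."""
    needed = []
    to_visit = [start]
    while to_visit:
        section = to_visit.pop()
        if section in needed:
            continue  # already handled, avoids looping on mutual requirements
        needed.append(section)
        to_visit.extend(REQUIREMENTS.get(section, []))
    return sorted(needed)


print(sections_needed("mode"))
```

Starting from mode, the walk reaches every section described in this chapter, including domain, chemistry and datavect, which are covered in the following sections.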
3.8. Domain (domain)
Specify a domain for CHIMERE
(see also the cheat-sheet) consistently
with the pre-computed input files (see step 2).
The files defining the domain can be stored directly in directory repgrid or symbolic links can be used.
domain:
  plugin:
    name: CHIMERE
    version: std
  repgrid: a_path_for_CHIMERE_COORD_definition_files/
  domid: MYDOMAIN
  nlev: 20
  p1: 997
  pmax: 200
  pressure_unit: hPa
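If the domain definition files are kept elsewhere, the symbolic links in repgrid can be created with a short script. The following is a minimal sketch, not something provided by the CIF: the function link_domain_files, the directory names and the file name HCOORD_MYDOMAIN are all hypothetical, and the demonstration runs in a temporary directory with an empty stand-in file.

```python
import tempfile
from pathlib import Path


def link_domain_files(source_dir, repgrid):
    """Create in repgrid one symbolic link per regular file found in source_dir."""
    source_dir = Path(source_dir)
    repgrid = Path(repgrid)
    repgrid.mkdir(parents=True, exist_ok=True)
    linked = []
    for source in sorted(source_dir.iterdir()):
        if not source.is_file():
            continue
        target = repgrid / source.name
        # Leave any existing file or link untouched
        if not (target.exists() or target.is_symlink()):
            target.symlink_to(source.resolve())
        linked.append(target.name)
    return linked


# Demonstration with a temporary directory and an empty stand-in file
with tempfile.TemporaryDirectory() as tmp:
    source_dir = Path(tmp) / "precomputed_domain"
    source_dir.mkdir()
    (source_dir / "HCOORD_MYDOMAIN").touch()  # hypothetical file name
    linked = link_domain_files(source_dir, Path(tmp) / "repgrid")
    print(linked)
```

For a real case, replace the two paths with the directory holding the pre-computed files and the repgrid path given in the YAML.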
3.9. Chemistry (chemistry)
The only available chemical scheme type is photolysis with tabulated Js, with the scheme pre-computed (see step 2).
chemistry:
  plugin:
    name: CHIMERE
    version: gasJtab
  schemeid: name.chemistry
  dir_precomp: the_path_for_the_directory_of_which_chemical_scheme_named_above_is_a_subdir/
3.10. Data vector (datavect)
The data vector contains the input data for the model (e.g. emission fluxes)
and for comparison to observations (e.g. concentration data),
used by controlvect, obsoperator, and obsvect to build the run configuration.
Currently, only the standard datavect plugin is available.
datavect:
  plugin:
    name: standard
    version: std
For the first forward simulation, its components are the requirements of the model plugin not already covered in earlier YAML sections (i.e. excluding domain and chemistry), plus a component for the (dummy) observation data.
3.10.1. CHIMERE usual inputs
The components that provide the model with its inputs must be chosen from the available datastreams
recognized by the model plugin, which uses them to pre-process or fetch its inputs.
For CHIMERE, the plugin expects flux to provide hourly-averaged emission data (i.e. data for the AEMISSIONS file).
For each datastream component, datavect requires at least three pieces of information:
dir: the directory where the component’s data files are located (mandatory)
file: either a fixed filename or a pattern for a set of files using date placeholders (year, month, day, hour, etc.) (mandatory)
file_freq: the time span covered by each file, in pandas duration format, e.g. 1D, 120H (mandatory, except for initial conditions)
To distinguish boundary condition types (lateral, top, initial), comp_type must also be specified.
These arguments belong to datavect, not to any particular plugin
(see also the optional general arguments of datavect).
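To see how a file pattern and file_freq combine, the sketch below expands a pattern with date placeholders over a simulation period. It is an illustration only: the function expand_file_pattern is hypothetical, the frequency is given directly in hours rather than as a pandas duration string, and the CIF's own file lookup may differ.

```python
from datetime import datetime, timedelta


def expand_file_pattern(pattern, start, end, freq_hours):
    """List the file names produced by a strftime-style pattern,
    one per step of freq_hours from start (included) to end (excluded)."""
    names = []
    current = start
    step = timedelta(hours=freq_hours)
    while current < end:
        names.append(current.strftime(pattern))
        current += step
    return names


# Example values: a 9-hour period covered by files of 3 hours each
files = expand_file_pattern(
    "METEO.%Y%m%d%H.3.nc",
    datetime(2011, 3, 22, 0),
    datetime(2011, 3, 22, 9),
    3,
)
for name in files:
    print(name)
```

With these values, three file names are produced, for hours 00, 03 and 06 of 2011-03-22.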
When a plugin is used by a component to process the specified files, its name, version, and type must be specified, along with its own arguments.
The datastream components for model inputs point to plugins that handle the inputs required by CHIMERE: meteo (I) for meteorological inputs, emission fluxes (II and III), boundary conditions (IV), and initial conditions (IV).
With pre-computed METEO.nc files, specifying the meteo component is straightforward:
provide the minimum information: dir and file pointing to ready-made METEO.nc files, and the matching file_freq.
add the plugin for CHIMERE’s ready-made meteo files (see also the cheat-sheet)
datavect:
  components:
    meteo:
      plugin:
        name: CHIMERE
        version: std
      dir: /tmp/PYCIF_DATA_TEST/CHIMERE/ACADOK
      file: 'METEO.%Y%m%d%H.3.nc'
      file_freq: 3H
In this case, METEO.nc filenames follow the format used by CHIMERE. A custom template is also possible, e.g. file: METEO_some_etiket.%Y%m%d%H.X.nc.
Pre-computed files for fluxes, biofluxes, inicond, latcond, and topcond are specified in the same way.
Note
All pre-computed files must be mutually consistent and consistent with the domain, the pyCIF parameters, and the model settings, in particular the optional argument periods (default: 1D = 24 hours).
Following the same pattern as for meteorology, pre-computed AEMISSIONS.nc files are specified with dir and file, handled by the plugin for CHIMERE fluxes. Use the cheat-sheet to find the correct type, name, and version:
datavect:
  plugin:
    name: standard
    version: std
  components:
    meteo:
      dir: directory_containing_METEO.YYYYMMDDHH.*.nc_files
      file: METEO.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: std
        type: meteo
      file_freq: XH
    flux:
      dir: directory_containing_AEMISSIONS.YYYYMMDDHH.*.nc_files
      file: AEMISSIONS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: AEMISSIONS
        type: flux
      file_freq: XH
The CIF expects AEMISSIONS.nc files in the exact format used by CHIMERE, containing all species listed in ANTHROPIC.
It is also possible to combine different ready-made files for species that do not require sub-hourly interpolation. These species must be listed as parameters of flux, with names matching those in ANTHROPIC. Each parameter can be configured independently, as shown in the tutorial for more elaborated inputs.
The bioflux component follows the same principles as flux. If no sub-hourly flux interpolation is needed (useemisb is False in model), this component can be omitted. If it is omitted while useemisb is True, an error is raised.
Pre-computed BEMISSIONS.nc files are specified in the same way as AEMISSIONS, using the plugin for CHIMERE fluxes:
datavect:
  plugin:
    name: standard
    version: std
  components:
    meteo:
      dir: directory_containing_METEO.YYYYMMDDHH.*.nc_files
      file: METEO.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: std
        type: meteo
      file_freq: XH
    flux:
      dir: directory_containing_AEMISSIONS.YYYYMMDDHH.*.nc_files
      file: AEMISSIONS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: AEMISSIONS
        type: flux
      file_freq: XH
    bioflux:
      dir: directory_containing_BEMISSIONS.YYYYMMDDHH.*.nc_files
      file: BEMISSIONS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: AEMISSIONS
        type: flux
      emis_type: bio
      file_freq: XH
Note that emis_type must be specified explicitly so that BEMISSIONS files are fetched instead of AEMISSIONS files. When combining files for different parameters, their names must match BIOGENIC.
The inicond, latcond, and topcond components are distinguished by their comp_type; otherwise their specifications follow the same principles as flux.
Pre-computed INI_CONCS.nc files and BOUN_CONCS.nc files are specified in the same way as AEMISSIONS, using the plugin for CHIMERE fields:
datavect:
  plugin:
    name: standard
    version: std
  components:
    meteo:
      dir: directory_containing_METEO.YYYYMMDDHH.*.nc_files
      file: METEO.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: std
        type: meteo
      file_freq: XH
    flux:
      dir: directory_containing_AEMISSIONS.YYYYMMDDHH.*.nc_files
      file: AEMISSIONS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: AEMISSIONS
        type: flux
      file_freq: XH
    inicond:
      dir: directory_containing_IC.YYYYMMDDHH.*.nc_files
      file: INI_CONCS.0.nc  # XX mandatory name???XXXX
      plugin:
        name: CHIMERE
        version: icbc
        type: field
      comp_type: inicond
    latcond:
      dir: directory_containing_BC.YYYYMMDDHH.*.nc_files
      file: BOUN_CONCS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: icbc
        type: field
      file_freq: XH
      comp_type: latcond
    topcond:
      dir: directory_containing_BC.YYYYMMDDHH.*.nc_files
      file: BOUN_CONCS.%Y%m%d%H.X.nc
      plugin:
        name: CHIMERE
        version: icbc
        type: field
      file_freq: XH
      comp_type: topcond
In this configuration, all species listed in ACTIVE_SPECIES are read from the same input files. To use different files for different species, see the tutorial for more elaborated inputs.
3.10.2. Component for observations
There are three options for providing observation data to drive CHIMERE.
1. Force computation of the observation operator without real observations (under construction).
2. Generate random observations: specify random surface measurements for a set of parameters using the measurements plugin in the YAML. For example, for a single measured species:
concs:
  parameters:
    S1:
      plugin:
        name: random
        type: measurements
        version: param
      frequency: '1h'
      nstations: 5
      duration: '1h'
      random_subperiod_shift: True
      zmax: 100
      seed: True
3. Provide your own observation file.
The concs component is used for surface data; other types are available as described in the standard data vector.
Below is an example for surface observations, including the matching YAML snippet and a Python script to generate a minimal monitor.nc file.
concs:
  parameters:
    S1:
      dir: /tmp/PYCIF_DATA_TEST/CHIMERE/ACADOK
      file: dummy_monitor.nc
Parameter names must match entries in the ACTIVE_SPECIES file.
To simply run a forward simulation, the dummy monitor file needs only one observation at the final hour of the simulation period. The following short Python script generates such a file:
import datetime

import pandas as pd

# Put here the elements of datef
yearf = 2011
monthf = 2
dayf = 3
hourf = 0
minutef = 0
secondf = 0

# Put here the coordinates of any point in the domain (see repgrid in domain, file HCOORD)
lat0 = 1.2
lon0 = 48.3

# Date of the single dummy observation: the end of the simulation period
datef = datetime.datetime(
    year=yearf, month=monthf, day=dayf, hour=hourf, minute=minutef, second=secondf
)

# One-row table holding the basic columns of a monitor file
data = pd.DataFrame(
    {
        "date": [datef],
        "duration": 1.0,
        "station": "dummy",
        "network": "none",
        "parameter": "NONE",
        "lon": lon0,
        "lat": lat0,
        "obs": 500.0,
        "obserror": 5.0,
        "alt": 1.0,
    }
)

# Convert to an xarray Dataset and write it as a NetCDF file
data.to_xarray().to_netcdf("monitor_obs_for_simu_ID.nc")
Warning
When no observation is available, no error is raised: the CIF reports that the forward mode has been executed successfully, even though CHIMERE did not actually run.
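Because this situation is silent, it can help to check the observation dates against the simulation period before launching the run. The sketch below is only an illustration: the function observations_in_window is hypothetical, and it assumes that an observation dated outside the interval from datei to datef counts as unavailable, which should be confirmed against the CIF's actual behaviour.

```python
from datetime import datetime


def observations_in_window(obs_dates, datei, datef):
    """Return the observation dates that fall between datei and datef (both included)."""
    return [date for date in obs_dates if datei <= date <= datef]


# Example values: the simulation period and a single dummy observation at datef
datei = datetime(2011, 3, 22, 0)
datef = datetime(2011, 3, 22, 9)
obs_dates = [datef]

usable = observations_in_window(obs_dates, datei, datef)
if not usable:
    raise SystemExit("No observation within the simulation period")
print(f"{len(usable)} observation(s) within the simulation period")
```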