Pre-processing of the inputs#
1. Building a new domain#
The creation of a new domain is described in the CHIMERE tutorials, and is also possible with the following yaml. The parameters to set in the yaml are:
domid: the ID of your new domain
xmin, xmax, ymin, ymax: the corners of the domain in lon/lat
dx, dy: the horizontal spatial resolution of the domain, with type giving the unit
nlev, p1, pmax: the number of vertical levels, and the reference bottom and top pressures
The domain is generated in the domain folder of your workdir. It is organized as described in the documentation.
It is composed of a domainlst.nml file with the main information about the domain, HCOORD and VCOORD folders, and a LANDUSE folder (for debugging).
The HCOORD folder contains two files, HCOORD_{domid} and HCOORDcorner_{domid}, which list the lon/lat of the grid centers and grid corners respectively.
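To illustrate the relation between grid centers and grid corners, here is a minimal sketch. It assumes a regular lon/lat grid in degrees and that xmin, xmax, ymin, ymax are the outer corners of the domain; the function name is hypothetical and the actual layout of the HCOORD files is not reproduced.

```python
import numpy as np


def regular_grid(xmin, xmax, ymin, ymax, dx, dy):
    """Return (centers, corners) lon/lat arrays for a regular grid.

    Assumption: xmin/xmax/ymin/ymax are the outer corners of the domain
    and dx/dy are given in degrees.
    """
    nx = int(round((xmax - xmin) / dx))
    ny = int(round((ymax - ymin) / dy))
    # Cell edges: nx + 1 and ny + 1 values along each axis
    lon_edges = xmin + dx * np.arange(nx + 1)
    lat_edges = ymin + dy * np.arange(ny + 1)
    # Cell centers sit halfway between consecutive edges
    lon_centers = 0.5 * (lon_edges[:-1] + lon_edges[1:])
    lat_centers = 0.5 * (lat_edges[:-1] + lat_edges[1:])
    centers = np.meshgrid(lon_centers, lat_centers)
    corners = np.meshgrid(lon_edges, lat_edges)
    return centers, corners


# Illustrative values only
(lonc, latc), (lone, late) = regular_grid(-10.0, 10.0, 40.0, 50.0, 0.5, 0.5)
```

A grid of nx by ny cells has (nx + 1) by (ny + 1) corners, which is why the two files differ in length.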
The VCOORD folder contains a VCOORD_{nlev}_{p1}_{pmax} file, which lists the sigma_a and sigma_b coefficients (sigma-hybrid pressure levels).
These coefficients are used to compute pressure levels in hPa (CHIMERE documentation, p.64), based on the surface pressure \(P_s\):
Warning
The units and values of the sigma_a and sigma_b coefficients must be checked to ensure the formula is correct.
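As an illustration of how such coefficients are typically used, the sketch below assumes the generic hybrid sigma-pressure form, pressure = sigma_a × P0 + sigma_b × P_s, with a reference pressure P0. The form, the value of P0 and the coefficient values are assumptions for illustration; the exact formula and units must be taken from the CHIMERE documentation.

```python
import numpy as np

# Assumption: reference pressure in hPa, to be checked against the CHIMERE docs
P0_HPA = 1000.0


def hybrid_pressure(sigma_a, sigma_b, ps_hpa, p0_hpa=P0_HPA):
    """Pressure levels in hPa from hybrid coefficients and surface pressure.

    Assumed generic form: P_k = sigma_a[k] * P0 + sigma_b[k] * P_s.
    """
    sigma_a = np.asarray(sigma_a, dtype=float)
    sigma_b = np.asarray(sigma_b, dtype=float)
    return sigma_a * p0_hpa + sigma_b * ps_hpa


# Made-up coefficients, for illustration only
sigma_a = [0.0, 0.1, 0.2]
sigma_b = [1.0, 0.6, 0.0]
levels = hybrid_pressure(sigma_a, sigma_b, ps_hpa=900.0)
```

With this form, the lowest level follows the surface pressure (sigma_b close to 1) and the top level tends to a fixed pressure (sigma_b close to 0), which is a quick sanity check on the coefficient units.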
Finally, the files must be copied into the reference domain directory defined in repgrid.
Warning
Check the $WORKDIR/LANDUSE/map_check.png figure, especially at the boundaries (NaN values there would cause problems in the meteo processing).
2. TROPOMI XCH4 observations#
There are three available TROPOMI retrieval products: the operational product developed by SRON, the BLENDED TROPOMI+GOSAT research product from Harvard University and the WFMD research product from the University of Bremen.
The CIF includes dedicated obsparser plugins to automatically preprocess the L2 inputs from the different products and to create standard monitor files.
The recommended quality filters are applied within the data processing. The variable format (in datavect) has to be set to the corresponding product: CH4-RPRO for the reprocessed operational product, CH4-WFMD for the WFMD product and CH4-BLENDED for the BLENDED product.
The resulting monitor files can be found in $WORKDIR/obs/CH4.
Note
Observations can be aggregated into super-observations (at the spatial scale and the time resolution of the model). This decreases memory usage and avoids memory overload when there are too many observations.
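The spatial part of such an aggregation can be sketched as follows, for the observations of a single model time step. This is not the CIF implementation: the function name and the simple unweighted mean are assumptions chosen for illustration.

```python
import numpy as np


def super_observations(lon, lat, xch4, lon_edges, lat_edges):
    """Average the observations that fall in the same model cell.

    Returns the flat index of each non-empty cell, the mean value in
    that cell and the number of observations that were averaged.
    """
    lon, lat, xch4 = (np.asarray(v, dtype=float) for v in (lon, lat, xch4))
    nx, ny = len(lon_edges) - 1, len(lat_edges) - 1
    # Index of the cell containing each observation
    ix = np.digitize(lon, lon_edges) - 1
    iy = np.digitize(lat, lat_edges) - 1
    # Discard observations outside the domain
    inside = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    cell = iy[inside] * nx + ix[inside]
    counts = np.bincount(cell, minlength=nx * ny)
    sums = np.bincount(cell, weights=xch4[inside], minlength=nx * ny)
    filled = counts > 0
    return np.flatnonzero(filled), sums[filled] / counts[filled], counts[filled]


# Four observations on a 2 x 1 grid; the last one lies outside the domain
cells, means, n_obs = super_observations(
    lon=[0.2, 0.8, 1.5, 5.0],
    lat=[0.5, 0.5, 0.5, 0.5],
    xch4=[1800.0, 1820.0, 1900.0, 2000.0],
    lon_edges=[0.0, 1.0, 2.0],
    lat_edges=[0.0, 1.0],
)
```

Here three observations inside the domain are reduced to two super-observations, one per non-empty cell.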
3. Processing of other input data#
In regional CH4 simulations, the chemistry of methane is neglected because its atmospheric lifetime (~9 to 10 years) is long compared with the residence time of an air particle in the domain (~15 days). The following inputs are required:
a. Meteorological inputs#
Meteorological inputs are downloaded from the ECMWF datastore.
More details are available in the dedicated documentation.
In particular, the dpdd, dpdu, dped, dpeu variables are required only if deep convection is used.
b. CAMS background#
The background is composed of 3 elements: initial conditions, boundary conditions (lateral boundaries and top boundary) and stratosphere. Dedicated documentation is available here. The CHIMERE domain has a limited volume: horizontally it covers the specified region, and vertically it covers the troposphere. Therefore, the system requires an initial CH4 concentration field, and boundary conditions to compute the incoming and outgoing fluxes at the sides and the top of the domain. It also requires the concentration field in the stratosphere to build the total column seen by the satellite (which measures the column over the whole atmosphere, even if it is more sensitive to the troposphere). The domain is not extended to the stratosphere because the time scales differ between the troposphere and the stratosphere: stratospheric jet streams with high wind speeds would require short time steps to simulate transport accurately, which would sharply increase the cost of the simulation even though CH4 is mainly transported in the troposphere.
The data is taken from the CAMS reanalysis GHG concentration product, based on surface+satellite observations (Agustí-Panareda et al., 2023). The data can be accessed here.
c. Prior CH4 emissions#
The prior fluxes are extracted from bottom-up inventories and emission models.
In the CIF-CHIMERE framework, prior flux inputs are AEMISSIONS.nc netcdf files, which have a standardized structure on a time/lat/lon grid.
All CH4 emissions are considered 2D (emitted at the ground level). As most inputs are monthly datasets, monthly AEMISSIONS files are used.
In the yaml below, all datasets and categories (each variable of each input dataset) have to be set in datavect. Two transforms are necessary in controlvect:
Summing the sub-categories into total emissions or into aggregated sectors.
Dumping the files into netcdf format.
The following variables have to be modified accordingly:
workdir, datei, datef
model, domain: the values are not used but are necessary for initialization
controlvect: the parameters to be summed (transform families). The parameter names have to be consistent between controlvect and datavect.
datavect: for each category, the dataset (dir, file, file_freq), the variable (varname, var_freq), and a unit_conversion factor if required
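The arithmetic behind the summing transform can be sketched as follows. This is not the CIF transform itself: the function, the category names and the conversion factor are hypothetical and only show how sub-categories, each with an optional unit conversion, add up to aggregated sectors.

```python
import numpy as np


def aggregate_sectors(fluxes, families, unit_conversion=None):
    """Sum sub-category flux arrays into aggregated sectors.

    fluxes: dict mapping a category name to a lat/lon array
    families: dict mapping a sector name to the list of its categories
    unit_conversion: optional dict of multiplicative factors per category
    """
    unit_conversion = unit_conversion or {}
    totals = {}
    for sector, members in families.items():
        # Categories without an explicit factor are left unchanged
        totals[sector] = sum(
            fluxes[name] * unit_conversion.get(name, 1.0) for name in members
        )
    return totals


# Hypothetical categories on a 2 x 2 grid
fluxes = {
    "rice": np.full((2, 2), 1.0),
    "livestock": np.full((2, 2), 2.0),
    "wetlands": np.full((2, 2), 3.0),
}
families = {"agriculture": ["rice", "livestock"], "natural": ["wetlands"]}
totals = aggregate_sectors(fluxes, families, unit_conversion={"wetlands": 0.5})
```

The sketch also shows why the parameter names must match: every name listed in a family has to exist as a loaded category.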
Warning
Some yaml arguments are not directly used but are necessary for initialization: transforms in the obsoperator require model, which requires a chemistry. platform is necessary to load the modules to compile the model.
In the yaml example available below, emissions are based on datasets of EDGARv8 (anthropogenic emissions), GFED for biomass burning, VISIT for wetlands and soil sinks, and other reference GCP datasets for natural emissions (freshwaters, termites, geological emissions, oceans).
All (sub-)sectors are loaded as variables in datavect, then aggregated in the transforms of the controlvect.
The resulting AEMISSIONS files can be found at the following location: $WORKDIR/fwd_0000/[date]/AEMISSIONS.YYYYMM01.nc.
It is important that all AEMISSIONS.YYYYMM01.nc and AEMISSIONS.nc files are identical, and that the AEMISSIONS.YYYYMM01.nc files in the different daily folders of the same month are identical too.
The AEMISSIONS files can therefore be extracted from the following file for every year and month: YYYY-MM-01_00-00/AEMISSIONS.YYYYMM01.nc.
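The list of monthly files to extract can be built with a short helper. The sketch below assumes that the [date] folder has the form YYYY-MM-01_00-00, as in the pattern above; the function name and the example workdir are hypothetical.

```python
from datetime import date
from pathlib import Path


def monthly_aemissions_paths(workdir, datei, datef):
    """Return one AEMISSIONS path per month between datei and datef."""
    paths = []
    year, month = datei.year, datei.month
    while (year, month) <= (datef.year, datef.month):
        # Assumption: the first daily folder of each month is used
        folder = f"{year:04d}-{month:02d}-01_00-00"
        name = f"AEMISSIONS.{year:04d}{month:02d}01.nc"
        paths.append(Path(workdir) / "fwd_0000" / folder / name)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return paths


# Hypothetical workdir and period
paths = monthly_aemissions_paths("/work", date(2019, 11, 1), date(2020, 2, 28))
```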
Warning
To make sectoral AEMISSIONS files, run the script with the input datasets restricted to those that correspond to the sector.
d. Country/region mask [optional]#
A country or region mask can be used at different stages of the inversion and of the post-processing of the outputs. It is optional, but preparing one before running the inversion is recommended. During the inversion, it can be used to define the resolution of the control vector components: the fluxes can be optimized either at the scale of pixels or at the scale of regions (the latter is required if the domain is large, because the computation of correlations can overload the memory). During the post-processing, it can be used to plot country/region averages and budgets.
The following script can be used to create the mask in a CIF-compatible format. The script requires a shapefile mask of regions as an input. It generates two files:
regions_[DOMAIN].nc: composed of a single variable regions, which is a lat/lon array in which every region has a different integer value (0 for oceans, 1 for region 1, etc.). This file is to be used in the inversion yaml, in datavect/regions_infos.
mask_[DOMAIN].nc: composed of one variable per region, which is a lat/lon binary array composed of 1 for pixels in the region, 0 for others. There is also a total variable equivalent to the variable in regions_[DOMAIN].nc. This file is to be used for plots.
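The relation between the two files can be sketched as follows: each binary mask is derived from the integer regions array. This is only an illustration with a hypothetical function name and a made-up array; it does not reproduce the script or write netcdf files.

```python
import numpy as np


def binary_masks(regions):
    """Build one binary lat/lon mask per region from an integer array.

    The value 0 (oceans) is skipped; every other integer is a region ID.
    """
    regions = np.asarray(regions)
    region_ids = [int(r) for r in np.unique(regions) if r != 0]
    # 1 for pixels inside the region, 0 elsewhere
    return {rid: (regions == rid).astype(np.int8) for rid in region_ids}


# Made-up 2 x 3 regions array: 0 for oceans, 1 and 2 for two regions
regions = np.array([[0, 1, 1],
                    [2, 2, 0]])
masks = binary_masks(regions)
```

Multiplying a flux field by one of these masks and summing over the grid gives the regional total used for budgets.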