Pre-processing of the inputs

1. Building a new domain

The creation of a new domain is described in the CHIMERE tutorials, and is also possible with the following yaml. The parameters to set in the yaml are:

domid: the ID of the new domain

xmin, xmax, ymin, ymax: the corners of the domain in lon/lat

dx, dy, type: the horizontal resolution of the domain (dx, dy) and its unit (type)

nlev, p1, pmax: the number of vertical levels, and the reference bottom and top pressures

The domain is generated in the domain folder of your workdir. It is organized as described in the documentation. It is composed of a domainlst.nml file with the main information about the domain, HCOORD and VCOORD folders, and a LANDUSE folder (for debugging).

The HCOORD folder contains two files, HCOORD_{domid} and HCOORDcorner_{domid}, which list the lon/lat of the grid centers and of the grid corners respectively.

The VCOORD folder contains a VCOORD_{nlev}_{p1}_{pmax} file which is the list of sigma_a and sigma_b coefficients (sigma-hybrid pressure levels). These coefficients are used to compute pressure levels in hPa (CHIMERE documentation, p.64), based on the surface pressure $P_s$:

\begin{equation} P = a \times 1000 + b \times P_s \tag{1} \end{equation}

Warning

The units and values of the sigma_a and sigma_b coefficients must be checked to ensure the formula is correct.
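As a way to carry out this check, the minimal sketch below applies equation (1) to a set of coefficients and verifies that the resulting pressures decrease from bottom to top. It assumes that $P_s$ is given in hPa and that the coefficients are ordered from the lowest to the highest level; the function names and the coefficient values are hypothetical and do not come from a real VCOORD file.

```python
def hybrid_pressure_levels(sigma_a, sigma_b, surface_pressure_hpa):
    """Apply P = a * 1000 + b * P_s to each (a, b) pair; result in hPa.

    Assumption: surface_pressure_hpa is in hPa and the coefficients are
    scaled so that a * 1000 is also in hPa.
    """
    if len(sigma_a) != len(sigma_b):
        raise ValueError("sigma_a and sigma_b must have the same length")
    return [a * 1000.0 + b * surface_pressure_hpa
            for a, b in zip(sigma_a, sigma_b)]


def is_strictly_decreasing(values):
    """True if each level has a lower pressure than the level below it."""
    return all(upper < lower for lower, upper in zip(values, values[1:]))


# Made-up coefficients for illustration only (bottom level first).
sigma_a = [0.0, 0.05, 0.15, 0.20]
sigma_b = [0.995, 0.90, 0.60, 0.0]

levels = hybrid_pressure_levels(sigma_a, sigma_b, surface_pressure_hpa=1000.0)
print(levels)
print(is_strictly_decreasing(levels))
```

If the pressures are not monotonic, or if the top level does not approach the expected top pressure, the units or the ordering of the coefficients are likely to be wrong.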

Finally, the files must be copied into the reference domain directory defined in repgrid.

Warning

Check the $WORKDIR/LANDUSE/map_check.png figure, especially at the boundaries: NaN values there would cause problems in the meteorological processing.
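The visual check can be complemented by a numerical one. The sketch below assumes the land-use field has already been loaded as a 2D list of floats (the loading step depends on the file format and is not shown); the function name is hypothetical. It reports only the NaN values located on the outer edge of the grid.

```python
import math


def nan_border_cells(field):
    """Return (row, col) indices of NaN values on the outer edge of a 2D field."""
    nrows, ncols = len(field), len(field[0])
    return [
        (i, j)
        for i in range(nrows)
        for j in range(ncols)
        if (i in (0, nrows - 1) or j in (0, ncols - 1))
        and math.isnan(field[i][j])
    ]


# Illustrative 3x3 field: one NaN on the border, one in the interior.
nan = float("nan")
field = [
    [1.0, 2.0, nan],
    [1.0, nan, 3.0],
    [4.0, 5.0, 6.0],
]
print(nan_border_cells(field))
```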

2. TROPOMI XCH4 observations

Three TROPOMI retrieval products are available: the operational product developed by SRON, the BLENDED TROPOMI+GOSAT research product from Harvard University and the WFMD research product from the University of Bremen.

The CIF includes dedicated obsparser plugins to automatically preprocess the L2 inputs from the different products and to create standard monitor files. The recommended quality filters are applied within the data processing. The variable format (in datavect) has to be set to the corresponding product: CH4-RPRO for the reprocessed operational product, CH4-WFMD for the WFMD product and CH4-BLENDED for the BLENDED product. The resulting monitor files can be found in $WORKDIR/obs/CH4.

Note

Observations can be aggregated into super-observations (at the spatial scale and the time resolution of the model). This reduces memory usage and avoids memory overload when there are too many observations.
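The idea behind super-observations can be sketched as follows. This is not the CIF implementation: it assumes a regular lon/lat grid, a plain arithmetic mean and a hypothetical input layout of (time, lon, lat, value) tuples, and it ignores the averaging kernels and uncertainties that a real satellite aggregation has to handle.

```python
from collections import defaultdict
from math import floor


def aggregate_super_observations(obs, lon0, lat0, dx, dy, dt_seconds):
    """Average observations falling in the same grid cell and time window.

    obs: iterable of (time_seconds, lon, lat, value) tuples.
    Returns {(time_index, i, j): (mean_value, count)}.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for t, lon, lat, value in obs:
        key = (
            floor(t / dt_seconds),
            floor((lon - lon0) / dx),
            floor((lat - lat0) / dy),
        )
        sums[key][0] += value
        sums[key][1] += 1
    return {key: (total / n, n) for key, (total, n) in sums.items()}


# Four illustrative soundings (values in ppb), 0.5 degree cells, hourly windows.
obs = [
    (0, 2.1, 48.1, 1850.0),
    (600, 2.3, 48.2, 1860.0),    # same cell and hour as the first one
    (4000, 2.1, 48.1, 1870.0),   # same cell, next hour
    (100, 3.6, 48.1, 1900.0),    # different cell
]
super_obs = aggregate_super_observations(obs, 2.0, 48.0, 0.5, 0.5, 3600)
print(len(obs), "observations ->", len(super_obs), "super-observations")
```

The number of entries passed to the model is bounded by the number of grid cells and time windows rather than by the number of soundings, which is where the memory saving comes from.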

3. Processing of other input data

In regional CH4 simulations, the chemistry of methane is neglected because its atmospheric lifetime (~9 to 10 years) is long compared with the residence time of an air parcel in the domain (~15 days). The following inputs are required:

a. Meteorological inputs

Meteorological inputs are downloaded from the ECMWF datastore. More details are available in the dedicated documentation. In particular, the dpdd, dpdu, dped, dpeu variables are required only if deep convection is used.

b. CAMS background

The background is composed of three elements: initial conditions, boundary conditions (lateral and top boundaries) and the stratosphere. Dedicated documentation is available here. The CHIMERE domain has a limited volume: horizontally it covers the specified region and vertically the troposphere. The system therefore requires an initial CH4 concentration field, as well as boundary conditions to compute the incoming and outgoing fluxes at the sides and the top of the domain. It also requires the concentration field in the stratosphere to build the total column seen by the satellite, which measures the column over the whole atmosphere even though it is more sensitive to the troposphere. The domain is not extended to the stratosphere because the time scales differ between the troposphere and the stratosphere: the high wind speeds of stratospheric jet streams would require short time steps to simulate transport accurately, sharply increasing the cost of the simulation even though CH4 is mainly transported in the troposphere.

The data are taken from the CAMS reanalysis GHG concentration product, which is based on surface and satellite observations (Agustí-Panareda et al., 2023). The data can be accessed here.

c. Prior CH4 emissions

The prior fluxes are extracted from bottom-up inventories and emission models. In the CIF-CHIMERE framework, prior flux inputs are AEMISSIONS.nc NetCDF files, which have a standardized structure on a time/lat/lon grid. All CH4 emissions are considered 2D (emitted at ground level). As most inputs are monthly datasets, monthly AEMISSIONS files are used. In the yaml below, all datasets and categories (each variable of each input dataset) have to be set in datavect. Two transforms are necessary in controlvect:

Summing the sub-categories into total emissions or into aggregated sectors

Dumping the files into netcdf format.
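The first transform amounts to an element-wise sum of gridded fields. The sketch below only illustrates that arithmetic; it is not how CIF implements the transform, and the function, category and sector names are hypothetical. It assumes all sub-category fields are on the same lat/lon grid and in the same unit.

```python
def sum_categories(fields, families):
    """Sum 2D sub-category fields into aggregated sectors.

    fields: {category_name: 2D list [lat][lon] of fluxes on a common grid}
    families: {sector_name: [category_name, ...]}
    """
    totals = {}
    for sector, members in families.items():
        grids = [fields[name] for name in members]
        totals[sector] = [
            [sum(cells) for cells in zip(*rows)]  # sum across categories
            for rows in zip(*grids)               # one row per latitude
        ]
    return totals


# Hypothetical sub-categories on a 2x2 grid.
fields = {
    "agriculture": [[1.0, 2.0], [3.0, 4.0]],
    "waste": [[0.5, 0.5], [0.5, 0.5]],
    "wetlands": [[10.0, 0.0], [0.0, 10.0]],
}
families = {
    "anthropogenic": ["agriculture", "waste"],
    "natural": ["wetlands"],
}
totals = sum_categories(fields, families)
print(totals["anthropogenic"])
```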

The following variables have to be modified accordingly:

workdir, datei, datef
model, domain: the values are not used but are necessary for initialization
controlvect: the parameters to be summed (transform families). The parameter names have to be consistent between controlvect and datavect.
datavect: for each category, the dataset (dir, file, file_freq), the variable (varname, var_freq), and a unit_conversion factor if required

Warning

Some yaml arguments are not directly used but are necessary for initialization: the transforms in the obsoperator require model, which in turn requires a chemistry. platform is necessary to load the modules needed to compile the model.

In the yaml example available below, emissions are based on datasets of EDGARv8 (anthropogenic emissions), GFED for biomass burning, VISIT for wetlands and soil sinks, and other reference GCP datasets for natural emissions (freshwaters, termites, geological emissions, oceans). All (sub-)sectors are loaded as a variable in datavect, then aggregated in the transforms of the controlvect.

The resulting AEMISSIONS files can be found at the following location: $WORKDIR/fwd_0000/[date]/AEMISSIONS.YYYYMM01.nc. It is important that all AEMISSIONS.YYYYMM01.nc and AEMISSIONS.nc files are identical, including the AEMISSIONS.YYYYMM01.nc files found in the different daily folders of the same month. The AEMISSIONS file for each year and month can therefore be taken from the first daily folder of the month: YYYY-MM-01_00-00/AEMISSIONS.YYYYMM01.nc.
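To collect the monthly files over a simulation period, the path pattern above can be expanded programmatically. The helper below is a sketch with a hypothetical name; it only builds the paths from the pattern given here and does not check that the files exist.

```python
from datetime import date
from pathlib import PurePosixPath


def monthly_aemissions_paths(workdir, start, end):
    """List one AEMISSIONS path per month from start to end (inclusive)."""
    paths = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        folder = f"{year:04d}-{month:02d}-01_00-00"
        name = f"AEMISSIONS.{year:04d}{month:02d}01.nc"
        paths.append(PurePosixPath(workdir) / "fwd_0000" / folder / name)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return paths


# "/work" stands in for $WORKDIR in this example.
for path in monthly_aemissions_paths("/work", date(2019, 11, 1), date(2020, 2, 1)):
    print(path)
```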

Warning

To produce sectoral AEMISSIONS files, run the script with the input datasets restricted to those that correspond to the sector.

d. Country/region mask [optional]

A country or region mask can be used at different stages of the inversion and of the post-processing of the outputs. It is optional, but preparing one before running the inversion is recommended. During the inversion, it can be used to define the resolution of the control vector components: the fluxes can be optimized either at the scale of pixels or at the scale of regions (the latter is required if the domain is large, because the computation of correlations can overload the memory). During the post-processing, it can be used to plot country/region averages and budgets.

The following script can be used to create the mask in a CIF-compatible format. The script requires a shapefile mask of regions as an input. It generates two files:

regions_[DOMAIN].nc: composed of a single variable regions, which is a lat/lon array in which every region has a different integer value (0 for oceans, 1 for region 1, etc.). This file is to be used in the inversion yaml, in datavect/regions_infos.

mask_[DOMAIN].nc: composed of one variable per region, which is a lat/lon binary array composed of 1 for pixels in the region, 0 for others. There is also a total variable equivalent to the variable in regions_[DOMAIN].nc. This file is to be used for plots.
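The relation between the two files can be illustrated with plain arrays. The sketch below assumes the regions array is already available as a 2D list of integers (reading the shapefile and writing the NetCDF files are not shown), and the function name is hypothetical. It derives one binary mask per region from the integer array.

```python
def regions_to_masks(regions):
    """Build one binary mask per positive region ID from an integer lat/lon array."""
    region_ids = sorted({value for row in regions for value in row if value > 0})
    return {
        region_id: [
            [1 if value == region_id else 0 for value in row]
            for row in regions
        ]
        for region_id in region_ids
    }


# Illustrative 3x3 array: 0 for oceans, 1 and 2 for two regions.
regions = [
    [0, 1, 1],
    [0, 2, 2],
    [0, 0, 2],
]
masks = regions_to_masks(regions)
print(masks[1])
print(masks[2])
```

Multiplying each mask by its region ID and summing the results recovers the original regions array, which is a simple consistency check between the two files.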
