Processing of satellite observations (satellites / std)

Description

The present plugin manages satellite observations. This includes:

  • applying averaging kernel formulae

  • unfolding satellite levels to individual observations to extract proper levels from models

  • vertically interpolating simulated values to averaging kernels’ pressure levels

  • optionally, for domains not extending to the top of the atmosphere, fetching stratospheric concentrations from another dataset

The satellites transform is triggered by keywords in the YAML configuration file. In the datavect paragraph, the satellites paragraph should be formatted as follows:

#####################
# pyCIF config file #
#####################

# Define here all parameters for pyCIF following YAML syntax
# For details on YAML syntax, please see:
# http://docs.ansible.com/ansible/latest/YAMLSyntax.html

###############################################################################
# pyCIF parameters

###############################################################################

datavect:
  components:
    satellites:
      parameters:
        NO2:
          chosenlev: 0
          correct_pthick: false
          cropstrato: true
          dir: /tmp/PYCIF_DATA_TEST/CHIMERE/ACADOK
          extend_surf: false
          file: 'monitor_OMIQA4ECV_NO2_ACADOK.%Y%m%d.9H.nc'
          fill_strato: true
          formula: 2
          molmass: 10
          nchunks: 2
          pressure: Pa
    stratosphere:
      parameters:
        NO2:
          plugin:
            name: ECMWF
            type: fields
            version: grib2
          dir: '/tmp/PYCIF_DATA_TEST//RAW//ECMWF/europe160/%Y/%m/'
          file: 'macc.160europe.%Y%m%d0000%H.grb2'
          regrid:
            method: bilinear
          unit_conversion:
            scale: 1810375000.0
          varname: co2

The stratosphere paragraph is optional in general; it is required only when the option fill_strato is enabled.

Note

The detailed expected satellite observation file format is explained here.

Yaml arguments

The following arguments are used to configure the plugin. pyCIF raises an exception at initialization if a mandatory argument is not specified, or if any argument does not match the accepted values or type:

Mandatory arguments

formula: (mandatory)

Number of the formula to use to apply the averaging kernels. Here is the detail of each variable:

  • \(nlevsat\) the number of levels of the satellite,

  • \(y\) the equivalent of the satellite data,

  • \(y^s\) the simulated concentrations interpolated on these levels,

  • \(\Delta P_i\) the pressure thicknesses of these levels [remark: if the thicknesses are not provided directly, this implies that the pressures for the sides of the levels are provided, including the surface pressure for the bottom of the lowest level],

  • \(y^0_i\) the prior concentrations on these levels (prior profile),

  • \(ak_i\) the averaging kernels, and, if relevant,

  • \(chosenlev\) the number of the level of the chosen partial column.

Available formulae are:

accepted values:

  • 1: Formula 1: \(y= \frac{\sum_{i=1}^{nlevsat}y^s_i \Delta P_i ak_i}{\sum_{i=1}^{nlevsat}\Delta P_i ak_i}\)

  • 2: Formula 2: \(y= \sum_{i=1}^{nlevsat}y^s_i ak_i\)

  • 3: Formula 3: \(y= 10^{\left(\log y^0_{chosenlev}+\sum_{i=1}^{nlevsat}(\log y^s_i-\log y^0_i)ak_i\right)}\)

  • 4: Formula 4: \(y= \sum_{i=1}^{nlevsat}y^s_i ak_i 10^3\)

  • 5: Formula 5: \(y= \frac{y^0+\sum_{i=1}^{nlevsat}ak_i(y_{dry,i}^s \cdot dryair_i-y^0_i)}{dryair_{tot}}\)

  • 8: Formula 8: \(y= \sum_{i=1}^{nlevsat} \left( ak_i(y_i^s - y^0_i) \right) + y^0_{chosenlev}\)

  • 10: Formula 10: \(y= \sum_{i=1}^{nlevsat} \left\{ \left[ y^0_i + ( y^s_i - y^0_i ) ak_i \right] pwgt \right\}\)
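To make the notation concrete, here is a minimal pure-Python sketch of formulae 1, 2 and 8 as written above. It is only an illustration of the arithmetic, not pyCIF's implementation; the function names and the three-level profile are made up for the example.

```python
def formula_1(ys, dp, ak):
    """Formula 1: mean of ys weighted by pressure thickness times kernel."""
    numerator = sum(y * d * a for y, d, a in zip(ys, dp, ak))
    denominator = sum(d * a for d, a in zip(dp, ak))
    return numerator / denominator


def formula_2(ys, ak):
    """Formula 2: kernel-weighted sum of the simulated profile."""
    return sum(y * a for y, a in zip(ys, ak))


def formula_8(ys, y0, ak, chosenlev):
    """Formula 8: kernel-weighted departure from the prior, plus the prior at chosenlev."""
    return sum(a * (y - p) for y, p, a in zip(ys, y0, ak)) + y0[chosenlev]


# Illustrative three-level profile (made-up numbers, not real data).
ys = [2.0, 4.0, 6.0]        # simulated concentrations y^s_i
y0 = [1.0, 3.0, 5.0]        # prior profile y^0_i
dp = [100.0, 200.0, 100.0]  # pressure thicknesses Delta P_i
ak = [1.0, 0.5, 0.5]        # averaging kernels ak_i

print(formula_1(ys, dp, ak))
print(formula_2(ys, ak))
print(formula_8(ys, y0, ak, chosenlev=0))
```

With these numbers the three formulae give 3.6, 7.0 and 3.0 respectively, which shows how strongly the result depends on the chosen formula for the same profile and kernels.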

molmass: (mandatory)

Molar mass (in g/mol) of the species whose field is read in the stratosphere files. It is used if fill_strato is True and product is column.

accepted type: float

Optional arguments

parameter: (optional)

Name of the parameter on which the transform works

accepted type: str

component: (optional)

Name of the component on which the transform works

accepted type: str

orig_parameter_plg: (optional)

Plugin object on which the transform works

accepted type: Plugin

orig_component_plg: (optional)

Corresponding component object on which the transform works

accepted type: Plugin

successor: (optional)

Name of the successor transform

accepted type: str

precursor: (optional)

Name of the precursor transform

accepted type: str

ignore_formula: (optional): False

Ignore the given formula number and directly use the individual options (see below):

  • log_space

  • precomputed_pwgt

  • use_prior

  • unit_scaling

  • normalize_columns

  • use_drycols

  • scale_dpressure

accepted type: bool

product: (optional): level

Type of product

accepted values:

  • level: Levels in ppb

  • column: Total column in molec.cm-2.

pressure: (optional): hPa

Unit for the pressure levels

accepted values:

  • hPa: hectoPascals

  • Pa: Pascals

vinterp_type: (optional): weight

Type of vertical interpolation

accepted values:

  • weight: pressure weighted interpolation

  • linear: linear interpolation between middles of cells

weights_nsubsteps: (optional): 20

Number of sub-steps into which each target level is divided for the weighted interpolation. The smaller the sub-step (i.e. the larger this number), the higher the precision

accepted type: float
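The effect of the number of sub-steps can be illustrated with a small sketch. It assumes one plausible reading of the pressure-weighted interpolation: each target layer is cut into equal pressure sub-steps, each sub-step takes the value of the model layer that contains its mid-point, and the results are averaged. The function `weighted_interp` and its arguments are hypothetical and are not taken from pyCIF.

```python
def weighted_interp(model_pres, model_vals, p_bot, p_top, nsubsteps=20):
    """Average model values over the target layer [p_bot, p_top].

    model_pres: n+1 interface pressures, decreasing (surface first).
    model_vals: n layer values.
    Assumption: sub-steps outside the model range use the nearest model layer.
    """
    step = (p_bot - p_top) / nsubsteps
    total = 0.0
    for k in range(nsubsteps):
        # Mid-point pressure of the k-th sub-step of the target layer
        p_mid = p_bot - (k + 0.5) * step
        # Default to the top model layer, then search from the surface upwards
        idx = len(model_vals) - 1
        for i in range(len(model_vals)):
            if p_mid >= model_pres[i + 1]:
                idx = i
                break
        total += model_vals[idx]
    return total / nsubsteps


# Two model layers: 1000-800 hPa (value 1.0) and 800-600 hPa (value 3.0)
pres = [1000.0, 800.0, 600.0]
vals = [1.0, 3.0]

coarse = weighted_interp(pres, vals, 900.0, 600.0, nsubsteps=1)
fine = weighted_interp(pres, vals, 900.0, 600.0, nsubsteps=300)
print(coarse, fine)
```

For the target layer 900-600 hPa, the exact pressure-weighted mean is (100 × 1.0 + 200 × 3.0) / 300 = 7/3. A single sub-step returns 3.0, whereas 300 sub-steps recover 7/3, at the price of more computation.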

nchunks: (optional): 50

Number of chunks for the application of averaging kernels. Averaging kernels are applied by chunks rather than observation by observation to accelerate computation. Chunks should be neither too small nor too large. As a rule of thumb, chunks of a few hundred to one thousand observations work fine. Smaller chunks lose the advantage of chunk-based computation, while chunks that are too big can overload memory. For very small numbers of observations, 1-2 chunks are sufficient; for big datasets, one should test different numbers of chunks (a few tens is typically recommended)

accepted type: int
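As an illustration of the rule of thumb above, the following sketch splits a number of observations into nearly equal contiguous chunks and reports their sizes. The helper `chunk_bounds` is hypothetical; how pyCIF actually partitions observations is not described here.

```python
def chunk_bounds(nobs, nchunks):
    """Return (start, stop) index pairs splitting nobs observations into chunks."""
    if nobs <= 0:
        return []
    # Never create more chunks than there are observations
    nchunks = max(1, min(nchunks, nobs))
    base, extra = divmod(nobs, nchunks)
    bounds = []
    start = 0
    for k in range(nchunks):
        # The first `extra` chunks receive one additional observation
        size = base + (1 if k < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


# 20000 observations with the default of 50 chunks: 400 observations per chunk
sizes = [stop - start for start, stop in chunk_bounds(20000, 50)]
print(min(sizes), max(sizes))

# A small dataset: asking for more chunks than observations is capped
print(chunk_bounds(10, 3))
```

With 20000 observations, the default of 50 chunks yields chunks of 400 observations, which falls in the recommended range of a few hundred to one thousand.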

cropstrato: (optional): False

Crop stratospheric averaging kernels. All averaging kernels above the top of the model are excluded. Warning: for domains that do not extend to the top of the atmosphere, if cropstrato is False, the top-most value from the model is extrapolated to the top of the atmosphere, potentially biasing results (not recommended)

accepted type: bool
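A minimal sketch of the cropping idea, under the assumption that "excluded" can be represented by giving a zero weight to every satellite level whose pressure is lower than the model-top pressure. The function name and the sample values are illustrative only.

```python
def crop_aks(ak, sat_pressures, model_top_pressure):
    """Zero the averaging kernels of satellite levels located above the model top.

    A level is above the model top when its pressure is lower than the
    model-top pressure (both in the same unit).
    """
    return [
        a if p >= model_top_pressure else 0.0
        for a, p in zip(ak, sat_pressures)
    ]


ak = [1.0, 0.8, 0.6, 0.4]
sat_pressures = [900.0, 500.0, 150.0, 50.0]  # hPa, made-up values
cropped = crop_aks(ak, sat_pressures, model_top_pressure=200.0)
print(cropped)
```

Here the two levels at 150 and 50 hPa lie above a model top at 200 hPa, so their kernels no longer contribute to the simulated equivalent.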

fill_strato: (optional): False

Fill the stratosphere from a global model (temporary implementation)

accepted type: int

correct_pthick: (optional): False

Correct for the thickness of the column. Due to topography and surface pressure approximations and errors in the model, the total column of air is never exactly the same between observations and the model. For that reason, a correcting factor can be applied to scale the thickness of the model column to make it fit the observed column

accepted type: bool
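The correction can be pictured as a single scaling factor, sketched below under the assumption that the factor is the ratio of the observed to the modelled total column thickness. The names `pthick_factor`, `obs_dp` and `model_dp` are hypothetical, and the actual computation in pyCIF may differ.

```python
def pthick_factor(obs_dp, model_dp):
    """Ratio between the observed and the modelled total column thickness."""
    return sum(obs_dp) / sum(model_dp)


# Made-up pressure thicknesses (hPa) for the observation and the model
obs_dp = [300.0, 300.0, 400.0]            # total: 1000 hPa
model_dp = [250.0, 250.0, 300.0, 150.0]   # total: 950 hPa

factor = pthick_factor(obs_dp, model_dp)
# Scale every model layer so that the model column fits the observed one
scaled_model_dp = [d * factor for d in model_dp]
print(factor, sum(scaled_model_dp))
```

After scaling, the model layers keep their relative thicknesses while their sum matches the observed column.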

chosenlev: (optional): -1

For formula type #3, level at which the equivalent of the observation is computed. Counting starts at 0.

accepted type: int
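The role of chosenlev in formula 3 can be seen in the short sketch below. It assumes that "log" in the formula denotes the base-10 logarithm, consistent with the power of 10 in the expression; the function name and numbers are illustrative only.

```python
import math


def formula_3(ys, y0, ak, chosenlev):
    """Formula 3: apply the averaging kernels in log space around the prior."""
    log_departure = sum(
        a * (math.log10(y) - math.log10(p))
        for y, p, a in zip(ys, y0, ak)
    )
    # The prior value at level `chosenlev` (0-based) anchors the result
    return 10.0 ** (math.log10(y0[chosenlev]) + log_departure)


y0 = [1.0, 10.0]     # prior profile
ys = [10.0, 100.0]   # simulated profile, ten times the prior on each level
ak = [0.5, 0.5]      # averaging kernels

result = formula_3(ys, y0, ak, chosenlev=0)
print(result)
```

When the simulated profile equals the prior, the sum vanishes and the function simply returns the prior value at chosenlev.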

split_tropo_strato: (optional): False

Compute the contributions from the stratosphere and the troposphere separately, for debugging purposes

accepted type: bool

unit_scaling: (optional): 1

Re-scale aks using a given scaling factor

accepted type: float

log_space: (optional): False

Apply averaging kernels in a log space

accepted type: bool

precomputed_pwgt: (optional): False

Use pre-computed pressure weights

accepted type: bool

use_prior: (optional): False

Use a prior profile

accepted type: bool

use_drycols: (optional): False

Use dry air columns to scale simulations

accepted type: bool

scale_dpressure: (optional): False

Scale simulated level by pressure thickness

accepted type: bool

normalize_columns: (optional): False

Normalize total columns according to aks and pressure weights, or drycols if use_drycols = True

accepted type: bool

level_based: (optional): False

Whether averaging kernel pressure levels are defined at the middle of levels (True), hence with n values, or at the interfaces (False), hence with n+1 values

accepted type: bool
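The difference between the two conventions comes down to how many pressure values accompany n averaging kernels. The sketch below, with hypothetical helper names, shows the expected count in each case and how pressure thicknesses follow from interface pressures.

```python
def expected_pressure_count(nlev, level_based):
    """Number of pressure values expected for nlev averaging kernels."""
    return nlev if level_based else nlev + 1


def layer_thicknesses(interfaces):
    """Pressure thicknesses from n+1 interface pressures (surface first)."""
    return [
        interfaces[i] - interfaces[i + 1]
        for i in range(len(interfaces) - 1)
    ]


interfaces = [1000.0, 700.0, 400.0, 100.0]  # made-up values, 3 layers
print(expected_pressure_count(3, level_based=False))
print(layer_thicknesses(interfaces))
```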

force_dump_sim_aks: (optional): False

Force dump the full dataframe before applying aks for debugging purposes

accepted type: bool

Yaml template

Please find below a template for a Yaml configuration:

transform:
  plugin:
    name: satellites
    version: std
    type: transform

  # Mandatory arguments
  formula: XXXXX  # 1|2|3|4|5|8|10
  molmass: XXXXX  # float

  # Optional arguments
  parameter: XXXXX  # str
  component: XXXXX  # str
  orig_parameter_plg: XXXXX  # Plugin
  orig_component_plg: XXXXX  # Plugin
  successor: XXXXX  # str
  precursor: XXXXX  # str
  ignore_formula: XXXXX  # bool
  product: XXXXX  # level|column
  pressure: XXXXX  # hPa|Pa
  vinterp_type: XXXXX  # weight|linear
  weights_nsubsteps: XXXXX  # float
  nchunks: XXXXX  # int
  cropstrato: XXXXX  # bool
  fill_strato: XXXXX  # int
  correct_pthick: XXXXX  # bool
  chosenlev: XXXXX  # int
  split_tropo_strato: XXXXX  # bool
  unit_scaling: XXXXX  # float
  log_space: XXXXX  # bool
  precomputed_pwgt: XXXXX  # bool
  use_prior: XXXXX  # bool
  use_drycols: XXXXX  # bool
  scale_dpressure: XXXXX  # bool
  normalize_columns: XXXXX  # bool
  level_based: XXXXX  # bool
  force_dump_sim_aks: XXXXX  # bool
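
Before launching a run, it can be handy to check a configuration against the mandatory arguments and accepted values listed on this page. The following is a minimal stand-alone sketch of such a check. It is not pyCIF's own validation code; the function `check_satellites_config` and the dictionary layout are assumptions made for the example.

```python
# Accepted values copied from the argument descriptions on this page
ACCEPTED = {
    "formula": {1, 2, 3, 4, 5, 8, 10},
    "product": {"level", "column"},
    "pressure": {"hPa", "Pa"},
    "vinterp_type": {"weight", "linear"},
}
MANDATORY = ("formula", "molmass")


def check_satellites_config(cfg):
    """Raise if a mandatory argument is missing or a value is not accepted."""
    missing = [key for key in MANDATORY if key not in cfg]
    if missing:
        raise KeyError(f"Missing mandatory arguments: {missing}")
    for key, allowed in ACCEPTED.items():
        if key in cfg and cfg[key] not in allowed:
            raise ValueError(f"{key}={cfg[key]!r} not in {sorted(allowed, key=str)}")
    return True


# Values taken from the NO2 example at the top of this page
config = {"formula": 2, "molmass": 10, "pressure": "Pa", "nchunks": 2}
print(check_satellites_config(config))
```

Only the arguments with an enumerated list of accepted values are checked here; type checks for the remaining arguments would follow the same pattern.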