4DVAR variational inversions 4dvar/std#

Description#

4D-Var variational inversion: iterative minimisation of the Bayesian cost function.

Mathematical framework#

The variational mode solves the Bayesian maximum a posteriori (MAP) problem by minimising a scalar cost function \(J\) with respect to a reduced-space vector \(\boldsymbol{\chi}\).

Cost function#

Working in the pre-conditioned chi-space defined by \(\mathbf{x} = \mathbf{x}_b + \mathbf{L}\boldsymbol{\chi}\) where \(\mathbf{L} = \mathbf{B}^{1/2}\) is the (right) square root of the background error covariance, the cost function reads:

\[J(\boldsymbol{\chi}) = \underbrace{\tfrac{1}{2}\,\boldsymbol{\chi}^\top \boldsymbol{\chi}}_{J_b} + \underbrace{\tfrac{1}{2} \bigl(\mathbf{y} - \mathcal{H}(\mathbf{x}_b + \mathbf{L}\boldsymbol{\chi})\bigr)^\top \mathbf{R}^{-1} \bigl(\mathbf{y} - \mathcal{H}(\mathbf{x}_b + \mathbf{L}\boldsymbol{\chi})\bigr)}_{J_o}\]

where \(\mathbf{y}\) are the observations and \(\mathbf{R}\) is the observation error covariance matrix.

Gradient#

The gradient of \(J\) with respect to \(\boldsymbol{\chi}\) is

\[\nabla_{\boldsymbol{\chi}} J = \boldsymbol{\chi} - \mathbf{L}^\top \mathbf{H}^\top \mathbf{R}^{-1} \bigl(\mathbf{y} - \mathcal{H}(\mathbf{x})\bigr)\]

The term \(\mathbf{H}^\top \mathbf{R}^{-1}\delta\mathbf{y}\) is computed by the adjoint of the observation operator (one backward model run), while \(\mathbf{L}^\top\) applies the adjoint of the square-root-B operator.

Minimisation#

An external minimizer plugin (e.g. M1QN3, L-BFGS-B) iterates:

  1. Evaluate \(J(\boldsymbol{\chi}_k)\) and \(\nabla J(\boldsymbol{\chi}_k)\) via one forward + one adjoint run.

  2. Compute a descent direction (quasi-Newton step).

  3. Line search to find \(\boldsymbol{\chi}_{k+1}\).

until the gradient norm falls below the convergence threshold.

Posterior state and diagnostics#

After convergence, the posterior state is recovered as:

\[\mathbf{x}_a = \mathbf{x}_b + \mathbf{L}\boldsymbol{\chi}_\mathrm{opt}\]

The chi-square diagnostic

\[\chi^2 = \frac{2\,J(\boldsymbol{\chi}_\mathrm{opt})}{m}\]

where \(m\) is the number of observations, should be close to 1 for a well-calibrated system (prior and observation errors consistent with the data-model mismatch).

Monte Carlo posterior uncertainty#

When montecarlo is configured, \(N\) independent inversions are run with perturbed prior states \(\mathbf{x}_b^{(k)} \sim \mathcal{N}(\mathbf{x}_b, \mathbf{B})\) and/or perturbed observations \(\mathbf{y}^{(k)} \sim \mathcal{N}(\mathbf{y}, \mathbf{R})\). The sample covariance of the resulting posterior ensemble estimates \(\mathbf{P}_a\):

\[\mathbf{P}_a \approx \frac{1}{N-1} \sum_{k=1}^{N} (\mathbf{x}_a^{(k)} - \bar{\mathbf{x}}_a) (\mathbf{x}_a^{(k)} - \bar{\mathbf{x}}_a)^\top\]

YAML arguments#

The following arguments are used to configure the plugin. pyCIF will return an exception at the initialization if mandatory arguments are not specified, or if any argument does not fit accepted values or type:

save_out_netcdf : bool, optional, default False

Save final posterior vector as NetCDF. This argument overwrites the corresponding argument in the controlvect.

montecarlo : optional

Number of perturbed inversions to compute posterior inversions.

Argument structure:
nsample : True, mandatory

Number of members in the Monte-Carlo. The control run with no perturbation will be computed as the last member of the ensemble.

perturb_x : bool, optional, default True

Perturb the control vector or not

seed_x : int, optional

Seed for the random generator for perturbing x.Useful to run the same perturbationsfor various cases

perturb_y : bool, optional, default True

Perturb the observation vector or not

seed_y : int, optional

Seed for the random generator for perturbing y.Useful to run the same perturbationsfor various cases

compute_reference : bool, optional, default True

Recompute the unperturbed member

aggregate_results : bool, optional, default False

Aggregate ensemble results and re-run prior/posterior cost functions

save_out_netcdf : bool, optional, default False

Save sample control vector as NetCDF. This argument overwrites the corresponding argument in the controlvect.

Requirements#

The current plugin requires the present plugins to run properly:

Requirement name

Requirement type

Explicit definition

Any valid

Default name

Default version

obsvect

ObsVect

False

True

standard

std

controlvect

ControlVect

True

True

standard

std

obsoperator

ObsOperator

True

True

standard

std

minimizer

Minimizer

True

True

M1QN3

std

simulator

Simulator

True

True

gausscost

std

platform

Platform

True

True

None

None

YAML template#

Please find below a template for a YAML configuration:

 1mode:
 2  plugin:
 3    name: 4dvar
 4    version: std
 5    type: mode
 6
 7  # Optional arguments
 8  save_out_netcdf: XXXXX  # bool
 9  montecarlo:
10    nsample: XXXXX  # True
11    perturb_x: XXXXX  # bool
12    seed_x: XXXXX  # int
13    perturb_y: XXXXX  # bool
14    seed_y: XXXXX  # int
15    compute_reference: XXXXX  # bool
16    aggregate_results: XXXXX  # bool
17    save_out_netcdf: XXXXX  # bool