Analytical inversions analytic/std#
Description#
Direct (analytical) Bayesian inversion via explicit H-matrix construction.
Mathematical framework#
This mode performs a Best Linear Unbiased Estimator (BLUE) inversion under the Gaussian error assumption. It first assembles the observation operator matrix \(\mathbf{H} \in \mathbb{R}^{m \times n}\) explicitly, column by column, then solves for the posterior state analytically.
Step 1 — Building H#
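The Jacobian matrix is constructed by running one forward simulation per control-vector dimension \(i\), each of which yields one column of \(\mathbf{H}\):
\[\mathbf{H}_{:,i} = \mathbf{H}\,\mathbf{e}_i, \qquad i = 1, \dots, n\]
where \(\mathbf{e}_i\) is the \(i\)-th canonical basis vector (all zeros except element \(i\), which is set to 1). This exploits the linearity of the operator: the response to any control vector \(\mathbf{x}\) is a weighted sum of the responses to the basis vectors,
\[\mathbf{H}\mathbf{x} = \sum_{i=1}^{n} x_i\,\mathbf{H}\,\mathbf{e}_i\]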
Each simulation is submitted as an independent pyCIF forward run stored under
$workdir/base_functions/.
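The column-by-column construction can be sketched in a few lines of NumPy. This is an illustration only, not pyCIF's implementation: `build_h` and `forward` are hypothetical names, and the small matrix `H_true` stands in for a full forward simulation that would normally be submitted as a separate job.

```python
import numpy as np


def build_h(forward, n):
    """Assemble H column by column from n forward runs on basis vectors.

    `forward` is a hypothetical callable mapping a control vector of
    length n to a simulated observation vector of length m.
    """
    columns = []
    for i in range(n):
        e_i = np.zeros(n)
        e_i[i] = 1.0  # Dirac control vector: only element i is set
        columns.append(np.asarray(forward(e_i), dtype=float))
    # Each forward run provides one column of H
    return np.column_stack(columns)


# Toy linear operator standing in for the real model (assumption)
H_true = np.array([[1.0, 2.0, 0.0],
                   [0.0, 1.0, 3.0]])


def forward(x):
    return H_true @ x


H = build_h(forward, n=3)
```

Because the operator is linear, `H @ x` then reproduces `forward(x)` for any control vector `x` without further simulations.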
Step 2 — Analytical inversion (BLUE)#
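Given the prior \(\mathbf{x}_b\) with background error covariance \(\mathbf{B} \in \mathbb{R}^{n \times n}\) and observations \(\mathbf{y}\) with observation error covariance \(\mathbf{R} \in \mathbb{R}^{m \times m}\), the posterior (analysis) state is:
\[\mathbf{x}_a = \mathbf{x}_b + \mathbf{K}\,(\mathbf{y} - \mathbf{H}\mathbf{x}_b)\]
where the Kalman gain is
\[\mathbf{K} = \mathbf{B}\mathbf{H}^\top (\mathbf{R} + \mathbf{H}\mathbf{B}\mathbf{H}^\top)^{-1}\]
and the posterior error covariance is
\[\mathbf{A} = (\mathbf{I} - \mathbf{K}\mathbf{H})\,\mathbf{B}\]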
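As a minimal sketch of this step, the snippet below applies the standard BLUE expressions, with gain \(\mathbf{K} = \mathbf{B}\mathbf{H}^\top(\mathbf{R} + \mathbf{H}\mathbf{B}\mathbf{H}^\top)^{-1}\), to a tiny toy problem. The function name `blue` and all numerical values are assumptions chosen for illustration; they do not reflect how pyCIF stores or solves the system internally.

```python
import numpy as np


def blue(x_b, B, y, R, H):
    """Return the analysis state, Kalman gain and posterior covariance."""
    innovation = y - H @ x_b          # observation-minus-prior mismatch
    S = R + H @ B @ H.T               # innovation covariance (m x m)
    # K = B H^T S^{-1}. B and S are symmetric, so K = (S^{-1} H B)^T,
    # which lets us solve a linear system instead of forming S^{-1}.
    K = np.linalg.solve(S, H @ B).T
    x_a = x_b + K @ innovation
    A = (np.eye(len(x_b)) - K @ H) @ B
    return x_a, K, A


# Toy problem (assumption): two control variables, one observation
x_b = np.array([0.0, 0.0])
B = np.eye(2)
H = np.array([[1.0, 1.0]])
y = np.array([3.0])
R = np.array([[1.0]])

x_a, K, A = blue(x_b, B, y, R, H)
```

In this toy case the single observation constrains only the sum of the two control variables, so the posterior covariance `A` has smaller variances than `B` and a negative off-diagonal term.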
Complexity and scalability#
The dominant cost is the \(n\) forward simulations required to build
\(\mathbf{H}\). The matrix inversion
\((\mathbf{R} + \mathbf{H}\mathbf{B}\mathbf{H}^\top)^{-1}\) is
\(\mathcal{O}(m^3)\) in the observation dimension — feasible for moderate
\(m\) but prohibitive for large observing systems. For large problems use
the variational (4dvar) or response-functions modes instead.
Warning
One forward simulation per control-vector dimension \(n\) is required.
Check \(n\) and the cost of a single forward run before launching.
Use the dryrun option to estimate the total wall-clock time without
committing to the full computation.
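Once the cost of a single run is known, a back-of-the-envelope estimate of the total time is straightforward. The helper below is a hypothetical sketch, not part of pyCIF: it assumes every forward run costs the same, ignores queue waiting time, and uses an illustrative `max_parallel` parameter for the number of jobs the platform runs concurrently (1 corresponds to fully sequential submission).

```python
import math


def estimate_wall_clock(n, seconds_per_run, max_parallel=1):
    """Rough total wall-clock time in seconds for n forward runs.

    Runs are assumed to execute in batches of `max_parallel`
    equally expensive jobs (hypothetical simplification).
    """
    if n < 0 or max_parallel < 1:
        raise ValueError("n must be >= 0 and max_parallel >= 1")
    batches = math.ceil(n / max_parallel)
    return batches * seconds_per_run


# Example: 100 control-vector dimensions, 60 s per forward run
sequential_cost = estimate_wall_clock(100, 60)
parallel_cost = estimate_wall_clock(100, 60, max_parallel=8)
```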
YAML arguments#
The following arguments configure the plugin. pyCIF raises an exception at initialization if a mandatory argument is missing, or if any argument has an invalid value or type:
- dump_nc_base_control : bool, optional, default False
Save each Dirac control vector (base function input) as NetCDF for post-hoc inspection of what was actually run.
- dryrun : bool, optional, default False
Submit only the first base function to estimate the per-run cost, then stop without completing the full H matrix.
- sequential : bool, optional, default False
Wait for each job to finish before submitting the next. Useful when concurrent submissions are restricted (e.g. GPU queues).
- resp_func_only : bool, optional, default False
Skip the inversion and run only the response functions needed to build the H matrix.
Requirements#
This plugin requires the following plugins to run properly:
| Requirement name | Requirement type | Explicit definition | Any valid | Default name | Default version |
|---|---|---|---|---|---|
| obsvect | | False | True | standard | std |
| controlvect | | True | True | standard | std |
| obsoperator | | True | True | standard | std |
| platform | | True | True | None | None |
YAML template#
Please find below a template for a YAML configuration:
mode:
  plugin:
    name: analytic
    version: std
    type: mode

  # Optional arguments
  dump_nc_base_control: XXXXX  # bool
  dryrun: XXXXX  # bool
  sequential: XXXXX  # bool
  resp_func_only: XXXXX  # bool