Control vectors#
In pyCIF, the control vector \(\mathbf{x}\) is stored in two main formats:
a pickle binary file directly usable by pyCIF
a NetCDF file intended for the user
pickle binary file#
pyCIF saves intermediate control vectors at different steps of its execution:
before executing forward simulations
at the end of adjoint simulations, both in variational inversions and in the test of the adjoint
at the very end of inversion modes to save the posterior control vector
The variables saved in the pickle are:
\(\mathbf{x}\): the current values of the control vector
\(\delta\mathbf{x}\): the adjoint sensitivities to the control vector
\(\mathbf{x}^\textrm{b}\): the prior (background) control vector
\(\boldsymbol{\sigma}^\textrm{b}\): diagonal of the background error covariance \(\mathbf{B}\)
\(\boldsymbol{\sigma}^\textrm{a}\): diagonal of the posterior error covariance \(\mathbf{P}^\textrm{a}\); only available when posterior uncertainties are computed
All these variables are dumped as one-dimensional vectors, and each term corresponds to one element of the control vector (hence not necessarily at the pixel resolution).
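As a minimal sketch of what "flat one-dimensional vectors" means in practice, the snippet below writes and reads back a stand-in pickle with Python's standard library. The dictionary keys (`x`, `dx`, `xb`, `std`) and the file name are hypothetical placeholders chosen for illustration; they are not the names pyCIF actually uses, and a real pyCIF pickle may hold a different object layout.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a dumped control vector. The key names are
# illustrative assumptions, not the names used by pyCIF.
fake_controlvect = {
    "x": [1.0, 1.2, 0.8],    # current values of the control vector
    "dx": [0.0, -0.1, 0.3],  # adjoint sensitivities
    "xb": [1.0, 1.0, 1.0],   # prior (background) values
    "std": [0.5, 0.5, 0.5],  # diagonal of the background error covariance
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "controlvect.pickle"  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(fake_controlvect, f)
    with open(path, "rb") as f:
        loaded = pickle.load(f)

# Every variable is a flat vector with one entry per control-vector element.
lengths = {name: len(values) for name, values in loaded.items()}

# Element-wise increment between the current and prior vectors.
increment = [x - xb for x, xb in zip(loaded["x"], loaded["xb"])]
print(lengths)
```

Note that `pickle.load` can execute arbitrary code, so only files from a trusted source should be opened this way.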
Warning
The pickle format stores all variables as flat 1-D arrays with no geographic metadata (no lat/lon/time coordinates). It is not suitable for post-processing outside a pyCIF context. Use the NetCDF output for any analysis or visualisation.
NetCDF file#
pyCIF can also dump the control vector in a more user-friendly NetCDF format, projected onto the physical horizontal, vertical and temporal resolutions.
To enable this, set the save_out_netcdf option in the controlvect paragraph of the YAML configuration file.
When dumping to NetCDF, pyCIF creates a directory tree that follows the structure of your datavect.
There is one directory per component and, inside each directory, one NetCDF file per species.
Below is a simple example:
controlvect
├── fluxes
│   ├── controlvect_fluxes_CH4.nc
│   ├── controlvect_fluxes_CO2.nc
│   └── controlvect_fluxes_N2O.nc
├── biofluxes
│   ├── controlvect_biofluxes_CO2.nc
│   └── controlvect_biofluxes_N2O.nc
└── inicond
    ├── controlvect_inicond_CO2.nc
    └── controlvect_inicond_CH4.nc
Only optimised variables (i.e., those declared with the hresol keyword in the YAML) are written.
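The layout above can be explored programmatically. The following sketch uses only the standard library and assumes the file naming pattern `controlvect_<component>_<species>.nc` seen in the example tree; the helper name `list_controlvect_files` is hypothetical and not part of pyCIF. It builds an empty copy of the example tree in a temporary directory purely for demonstration.

```python
import tempfile
from pathlib import Path


def list_controlvect_files(root):
    """Return {component: [species, ...]} for a NetCDF control-vector dump.

    Assumes files are named controlvect_<component>_<species>.nc, as in the
    example tree; this helper is illustrative and not part of pyCIF.
    """
    found = {}
    for nc_file in sorted(Path(root).glob("*/controlvect_*.nc")):
        component = nc_file.parent.name
        prefix = f"controlvect_{component}_"
        if not nc_file.stem.startswith(prefix):
            continue  # skip files that do not follow the assumed pattern
        species = nc_file.stem[len(prefix):]
        found.setdefault(component, []).append(species)
    return found


# Recreate the example tree with empty placeholder files.
example = {
    "fluxes": ["CH4", "CO2", "N2O"],
    "biofluxes": ["CO2", "N2O"],
    "inicond": ["CO2", "CH4"],
}
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "controlvect"
    for component, species_list in example.items():
        (root / component).mkdir(parents=True)
        for species in species_list:
            (root / component / f"controlvect_{component}_{species}.nc").touch()
    layout = list_controlvect_files(root)

print(layout)
```

On a real run, `root` would point to the NetCDF dump directory instead of a temporary one.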
Each NetCDF file has the following structure:
netcdf controlvect_fluxes_CH4 {
dimensions:
    time = 2 ;
    lev = 1 ;
    lat = 80 ;
    lon = 100 ;
    time_phys = 26 ;
    latc = 81 ;
    lonc = 101 ;
variables:
    int64 time(time) ;
        time:units = "days since 2019-01-01 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;
    int64 lev(lev) ;
    double x(time, lev, lat, lon) ;
        x:_FillValue = NaN ;
    double xb(time, lev, lat, lon) ;
        xb:_FillValue = NaN ;
    double b_std(time, lev, lat, lon) ;
        b_std:_FillValue = NaN ;
    int64 time_phys(time_phys) ;
        time_phys:units = "hours since 2019-01-01 00:00:00" ;
        time_phys:calendar = "proleptic_gregorian" ;
    double x_phys(time_phys, lev, lat, lon) ;
        x_phys:_FillValue = NaN ;
    double xb_phys(time_phys, lev, lat, lon) ;
        xb_phys:_FillValue = NaN ;
    double b_phys(time_phys, lev, lat, lon) ;
        b_phys:_FillValue = NaN ;
    double latitudes(lat, lon) ;
        latitudes:_FillValue = NaN ;
    double latitudes_corner(latc, lonc) ;
        latitudes_corner:_FillValue = NaN ;
    double longitudes(lat, lon) ;
        longitudes:_FillValue = NaN ;
    double longitudes_corner(latc, lonc) ;
        longitudes_corner:_FillValue = NaN ;

// global attributes:
        :_NCProperties = "version=1|netcdflibversion=4.6.1|hdf5libversion=1.10.4" ;
}
The NetCDF variables and their meaning:
| Variable | Description |
|---|---|
| x | Current (posterior) control vector at the inversion resolution (scaling factor or physical value, depending on the …). |
| xb | Prior (background) control vector at the inversion resolution. |
| b_std | Prior uncertainty (standard deviation, diagonal of \(\mathbf{B}^{1/2}\)) at the inversion resolution. |
| x_phys | Posterior control vector reprojected to the native physical resolution of the input data. Equal to … |
| xb_phys | Prior reprojected to native physical resolution. |
| b_phys | Prior uncertainty reprojected to native physical resolution. |
The horizontal extent covers the full model domain.
The temporal resolution of variables without the _phys suffix matches
the control-vector resolution set by tresol / tsubresol in the YAML.
Variables with the _phys suffix use the finer of the two temporal resolutions:
that of the control vector and that of the native input data.
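The two time axes use different units (`days since …` for `time`, `hours since …` for `time_phys`), which is easy to overlook when comparing variables with and without the _phys suffix. Libraries such as xarray or netCDF4 usually decode these axes automatically; the sketch below only illustrates the convention with the standard library. The helper name `decode_time` and the offset values are hypothetical; only the `units` strings are taken from the file header shown earlier.

```python
from datetime import datetime, timedelta


def decode_time(values, units):
    """Convert integer offsets to datetimes from a '<unit> since <date>' string.

    Illustrative helper handling only the 'days'/'hours'/'minutes'/'seconds'
    units with a 'YYYY-MM-DD HH:MM:SS' origin. Python's datetime uses the
    proleptic Gregorian calendar, matching the calendar attribute in the file.
    """
    step_name, _, origin = units.partition(" since ")
    origin_date = datetime.strptime(origin, "%Y-%m-%d %H:%M:%S")
    return [origin_date + timedelta(**{step_name: int(v)}) for v in values]


# Hypothetical offsets; a real file provides them in the time/time_phys variables.
time_axis = decode_time([0, 31], "days since 2019-01-01 00:00:00")
time_phys_axis = decode_time([0, 6, 12], "hours since 2019-01-01 00:00:00")

print(time_axis)
print(time_phys_axis)
```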
Dumping and loading control vectors#
Dump#
The control vector is dumped by the function:
pycif.plugins.controlvects.standard.dump.dump(self, cntrl_file, to_netcdf=False, dir_netcdf=None, ensemble=False, **kwargs)
Dumps a control vector into a pickle file. Does not save large correlations.
Parameters:
self (pycif.utils.classes.controlvects.ControlVect) – the Control Vector to dump
cntrl_file (str) – path to the file to dump as pickle
to_netcdf (bool) – save to netcdf files if True
dir_netcdf (str) – root path for the netcdf directory
Load#
The control vector is loaded by the function: