Control vectors#

In pyCIF, the control vector \(\mathbf{x}\) is stored in two main formats:

  • a pickle binary file directly usable by pyCIF

  • a NetCDF for the user

pickle binary file#

pyCIF saves intermediate control vectors as pickle files at different steps of its execution.

The variables saved in the pickle are:

  • \(\mathbf{x}\): the current values of the control vector

  • \(\delta\mathbf{x}\): the adjoint sensitivities to the control vector

  • \(\mathbf{x}^\textrm{b}\): the prior (background) control vector

  • \(\boldsymbol{\sigma}^\textrm{b}\): diagonal of the background error covariance \(\mathbf{B}\)

  • \(\boldsymbol{\sigma}^\textrm{a}\): diagonal of the posterior error covariance \(\mathbf{P}^\textrm{a}\); only available when posterior uncertainties are computed

All these variables are dumped as one-dimensional vectors, and each term corresponds to one element of the control vector (hence not necessarily at the pixel resolution).

Warning

The pickle format stores all variables as flat 1-D arrays with no geographic metadata (no lat/lon/time coordinates). It is not suitable for post-processing outside a pyCIF context. Use the NetCDF output for any analysis or visualisation.
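As a minimal sketch of what "flat 1-D arrays with no geographic metadata" means in practice, the snippet below writes a stand-in pickle and reads it back. The dictionary layout and the key names (`x`, `dx`, `xb`, `std`) are hypothetical and chosen only for illustration; the actual content of a pyCIF pickle should be checked by inspecting a real file.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a pyCIF control-vector pickle: the real key
# names and container type are assumptions and may differ.
fake_controlvect = {
    "x": [1.1, 0.9, 1.0, 1.2],    # current values
    "dx": [0.0, 0.1, -0.2, 0.0],  # adjoint sensitivities
    "xb": [1.0, 1.0, 1.0, 1.0],   # prior (background)
    "std": [0.5, 0.5, 0.5, 0.5],  # background uncertainties
}


def summarize(path):
    """Return {name: length} for every entry of a pickled dictionary."""
    # Only unpickle files from a trusted source: pickle can execute code.
    with open(path, "rb") as f:
        data = pickle.load(f)
    return {name: len(values) for name, values in data.items()}


with tempfile.TemporaryDirectory() as tmp:
    cntrl_file = Path(tmp) / "controlvect.pickle"
    with open(cntrl_file, "wb") as f:
        pickle.dump(fake_controlvect, f)
    summary = summarize(cntrl_file)

print(summary)
```

Every entry has the same length (one value per control-vector element) and nothing in the file says where or when each element applies, which is why the NetCDF output is preferable for analysis.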

NetCDF file#

It is possible to ask pyCIF to dump the control vector in a more user-friendly format as a NetCDF projected to physical horizontal, vertical and temporal resolutions.

The option to use is save_out_netcdf in the controlvect paragraph of the YAML configuration file.

When dumping as NetCDF, pyCIF creates a directory tree that follows the structure of your datavect, with one directory per component. Each directory contains one NetCDF file per species. Below is a simple example:

controlvect
├── fluxes
│   ├── controlvect_fluxes_CH4.nc
│   ├── controlvect_fluxes_CO2.nc
│   └── controlvect_fluxes_N2O.nc
├── biofluxes
│   ├── controlvect_biofluxes_CO2.nc
│   └── controlvect_biofluxes_N2O.nc
└── inicond
    ├── controlvect_inicond_CO2.nc
    └── controlvect_inicond_CH4.nc

Only optimised variables (i.e., those declared with the hresol keyword in the YAML) are written.
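The layout above can be traversed with a few lines of standard-library Python. The sketch below assumes that files always follow the `controlvect_<component>_<species>.nc` pattern seen in the example tree; the helper name `list_controlvect_files` is hypothetical and not part of pyCIF. Empty placeholder files are created so that the example runs on its own.

```python
import tempfile
from pathlib import Path


def list_controlvect_files(root):
    """Map (component, species) -> path for files named
    controlvect_<component>_<species>.nc under root/<component>/."""
    found = {}
    for path in sorted(Path(root).glob("*/controlvect_*.nc")):
        component = path.parent.name
        prefix = f"controlvect_{component}_"
        if path.name.startswith(prefix):
            species = path.stem[len(prefix):]
            found[(component, species)] = path
    return found


# Recreate the example layout with empty placeholder files.
layout = {
    "fluxes": ["CH4", "CO2", "N2O"],
    "biofluxes": ["CO2", "N2O"],
    "inicond": ["CO2", "CH4"],
}
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "controlvect"
    for component, species_list in layout.items():
        (root / component).mkdir(parents=True)
        for species in species_list:
            (root / component / f"controlvect_{component}_{species}.nc").touch()
    files = list_controlvect_files(root)
    keys = sorted(files)

print(keys)
```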

In each NetCDF file, the structure is as follows:

netcdf controlvect_fluxes_CH4 {
dimensions:
    time = 2 ;
    lev = 1 ;
    lat = 80 ;
    lon = 100 ;
    time_phys = 26 ;
    latc = 81 ;
    lonc = 101 ;
variables:
    int64 time(time) ;
        time:units = "days since 2019-01-01 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;
    int64 lev(lev) ;
    double x(time, lev, lat, lon) ;
        x:_FillValue = NaN ;
    double xb(time, lev, lat, lon) ;
        xb:_FillValue = NaN ;
    double b_std(time, lev, lat, lon) ;
        b_std:_FillValue = NaN ;
    int64 time_phys(time_phys) ;
        time_phys:units = "hours since 2019-01-01 00:00:00" ;
        time_phys:calendar = "proleptic_gregorian" ;
    double x_phys(time_phys, lev, lat, lon) ;
        x_phys:_FillValue = NaN ;
    double xb_phys(time_phys, lev, lat, lon) ;
        xb_phys:_FillValue = NaN ;
    double b_phys(time_phys, lev, lat, lon) ;
        b_phys:_FillValue = NaN ;
    double latitudes(lat, lon) ;
        latitudes:_FillValue = NaN ;
    double latitudes_corner(latc, lonc) ;
        latitudes_corner:_FillValue = NaN ;
    double longitudes(lat, lon) ;
        longitudes:_FillValue = NaN ;
    double longitudes_corner(latc, lonc) ;
        longitudes_corner:_FillValue = NaN ;

// global attributes:
        :_NCProperties = "version=1|netcdflibversion=4.6.1|hdf5libversion=1.10.4" ;
}
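The two time axes in the header use different units (`days since ...` for `time`, `hours since ...` for `time_phys`). The sketch below shows how such integer offsets can be decoded with the standard library alone; the helper `decode_cf_time` is hypothetical, handles only the two units shown above, and the offset values are made up for illustration, since the header does not list them. Python's `datetime` uses the proleptic Gregorian calendar, which matches the `calendar` attribute in the header.

```python
from datetime import datetime, timedelta


def decode_cf_time(values, units):
    """Decode integer offsets with units '<unit> since YYYY-MM-DD HH:MM:SS'."""
    unit, _, reference = units.partition(" since ")
    if unit not in ("days", "hours"):
        raise ValueError(f"unsupported unit: {unit!r}")
    origin = datetime.strptime(reference, "%Y-%m-%d %H:%M:%S")
    return [origin + timedelta(**{unit: int(v)}) for v in values]


# Illustrative offsets only; real values come from the NetCDF file.
time = decode_cf_time([0, 31], "days since 2019-01-01 00:00:00")
time_phys = decode_cf_time([0, 6, 12], "hours since 2019-01-01 00:00:00")

print(time)
print(time_phys)
```

In practice a NetCDF reader that decodes time coordinates automatically would do the same job; the sketch only makes the unit difference between the two axes explicit.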

The NetCDF variables and their meaning:

  • x: Current (posterior) control vector at the inversion resolution (scaling factor or physical value, depending on the type YAML key).

  • xb: Prior (background) control vector at the inversion resolution.

  • b_std: Prior uncertainty (standard deviation, diagonal of \(\mathbf{B}^{1/2}\)) at the inversion resolution.

  • x_phys: Posterior control vector reprojected to the native physical resolution of the input data. Equal to x when type: physical. For type: scalar, x_phys = x × input values.

  • xb_phys: Prior reprojected to native physical resolution.

  • b_phys: Prior uncertainty reprojected to native physical resolution.

The horizontal extent covers the full model domain. The temporal resolution of variables without the _phys suffix matches the control-vector resolution set by tresol / tsubresol in the YAML. Variables with the _phys suffix use whichever is finer: the control-vector resolution or the resolution of the native input data.
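The relation x_phys = x × input values for type: scalar can be illustrated for a single grid cell. This is only a sketch under stated assumptions: each physical time step is taken to belong to exactly one control period, given by a hypothetical `period_index` list, and the numbers are made up. How pyCIF performs this mapping internally is not described here.

```python
def scale_to_physical(x, inputs, period_index):
    """Apply per-period scaling factors to physical-resolution inputs."""
    if len(inputs) != len(period_index):
        raise ValueError("inputs and period_index must have the same length")
    return [x[p] * value for p, value in zip(period_index, inputs)]


# Two control periods (as with time = 2) and six physical time steps.
x = [1.5, 0.5]                             # scaling factors, one per period
inputs = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]  # made-up native input values
period_index = [0, 0, 0, 1, 1, 1]          # control period of each time step

x_phys = scale_to_physical(x, inputs, period_index)
print(x_phys)
```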

Dumping and loading control vectors#

Dump#

The control vector is dumped by the function:

pycif.plugins.controlvects.standard.dump.dump(self, cntrl_file, to_netcdf=False, dir_netcdf=None, ensemble=False, **kwargs)

Dumps a control vector into a pickle file. Does not save large correlations.

Parameters:
  • self (pycif.utils.classes.controlvects.ControlVect) – the Control Vector to dump

  • cntrl_file (str) – path to the file to dump as pickle

  • to_netcdf (bool) – save to netcdf files if True

  • dir_netcdf (str) – root path for the netcdf directory

Load#

The control vector is loaded by the function:

pycif.plugins.controlvects.standard.dump.load(self, cntrl_file, component2load=None, tracer2load=None, target_tracer=None, ensemble=False, **kwargs)