Observation vectors


In pyCIF, the observation vector \(\mathbf{y}\) is stored on disk as a directory tree, with one sub-directory per observation type and, below it, one per species. Each monitor.nc file follows the same format as the input monitor files.

obsvect/
├── {obs type 1}/
│   ├── {species 1}/
│   │   └── monitor.nc
│   └── {species 2}/
│       └── monitor.nc
└── {obs type 2}/
    ├── {species 1}/
    │   └── monitor.nc
    └── {species 2}/
        └── monitor.nc
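
The layout above can be traversed with the standard library alone. The following is a minimal sketch, not part of pyCIF: the helper name `list_monitor_files` and the observation types and species used in the demo (`concs`, `satellites`, `CH4`, `CO2`) are hypothetical placeholders.

```python
import tempfile
from pathlib import Path


def list_monitor_files(obsvect_dir):
    """Map (obs_type, species) to the monitor.nc path found under obsvect_dir."""
    root = Path(obsvect_dir)
    found = {}
    # Exactly two directory levels separate the root from each monitor.nc.
    for path in sorted(root.glob("*/*/monitor.nc")):
        obs_type, species = path.relative_to(root).parts[:2]
        found[(obs_type, species)] = path
    return found


# Demo on a throw-away tree that mimics the layout (empty placeholder files).
with tempfile.TemporaryDirectory() as tmp:
    for obs_type, species in [("concs", "CH4"), ("concs", "CO2"), ("satellites", "CH4")]:
        folder = Path(tmp, "obsvect", obs_type, species)
        folder.mkdir(parents=True)
        (folder / "monitor.nc").touch()
    files = list_monitor_files(Path(tmp, "obsvect"))
    keys = sorted(files)
```

Keying the result by `(obs_type, species)` keeps the two directory levels explicit, so a single component of the observation vector can be selected without parsing paths again.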

NetCDF format

Each monitor.nc uses NetCDF groups to separate the two types of information:

maindata:

Numerical data directly used by the inversion:

  • obs — observed values

  • obserror — observation errors (\(\sigma_\varepsilon\))

  • sim — simulated equivalents (forward run)

  • sim_tl — simulated increments (tangent-linear run)

  • adj_out — adjoint sensitivities

  • incr — observation-space increments
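
To illustrate how obs, sim and obserror relate, the sketch below computes an observation misfit of the form \(\tfrac{1}{2}\sum_k \left((\mathrm{sim}_k - \mathrm{obs}_k)/\sigma_{\varepsilon,k}\right)^2\). It assumes uncorrelated observation errors and skips entries that are NaN; whether pyCIF evaluates the misfit in exactly this way is not stated here. The function name `observation_cost` and the numbers are made up for the example, and plain lists stand in for arrays read from a file.

```python
import math


def observation_cost(obs, sim, obserror):
    """Return (cost, n_used) for 0.5 * sum(((sim - obs) / obserror) ** 2).

    Assumes uncorrelated errors; entries with a NaN in any input are skipped.
    """
    total = 0.0
    n_used = 0
    for y, hx, sigma in zip(obs, sim, obserror):
        if math.isnan(y) or math.isnan(hx) or math.isnan(sigma):
            continue
        total += ((hx - y) / sigma) ** 2
        n_used += 1
    return 0.5 * total, n_used


# Made-up values; the third observation is missing (NaN).
obs = [1900.0, 1910.0, float("nan")]
sim = [1895.0, 1920.0, 1905.0]
obserror = [5.0, 10.0, 5.0]
cost, n_used = observation_cost(obs, sim, obserror)
```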

metadata:

All other information about each observation point: date, station, network, parameter, lon, lat, alt, level, i, j, tstep, tstep_glo, dtstep, dtstep_glo. See Observations for the full description of each field.
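
One possible way to load the two groups is through xarray, whose `open_dataset` function accepts a `group` argument. This is only a sketch: it assumes xarray and a NetCDF4-capable backend are installed, and `read_monitor` is a hypothetical helper, not a pyCIF function.

```python
MONITOR_GROUPS = ("maindata", "metadata")


def read_monitor(path):
    """Open both groups of a monitor.nc file and return them keyed by group name."""
    # Imported lazily so the module can be loaded without xarray installed.
    import xarray as xr

    return {group: xr.open_dataset(path, group=group) for group in MONITOR_GROUPS}
```

A call such as `read_monitor("obsvect/{obs type 1}/{species 1}/monitor.nc")["maindata"]["obs"]` would then give access to the observed values, with the placeholders replaced by real directory names.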

Each monitor.nc is formatted as illustrated in the following example (real file from a CHIMERE run):

netcdf monitor {

// global attributes:
		:_NCProperties = "version=2,netcdf=4.8.0,hdf5=1.10.6" ;

group: maindata {
  dimensions:
  	index = 10912 ;
  variables:
  	int64 index(index) ;
  	double obs(index) ;
  		obs:_FillValue = NaN ;
  	double obserror(index) ;
  		obserror:_FillValue = NaN ;
  	double sim(index) ;
  		sim:_FillValue = NaN ;
  	double sim_tl(index) ;
  		sim_tl:_FillValue = NaN ;
  	double adj_out(index) ;
  		adj_out:_FillValue = NaN ;
  	double spec(index) ;
  		spec:_FillValue = NaN ;
  	double incr(index) ;
  		incr:_FillValue = NaN ;

  // group attributes:
  		:datei = "01-01-2008 00:00:00" ;
  		:datef = "01-01-2009 00:00:00" ;
  		:model\ name = "CHIMERE" ;
  		:model\ version = "std" ;
  		:domain\ nlon = "101" ;
  		:domain\ nlat = "85" ;
  		:history = "Created on 11-05-2022 20:42:13" ;
  } // group maindata

group: metadata {
  dimensions:
  	index = 10912 ;
  	station_id = 16 ;
  	network_id = 1 ;
  	parameter_id = 1 ;
  variables:
  	int64 index(index) ;
  	double alt(index) ;
  		alt:_FillValue = NaN ;
  	int64 date(index) ;
  		date:units = "minutes since 2008-04-10 12:00:00" ;
  		date:calendar = "proleptic_gregorian" ;
  	double dtstep(index) ;
  		dtstep:_FillValue = NaN ;
  	double duration(index) ;
  		duration:_FillValue = NaN ;
  	double i(index) ;
  		i:_FillValue = NaN ;
  	byte is_obsvect(index) ;
  		is_obsvect:dtype = "bool" ;
  	double j(index) ;
  		j:_FillValue = NaN ;
  	double lat(index) ;
  		lat:_FillValue = NaN ;
  	double level(index) ;
  		level:_FillValue = NaN ;
  	double lon(index) ;
  		lon:_FillValue = NaN ;
  	int64 network(index) ;
  	int64 parameter(index) ;
  	int64 station(index) ;
  	double tstep(index) ;
  		tstep:_FillValue = NaN ;
  	double tstep_glo(index) ;
  		tstep_glo:_FillValue = NaN ;
  	string list_stations(station_id) ;
  	string list_networks(network_id) ;
  	string list_parameters(parameter_id) ;

  // group attributes:
  		:datei = "01-01-2008 00:00:00" ;
  		:datef = "01-01-2009 00:00:00" ;
  		:model\ name = "CHIMERE" ;
  		:model\ version = "std" ;
  		:domain\ nlon = "101" ;
  		:domain\ nlat = "85" ;
  		:history = "Created on 11-05-2022 20:42:14" ;
  } // group metadata
}
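
In the dump above, date is stored as an integer offset with units "minutes since 2008-04-10 12:00:00", and station is an integer variable alongside the string list list_stations. The sketch below decodes such offsets with the standard library and looks up names by index. It assumes that station holds zero-based positions into list_stations, which the dump suggests but does not state; the helper names and station codes are hypothetical.

```python
from datetime import datetime, timedelta


def decode_minutes(offsets, units):
    """Convert integer offsets to datetimes using a 'minutes since ...' units string."""
    prefix = "minutes since "
    if not units.startswith(prefix):
        raise ValueError(f"unsupported units: {units!r}")
    reference = datetime.strptime(units[len(prefix):], "%Y-%m-%d %H:%M:%S")
    return [reference + timedelta(minutes=int(m)) for m in offsets]


def lookup_names(indices, names):
    """Translate integer codes into names (assumed zero-based positions)."""
    return [names[int(i)] for i in indices]


units = "minutes since 2008-04-10 12:00:00"
dates = decode_minutes([0, 60, 1500], units)
stations = lookup_names([0, 2, 2], ["STA", "STB", "STC"])  # hypothetical codes
```

Python's `datetime` follows the proleptic Gregorian calendar, which matches the calendar attribute declared on date.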