Temporal interpolation and re-indexing time_interpolation/std#
Description#
time_interpolation transform: re-index data from one time grid to another.
Interpolates or resamples gridded (xarray) and observation-indexed (sparse / sampled pandas DataFrame) data to match the temporal resolution required by the succeeding transform in the pipeline.
Two data shapes are handled:
- Array data (sparse_in = False, sparse_out = False): an xarray DataArray of shape (time, lev, lat, lon) is resampled with duration-based weights.
- Sparse / sampled data (sparse_out = True or sampled_out = True): a pandas DataFrame indexed by observation is resampled to the new date window. When recombine_periods = True, observations overlapping multiple sub-simulation periods are combined proportionally.
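To make the idea of duration-based weights concrete, here is a minimal sketch in plain Python. It is not the plugin's code: the function name `overlap_weights` and the example grids are hypothetical, and the sketch assumes that each output interval is a weighted average of the input intervals it overlaps, with weights proportional to the overlap duration.

```python
from datetime import datetime, timedelta


def overlap_weights(src_edges, dst_edges):
    """Return weights[i][j]: fraction of output interval i covered by input interval j.

    Hypothetical helper, for illustration only.
    """
    weights = []
    for d0, d1 in zip(dst_edges[:-1], dst_edges[1:]):
        row = []
        for s0, s1 in zip(src_edges[:-1], src_edges[1:]):
            # Overlap duration in seconds; negative means no overlap
            overlap = (min(d1, s1) - max(d0, s0)).total_seconds()
            row.append(max(overlap, 0.0) / (d1 - d0).total_seconds())
        weights.append(row)
    return weights


# Assumed grids: four 6-hourly input intervals, two output intervals (9 h and 15 h)
start = datetime(2020, 1, 1)
src_edges = [start + timedelta(hours=6 * k) for k in range(5)]
dst_edges = [start, start + timedelta(hours=9), start + timedelta(hours=24)]

weights = overlap_weights(src_edges, dst_edges)
values = [1.0, 2.0, 3.0, 4.0]  # one value per input interval

# Weighted average of the input values for each output interval
resampled = [sum(w * v for w, v in zip(row, values)) for row in weights]
```

Each row of `weights` sums to one when the output interval is fully covered by the input grid, so a constant input field stays constant after resampling.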
Temporal interpolation indexes are pre-computed in ini_mapper() via
calc_indexes() and cached in the mapper so that the
forward and adjoint passes do not recompute them.
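The benefit of caching can be illustrated with a small sketch. The class `CachedTimeMapper` below is a hypothetical stand-in, not the pyCIF mapper: it assumes the pre-computed indexes can be represented as a weight matrix, so that the forward pass applies the matrix and the adjoint pass applies its transpose, both reusing the same stored weights.

```python
class CachedTimeMapper:
    """Hypothetical container for weights computed once at initialization."""

    def __init__(self, weights):
        # weights[i][j]: contribution of input step j to output step i
        self.weights = weights

    def forward(self, x):
        # y = W x, reusing the cached weights
        return [sum(w * xj for w, xj in zip(row, x)) for row in self.weights]

    def adjoint(self, y):
        # x_adj = W^T y, reusing the same cached weights
        n_in = len(self.weights[0])
        return [
            sum(self.weights[i][j] * y[i] for i in range(len(y)))
            for j in range(n_in)
        ]


mapper = CachedTimeMapper([[0.5, 0.5, 0.0], [0.0, 0.25, 0.75]])
x = [1.0, 2.0, 4.0]
y = [3.0, -1.0]

# Dot-product test: <W x, y> must equal <x, W^T y> for a consistent adjoint
lhs = sum(a * b for a, b in zip(mapper.forward(x), y))
rhs = sum(a * b for a, b in zip(x, mapper.adjoint(y)))
```

The dot-product test at the end is a common way to check that an adjoint is consistent with its forward operator.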
Ensemble (batch sampling) runs are supported: multiple __sample#N
tracers are processed in parallel with nthreads threads (defaulting to
the number of available CPUs).
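As a rough illustration of processing ensemble members in parallel, the sketch below uses the standard-library `ThreadPoolExecutor`. The function names, the tracer prefix `CO2`, and the placeholder per-tracer computation are all assumptions made for the example; the source does not specify how the plugin dispatches its threads.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def interpolate_tracer(item):
    """Placeholder per-tracer work: midpoint of consecutive values."""
    name, series = item
    return name, [0.5 * (a + b) for a, b in zip(series[:-1], series[1:])]


def process_ensemble(tracers, nthreads=None):
    """Process every tracer in a thread pool (hypothetical helper)."""
    # Fall back to the number of available CPUs when nthreads is not given
    nthreads = nthreads or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        return dict(pool.map(interpolate_tracer, tracers.items()))


# Assumed tracer names following the __sample#N pattern
tracers = {
    f"CO2__sample#{n}": [float(n), float(n + 2), float(n + 4)]
    for n in range(3)
}
results = process_ensemble(tracers, nthreads=2)
```

Because each ensemble member is independent, the members can be mapped over a pool without any shared state.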
YAML arguments#
The following arguments are used to configure the plugin. pyCIF raises an exception at initialization if a mandatory argument is missing, or if any argument does not match the accepted values or type:
Mandatory arguments#
- method : “linear”, mandatory
Method by which the original data is temporally interpolated onto the output time-scale
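For readers unfamiliar with the "linear" method, here is a minimal sketch of linear interpolation between two time axes in plain Python. The function `interp_linear` is hypothetical, and holding the edge values constant outside the input range is an assumption of this sketch, not documented plugin behavior.

```python
from bisect import bisect_right
from datetime import datetime


def interp_linear(src_times, src_values, dst_times):
    """Linearly interpolate src_values (given at src_times) onto dst_times."""
    out = []
    for t in dst_times:
        # Index of the last source time <= t, clamped so that k + 1 is valid
        k = min(max(bisect_right(src_times, t) - 1, 0), len(src_times) - 2)
        t0, t1 = src_times[k], src_times[k + 1]
        # timedelta / timedelta yields a float fraction
        frac = (t - t0) / (t1 - t0)
        # Assumption: hold edge values constant outside the input range
        frac = min(max(frac, 0.0), 1.0)
        out.append(src_values[k] + frac * (src_values[k + 1] - src_values[k]))
    return out


src_times = [datetime(2020, 1, 1, 0), datetime(2020, 1, 1, 12), datetime(2020, 1, 2, 0)]
src_values = [0.0, 6.0, 0.0]
dst_times = [datetime(2020, 1, 1, 3), datetime(2020, 1, 1, 18)]

interpolated = interp_linear(src_times, src_values, dst_times)  # [1.5, 3.0]
```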
Optional arguments#
- parameter : str, optional
Name of the parameter on which the transform works
- component : str, optional
Name of the component on which the transform works
- orig_parameter_plg : Plugin, optional
Plugin object on which the transform works
- orig_component_plg : Plugin, optional
Corresponding component object on which the transform works
- successor : str, optional
Name of the successor transform
- precursor : str, optional
Name of the precursor transform
- recombine_periods : str, optional, default True
Recombine inputs from different sub-periods. If False, data overlapping several periods is taken from the period that has the largest overlap with the outputs
- sparse_in : bool, optional, default False
Set to True when the input data is a pandas DataFrame (observation-indexed sparse format) rather than a gridded xarray DataArray.
- sparse_out : bool, optional, default False
Set to True when the output should be a pandas DataFrame (observation-indexed sparse format).
- sampled_in : bool, optional, default False
Set to True when the input is already sampled at observation locations (i.e. the sampled flag is set in the preceding transform’s mapper).
- sampled_out : bool, optional, default False
Set to True when the output should be delivered as observation-sampled data.
- nthreads : int, optional, default 1
Number of parallel threads for ensemble (batch sampling) processing. Defaults to the number of available CPUs.
- debug_crop : int, optional, default 10000
Maximum number of dates to print in debug log messages. Raise to see the full date list; lower to keep logs readable.
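The two behaviors selected by recombine_periods can be sketched as follows. This is an illustration under stated assumptions, not the plugin's implementation: the function `combine`, the representation of periods as (start, end, value) tuples in hours, and the boolean `recombine` flag are all hypothetical.

```python
def combine(obs_start, obs_end, periods, recombine=True):
    """Value seen by an observation window spanning several sub-periods.

    periods: list of (start, end, value) tuples, with times in hours.
    """
    # Overlap duration between the observation window and each period
    overlaps = [
        max(min(obs_end, end) - max(obs_start, start), 0.0)
        for start, end, _ in periods
    ]
    total = sum(overlaps)
    if total == 0.0:
        raise ValueError("observation window overlaps no period")
    if recombine:
        # Proportional combination: weight each period by its overlap
        return sum(o * v for o, (_, _, v) in zip(overlaps, periods)) / total
    # Otherwise keep only the period with the largest overlap
    best = max(range(len(periods)), key=lambda k: overlaps[k])
    return periods[best][2]


periods = [(0.0, 24.0, 10.0), (24.0, 48.0, 20.0)]

# Window 20 h to 30 h overlaps the first period by 4 h and the second by 6 h
combined = combine(20.0, 30.0, periods, recombine=True)   # (4*10 + 6*20) / 10
dominant = combine(20.0, 30.0, periods, recombine=False)  # second period only
```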
YAML template#
Please find below a template for a YAML configuration:
transform:
  plugin:
    name: time_interpolation
    version: std
    type: transform

  # Mandatory arguments
  method: XXXXX  # linear

  # Optional arguments
  parameter: XXXXX  # str
  component: XXXXX  # str
  orig_parameter_plg: XXXXX  # Plugin
  orig_component_plg: XXXXX  # Plugin
  successor: XXXXX  # str
  precursor: XXXXX  # str
  recombine_periods: XXXXX  # str
  sparse_in: XXXXX  # bool
  sparse_out: XXXXX  # bool
  sampled_in: XXXXX  # bool
  sampled_out: XXXXX  # bool
  nthreads: XXXXX  # int
  debug_crop: XXXXX  # int
See also