Random generation of observations random/param#

Description#

Synthetic measurement plugin — random observations for OSSEs.

Generates a fully synthetic observation data store by placing nstations virtual stations at random positions within the model domain and producing observations at a regular frequency over the inversion window.

Observation values are drawn from a uniform distribution on [obs_min, obs_max]; observation errors are a uniform fraction (0–10 %) of the value range.
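The draw described above can be sketched in a few lines of Python. This is an illustration only, not pyCIF's implementation: the function name `draw_observations` is hypothetical, the standard-library `random` module stands in for NumPy so that the sketch has no dependencies, and the error is assumed to be a fraction in [0, 0.1] of the range obs_max - obs_min.

```python
import random


def draw_observations(n, obs_min=0.0, obs_max=1.0, seed_id=None):
    """Draw n (value, error) pairs. Hypothetical helper, not part of pyCIF."""
    rng = random.Random(seed_id)  # seed_id=None leaves the generator unseeded
    value_range = obs_max - obs_min
    pairs = []
    for _ in range(n):
        # Observation value: uniform on [obs_min, obs_max]
        value = rng.uniform(obs_min, obs_max)
        # Assumed reading of the text: error = fraction * value range,
        # with the fraction uniform on [0, 0.1]
        error = rng.uniform(0.0, 0.1) * value_range
        pairs.append((value, error))
    return pairs


obs = draw_observations(5, obs_min=380.0, obs_max=420.0, seed_id=0)
```

With obs_min = 380 and obs_max = 420, every value falls in [380, 420] and every error is at most 4, that is 10 % of the 40-unit range.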

Key design choices#

  • Reproducibility: setting seed to true and choosing a seed_id gives a repeatable pseudo-random draw across runs.

  • Sub-period jitter: setting random_subperiod_shift to true adds a uniform random offset within each frequency bin, so observations are not perfectly aligned on the clock.

  • Named stations: by default, stations are labelled 0 to nstations-1. Provide station_names to assign human-readable IDs; this is required when using a Lagrangian model that matches footprints by station name.
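As a rough illustration of the first two choices, the sketch below builds a regular time grid and optionally shifts each observation by a random offset inside its bin, with a fixed seed making the result repeatable. The function `observation_times` and its arguments are hypothetical; the frequency is passed as a `timedelta` rather than a string such as "1h", and the standard-library generator replaces np.random.seed, so this is not how pyCIF itself is written.

```python
import random
from datetime import datetime, timedelta


def observation_times(start, end, frequency, jitter=False, seed_id=None):
    """Regular observation times on [start, end), optionally jittered.

    Hypothetical helper for illustration; not part of pyCIF.
    """
    rng = random.Random(seed_id)  # a fixed seed_id gives a repeatable draw
    times = []
    bin_start = start
    while bin_start < end:
        # Uniform offset within the current frequency bin when jitter is on
        offset = rng.random() * frequency if jitter else timedelta(0)
        times.append(bin_start + offset)
        bin_start += frequency
    return times


start = datetime(2020, 1, 1)
end = start + timedelta(hours=3)
hourly = timedelta(hours=1)

regular = observation_times(start, end, hourly)
jittered = observation_times(start, end, hourly, jitter=True, seed_id=42)
```

Here `regular` holds three times exactly on the hour, while each entry of `jittered` lies somewhere within the hour that starts at the corresponding regular time. Calling the function again with the same seed_id returns the same jittered times.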

This plugin is primarily intended for Observing System Simulation Experiments (OSSEs) and for testing/debugging the inversion pipeline.

YAML arguments#

The following arguments configure the plugin. pyCIF raises an exception at initialization if a mandatory argument is missing, or if any argument has an unaccepted value or type.

Mandatory arguments#

nstations : int, mandatory

Number of stations to generate

Optional arguments#

frequency : str, optional, default “1h”

Frequency of generated observations

date_shift : str, optional, default “0h”

Time shift applied to the observation times defined by frequency

duration : str, optional, default “1h”

Duration of generated observations

zmax : float, optional, default 100

Stations are placed at random heights between 0 m and zmax m above ground level (a.g.l.)

obs_min : float, optional, default 0

Lower range of the observations for uniform distribution

obs_max : float, optional, default 1

Upper range of the observations for uniform distribution

random_subperiod_shift : bool, optional, default False

Randomly shift observations within their frequency interval. For instance, if the frequency is hourly and random_subperiod_shift is True, hourly observations are generated, each with a random shift of 0 to 60 minutes.

seed : bool, optional, default False

Use a fixed seed to generate observations

seed_id : int, optional, default 0

Seed used to generate observations. It is passed to np.random.seed

station_names : list, optional

List of station names to be used. It should contain at least nstations names
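The relation between nstations and station_names can be sketched as follows. The helper `station_labels` is hypothetical, and two details are assumptions rather than documented behaviour: default labels are taken to be the integers 0 to nstations-1, and when more names than stations are supplied, the first nstations names are used.

```python
def station_labels(nstations, station_names=None):
    """Return one label per station. Hypothetical helper, not part of pyCIF."""
    if station_names is None:
        # Default labelling: 0 to nstations-1 (integer type is an assumption)
        return list(range(nstations))
    if len(station_names) < nstations:
        raise ValueError(
            f"station_names has {len(station_names)} entries, "
            f"but nstations is {nstations}"
        )
    # Assumption: extra names beyond nstations are ignored
    return list(station_names[:nstations])


default_labels = station_labels(3)
named_labels = station_labels(2, ["MHD", "JFJ", "CBW"])
```

In this sketch, a list shorter than nstations is rejected up front, which mirrors the argument checks described at the top of this section.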

Requirements#

This plugin requires the following plugins to run properly:

| Requirement name | Requirement type | Explicit definition | Any valid | Default name | Default version |
|---|---|---|---|---|---|
| domain | Domain | False | True | None | None |

YAML template#

Please find below a template for a YAML configuration:

measurements:
  plugin:
    name: random
    version: param
    type: measurements

  # Mandatory arguments
  nstations: XXXXX  # int

  # Optional arguments
  frequency: XXXXX  # str
  date_shift: XXXXX  # str
  duration: XXXXX  # str
  zmax: XXXXX  # float
  obs_min: XXXXX  # float
  obs_max: XXXXX  # float
  random_subperiod_shift: XXXXX  # bool
  seed: XXXXX  # bool
  seed_id: XXXXX  # int
  station_names: XXXXX  # list