Files for checks and debugging

CIF generates several diagnostic files automatically in $WORKDIR to help users verify that the pipeline is configured and executing as expected. All files described below are written before or during the first forward run and are refreshed on every subsequent call.

Obs-operator diagnostic files

The files below are written to $WORKDIR/obsoperator/ at pipeline initialisation time (before any model run).

1. pipe_inputs.txt

Written by the data-vector consistency check. Contains a YAML dump of self.required_inputs — the full map of which component/parameter pairs each transform in the pipeline requires as inputs, derived from the mapper of every active transform.

How to use it: Read this file to verify that all the data streams you expect to be consumed by the pipeline are listed. If a component/parameter pair appears in the file but is not present in your datavect YAML block, CIF will raise a missing inputs exception and write a detailed report to missing_inputs.txt (and missing_inputs_init.txt for initialisation-only inputs) in the same directory.

Example excerpt:

flux:
  CH4:
    - run_model
    - fromcontrol
concs:
  CH4:
    - toobsvect
meteo:
  '':
    - run_model
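
The comparison described above can also be done by hand before launching a run. The following is a minimal sketch, not CIF's own check: `find_missing_inputs` and the simplified `declared` mapping (component name to list of declared parameters) are hypothetical names introduced for illustration, and the sketch assumes that an empty parameter name denotes a component-level input. The `required` dictionary mirrors the excerpt above, as a YAML loader such as PyYAML's `yaml.safe_load` would return it.

```python
def find_missing_inputs(required_inputs, datavect_components):
    """Return (component, parameter) pairs that are required but not declared.

    required_inputs: dict parsed from pipe_inputs.txt,
        e.g. {"flux": {"CH4": ["run_model", "fromcontrol"]}}
    datavect_components: hypothetical simplified view of the datavect block,
        mapping each component name to the list of parameters it declares.
    """
    missing = []
    for component, parameters in required_inputs.items():
        declared = datavect_components.get(component)
        for parameter in parameters:
            if declared is None:
                # The whole component is absent from the configuration.
                missing.append((component, parameter))
            elif parameter != "" and parameter not in declared:
                # Assumption: '' means "component-level input, no parameter".
                missing.append((component, parameter))
    return missing


# Content of the excerpt above, written out as a Python dict.
required = {
    "flux": {"CH4": ["run_model", "fromcontrol"]},
    "concs": {"CH4": ["toobsvect"]},
    "meteo": {"": ["run_model"]},
}

# Hypothetical configuration in which the meteo component was forgotten.
declared = {"flux": ["CH4"], "concs": ["CH4"]}

missing = find_missing_inputs(required, declared)
for component, parameter in missing:
    print(f"missing input: ({component}, {parameter})")
```

With this hypothetical configuration, only the meteo component is reported, which is the kind of entry you would then expect to find in missing_inputs.txt.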

2. transform_description.txt

Written by dump_transform_description(). For every active transform in the pipeline, lists:

  • the transform name, plugin name and version;

  • its inputs — the component/parameter pairs it reads;

  • its outputs — the component/parameter pairs it writes;

  • its precursors — transforms whose outputs feed into it;

  • its successors — transforms that consume its outputs.

How to use it: Inspect this file to understand the data-flow topology of your pipeline. It is the human-readable counterpart to the internal mapper dictionary and is the first place to look when a transform receives unexpected or missing data.

Example excerpt:

######################################
run_model (dummy / std):
--------
Inputs:
    - (flux, CH4)
    - (meteo, )
--------
Outputs:
    - (concs, CH4)
--------
Precursors:
    - (flux, CH4):
        - fromcontrol
--------
Successors:
    - (concs, CH4):
        - loadfromoutputs
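
For larger pipelines it can be easier to query this file programmatically than to read it. The sketch below parses the layout shown in the excerpt into a nested dictionary. It assumes that the file follows exactly that layout (a `name (plugin / version):` header followed by the four sections), and `parse_transform_description` is a hypothetical helper, not part of CIF.

```python
import re

# Assumed layout, taken from the excerpt above; adjust the patterns if your
# CIF version writes the file differently.
HEADER = re.compile(r"^(\S+) \((.+?) / (.+?)\):$")
PAIR = re.compile(r"^- \((.*?), ?(.*?)\):?$")
LIST_SECTIONS = ("Inputs", "Outputs")
MAP_SECTIONS = ("Precursors", "Successors")


def parse_transform_description(text):
    """Parse the text of transform_description.txt into a nested dict."""
    transforms = {}
    current = section = pair = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the '####' / '----' separator rows.
        if not line or set(line) <= {"#"} or set(line) <= {"-"}:
            continue
        header = HEADER.match(line)
        if header and not raw[0].isspace():
            name, plugin, version = (g.strip() for g in header.groups())
            current = transforms[name] = {
                "plugin": plugin, "version": version,
                "Inputs": [], "Outputs": [],
                "Precursors": {}, "Successors": {},
            }
            section = pair = None
        elif line.rstrip(":") in LIST_SECTIONS + MAP_SECTIONS:
            section, pair = line.rstrip(":"), None
        elif current is not None and section is not None:
            match = PAIR.match(line)
            if match:
                # A (component, parameter) pair; the parameter may be empty.
                pair = (match.group(1).strip(), match.group(2).strip())
                if section in LIST_SECTIONS:
                    current[section].append(pair)
                else:
                    current[section][pair] = []
            elif line.startswith("- ") and pair is not None and section in MAP_SECTIONS:
                # A transform name listed under the current pair.
                current[section][pair].append(line[2:].strip())
    return transforms


SAMPLE = """\
######################################
run_model (dummy / std):
--------
Inputs:
    - (flux, CH4)
    - (meteo, )
--------
Outputs:
    - (concs, CH4)
--------
Precursors:
    - (flux, CH4):
        - fromcontrol
--------
Successors:
    - (concs, CH4):
        - loadfromoutputs
"""

description = parse_transform_description(SAMPLE)
for name, info in description.items():
    # Inputs with no precursor must come from the data vector or from disk.
    orphans = [p for p in info["Inputs"] if p not in info["Precursors"]]
    print(name, "inputs without precursor:", orphans)
```

Listing the inputs that have no precursor is one quick way to spot a transform that will receive missing data.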

3. transform_pipe_forward.txt and transform_pipe_adjoint.txt

Written by dump_transform_description() alongside transform_description.txt. Each file lists the execution order for one pass (forward or adjoint), one transform per line, in the format:

transform_name   (plugin_name / version) date / direction / [input tracers]

How to use them: Read these files to see the exact sequence of operations that CIF will execute, and for which sub-simulation dates. They are useful for diagnosing ordering issues (e.g., an adjoint onlyinit step running out of sequence) and for understanding why a particular transform is or is not executed.

Example excerpt (forward pass):

fromcontrol     (fromcontrol  / std    ) 2010-01-01 / forward  / ['flux']
run_model       (dummy        / std    ) 2010-01-01 / forward  / ['flux', 'meteo']
loadfromoutputs (loadfromoutputs / std ) 2010-01-01 / forward  / ['concs']
toobsvect       (toobsvect    / std    ) 2010-01-01 / forward  / ['concs']
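
To check ordering automatically, the lines can be split into their fields. The sketch below is a hedged example that assumes the exact line format shown above; `Step` and `parse_pipe` are hypothetical names, and the date is kept as a string because its precise format may differ between CIF versions.

```python
import ast
import re
from collections import namedtuple

Step = namedtuple("Step", "name plugin version date direction inputs")

# Assumed line format: name (plugin / version) date / direction / [inputs]
LINE = re.compile(
    r"^(\S+)\s+\(\s*(\S+)\s*/\s*(\S+)\s*\)\s+"
    r"(.+?)\s*/\s*(\w+)\s*/\s*(\[.*\])\s*$"
)


def parse_pipe(text):
    """Return the execution steps listed in a transform_pipe_*.txt file."""
    steps = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match is None:
            continue  # blank or unrecognised line
        name, plugin, version, date, direction, inputs = match.groups()
        # The input list is written as a Python literal, e.g. ['flux', 'meteo'].
        steps.append(
            Step(name, plugin, version, date, direction, ast.literal_eval(inputs))
        )
    return steps


SAMPLE = """\
fromcontrol     (fromcontrol  / std    ) 2010-01-01 / forward  / ['flux']
run_model       (dummy        / std    ) 2010-01-01 / forward  / ['flux', 'meteo']
loadfromoutputs (loadfromoutputs / std ) 2010-01-01 / forward  / ['concs']
toobsvect       (toobsvect    / std    ) 2010-01-01 / forward  / ['concs']
"""

steps = parse_pipe(SAMPLE)
order = [step.name for step in steps]

# Example ordering check: the control vector must be unpacked before the model runs.
control_before_model = order.index("fromcontrol") < order.index("run_model")
print(order, control_before_model)
```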

Data-vector debug files

When the dump_debug option is enabled in the datavect YAML block, CIF writes one text file per component/parameter pair to $WORKDIR/datavect/, listing every input file that was located for that data stream and its associated date range.

YAML configuration:

datavect:
  plugin:
    name: standard
    version: std
  dump_debug: true
  components:
    ...

Output files: $WORKDIR/datavect/{component}.{parameter}.txt

Each file lists, for every sub-simulation date, the input date windows and the corresponding file paths that CIF resolved for that tracer:

20100101 00:00
    2010-01-01 -> 2010-01-02: /data/fluxes/CH4_2010_01.nc
    2010-01-02 -> 2010-01-03: /data/fluxes/CH4_2010_01.nc
20100103 00:00
    2010-01-03 -> 2010-01-04: /data/fluxes/CH4_2010_01.nc

How to use it: This is the primary tool for verifying that CIF is reading the files you expect for each tracer, at the correct dates. Enable it when:

  • you suspect a wrong file is being used for a given period;

  • a tracer is reading from an unexpected path;

  • you want to confirm that temporal interpolation is anchored to the right input dates.
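
When a debug file is long, the same checks can be scripted. The following sketch assumes the layout shown in the example above (an unindented sub-simulation date followed by indented `start -> end: path` lines); `parse_datavect_debug` is a hypothetical helper, not a CIF function.

```python
import re

# Assumed window line: "<start> -> <end>: <path>", indented under a date line.
WINDOW = re.compile(r"^\s+(.+?)\s*->\s*(.+?):\s+(\S.*)$")


def parse_datavect_debug(text):
    """Map each sub-simulation date to its list of (start, end, path) windows."""
    resolved = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        match = WINDOW.match(line)
        if match and current is not None:
            start, end, path = match.groups()
            resolved[current].append((start, end, path.strip()))
        elif not line[0].isspace():
            # An unindented line starts a new sub-simulation date.
            current = line.strip()
            resolved[current] = []
    return resolved


SAMPLE = """\
20100101 00:00
    2010-01-01 -> 2010-01-02: /data/fluxes/CH4_2010_01.nc
    2010-01-02 -> 2010-01-03: /data/fluxes/CH4_2010_01.nc
20100103 00:00
    2010-01-03 -> 2010-01-04: /data/fluxes/CH4_2010_01.nc
"""

resolved = parse_datavect_debug(SAMPLE)

# Collect every distinct file the tracer reads, to spot unexpected paths.
paths = sorted({path for windows in resolved.values() for _, _, path in windows})
print(paths)
```

The list of distinct paths gives a compact view of which files a tracer actually reads over the whole simulation period.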

Note

dump_debug has no effect on the simulation results — it only adds the text-file logging. It can be safely enabled on production runs for a one-time audit, then disabled.