Working directory structure
############################

Every pyCIF run writes its outputs to a dedicated :bash:`$WORKDIR` directory
whose path is set by the :bash:`workdir` key in the YAML configuration file.
The directory is created automatically if it does not exist.

The tree below describes a complete forward-run working directory for the toy
Gaussian model. Other configurations (inversions, multiple modes, ensemble
methods) add further sub-directories but follow the same conventions.

.. code-block:: text

   $WORKDIR/
   │
   ├── config.yml                      ← copy of the YAML configuration used
   ├── pycif.log                       ← full run log (name set by logfile:)
   ├── VERSION                         ← git branch and commit hash for reproducibility
   │
   ├── datavect/                       ← data-vector inputs, resolved at startup
   │   ├── {component}/
   │   │   └── {parameter}/            ← raw input files as read by the datastream plugin
   │   └── {component}.{parameter}.txt ← file/date list (dump_debug: true only)
   │
   ├── controlvect.pickle              ← prior / posterior control vector (forward / inversion)
   ├── controlvect/                    ← control vector in human-readable netCDF format
   │   └── {component}/
   │       └── controlvect_{component}_{parameter}.nc  ← netCDF-format snapshot
   │
   ├── model/                          ← model-specific outputs and cached data
   │   └── H_matrix.pickle             ← observation operator matrix (dummy model)
   │
   └── obsoperator/
       │
       ├── pipe_inputs.txt             ← data requirements of every transform (YAML)
       ├── transform_description.txt   ← inputs/outputs/precursors/successors
       ├── transform_pipe_forward.txt  ← forward execution order
       ├── transform_pipe_adjoint.txt  ← adjoint execution order
       │
       ├── fwd_0000/                   ← forward run #0 (index incremented for each call)
       │   ├── controlvect.pickle      ← control vector used for this run
       │   ├── controlvect/            ← netCDF snapshot of the control vector
       │   ├── finished_transforms.txt ← list of completed transforms (restart support)
       │   ├── obsvect/                ← simulated observation vector
       │   │   └── {component}/{parameter}/monitor.nc
       │   └── {YYYY-MM-DD_HH-MM}/     ← one sub-directory per sub-simulation period
       │       ├── {model inputs and outputs}
       │       └── chain/              ← files chained to the next sub-period (e.g. end-concentrations)
       │
       └── adj_0000/                   ← adjoint run #0 (inversion and adj-TL test only)
           └── ...                     ← same structure as fwd_0000/

Key files and conventions
==========================

``pycif.log``
-------------

The main run log. Its name is set by the :bash:`logfile` key in the YAML.
Verbosity is controlled by the :bash:`verbose` key (0 = errors only, 1 = info,
2 = debug).

``VERSION``
-----------

Records the git branch and commit hash at the time of the run. Used for
reproducibility: re-running from the same :bash:`VERSION` and
:bash:`config.yml` should give bit-identical results.

``datavect/``
-------------

Contains the input data as read and interpolated by each datastream plugin.
The exact sub-structure depends on the plugins used. When
:bash:`dump_debug: true` is set in the datavect YAML block, a
:bash:`{component}.{parameter}.txt` file is added for each tracer, listing the
resolved file paths and date ranges; see :doc:`check` for details.

``obsoperator/fwd_NNNN/``
-------------------------

One directory per call to the forward observation operator, numbered from
``0000``. Numbering ensures that multiple calls (e.g. in an inversion loop)
never overwrite each other. The ``run_id`` parameter passed to :meth:`obsoper`
controls the index.

``obsoperator/fwd_0000/{YYYY-MM-DD_HH-MM}/``
--------------------------------------------

One sub-directory per sub-simulation period (model ``subsimu_dates``).
Contains all the model input files, output files, and intermediate products
for that period. The :bash:`chain/` sub-directory holds fields that must be
passed forward in time (e.g. end-concentration fields used as initial
conditions for the next period).

``finished_transforms.txt``
---------------------------

Written inside each ``fwd_NNNN/`` directory. Contains a semicolon-separated
list of transforms that completed successfully. Used by the autorestart
mechanism: when :bash:`autorestart: true` is set in the obsoperator YAML
block, any transform already listed here is skipped on a restart, allowing
interrupted runs to resume from the point of failure.

Inversion-specific outputs
===========================

For variational inversions (:bash:`mode: 4dvar`), the following additional
directories appear under :bash:`$WORKDIR`:

.. code-block:: text

   $WORKDIR/
   ├── simulator/
   │   ├── cost.txt                  ← cost function value at each iteration (CSV format)
   │   ├── cost.csv                  ← same, machine-readable
   │   ├── gradcost.txt              ← gradient norm at each iteration
   │   └── gradcost.csv
   └── controlvect/
       ├── controlvect_final.pickle  ← optimised posterior control vector
       └── controlvect/              ← netCDF posterior control vector

For ensemble methods (:bash:`mode: EnSRF`), an :bash:`ensemble/` sub-directory
is created under :bash:`$WORKDIR` containing one sample directory per ensemble
member.

.. seealso::

   * :doc:`check`: diagnostic files generated during pipeline initialisation
     and execution.
   * :doc:`controlvect`: format of the control-vector netCDF files.
   * :doc:`obsvect`: format of the observation-vector :bash:`monitor.nc` files.
   * :doc:`monitor`: structure and columns of monitor files.
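
The layout described on this page can be inspected with the Python standard
library alone. The following is a minimal sketch, not part of pyCIF: the
helper names ``list_forward_runs``, ``read_finished_transforms`` and
``summarise_forward_runs`` are hypothetical, and the parsing assumes only what
is stated above, namely that :bash:`finished_transforms.txt` is a plain-text,
semicolon-separated list stored in each ``obsoperator/fwd_NNNN/`` directory.

.. code-block:: python

   from pathlib import Path


   def list_forward_runs(workdir):
       """Return the fwd_* directories under obsoperator/, sorted by name."""
       obsoper_dir = Path(workdir) / "obsoperator"
       runs = [p for p in obsoper_dir.glob("fwd_*") if p.is_dir()]
       # Zero-padded indices (fwd_0000, fwd_0001, ...) sort correctly by name
       return sorted(runs, key=lambda p: p.name)


   def read_finished_transforms(run_dir):
       """Return the transform names listed in finished_transforms.txt.

       Assumption: names are separated by semicolons. An empty list is
       returned when the file does not exist.
       """
       path = Path(run_dir) / "finished_transforms.txt"
       if not path.is_file():
           return []
       # Treat newlines as separators too, in case the list spans several lines
       text = path.read_text().replace("\n", ";")
       return [name.strip() for name in text.split(";") if name.strip()]


   def summarise_forward_runs(workdir):
       """Map each forward-run directory name to its finished transforms."""
       return {
           run.name: read_finished_transforms(run)
           for run in list_forward_runs(workdir)
       }

Calling ``summarise_forward_runs`` with the path of a working directory
returns a dictionary keyed by ``fwd_NNNN`` directory name, which gives a quick
view of how far an interrupted run progressed before a restart.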