Automatic resubmission of jobs

This option is mainly intended for CCRT users. Because the maximum walltime of batch jobs is limited to 3 days, it may be necessary to relaunch the inversion. To achieve this, two optional arguments must be set in the obsoper, as described below. The inversion is automatically stopped after a given maximum duration, autokill_time, and a new job is submitted. It is important to leave a sufficient margin between autokill_time and the walltime of the cluster, because the kill/resubmit process only occurs at the end of the loop of transforms: there must be enough time left for all transforms to finish before the cluster kills the job. To set up the resubmission:

  1. Modify the YAML to allow for resubmission:

    • in obsoper, set option autokill_time to the duration after which the job is resubmitted.

    • in obsoper as well, set option max_resubmissions (as in the example below).

obsoperator:
  plugin:
    name: standard
    version: std
    type: obsoperator
  autokill_time: 71H
  max_resubmissions: 5

  2. Depending on the user’s configuration of the CIF at CCRT, it may be necessary to add a Python userbase path with python_userbase and a virtual environment path with python_venv.

  3. A new job will be resubmitted when autokill_time is reached, making it possible to run inversions for longer than 3 days.
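
The margin between autokill_time and the cluster walltime can be checked with a few lines of arithmetic before submitting. The sketch below is only an illustration and is not part of the CIF: the helper names parse_hours and resubmission_budget are hypothetical, the parser only handles durations written like 71H, and it assumes that max_resubmissions counts the jobs submitted in addition to the first one.

```python
import re
from datetime import timedelta


def parse_hours(value: str) -> timedelta:
    """Parse a duration written like '71H' (hypothetical helper, hours only)."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*[Hh]\s*", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return timedelta(hours=float(match.group(1)))


def resubmission_budget(autokill_time: str, walltime: str, max_resubmissions: int):
    """Return (margin before the walltime, upper bound on total run time)."""
    autokill = parse_hours(autokill_time)
    limit = parse_hours(walltime)
    margin = limit - autokill
    if margin <= timedelta(0):
        raise ValueError("autokill_time must be shorter than the walltime")
    # Assumption: one initial job plus max_resubmissions further jobs,
    # each stopped at most autokill_time after it starts.
    total = autokill * (max_resubmissions + 1)
    return margin, total


# Values from the YAML example above, with a 3-day (72 h) walltime.
margin, total = resubmission_budget("71H", "72H", 5)
print(margin)  # 1:00:00
print(total)   # 17 days, 18:00:00
```

With these values, the transforms still running when autokill_time is reached have one hour to finish before the job hits the walltime. Whether that is enough depends on how long the longest loop of transforms takes in a given setup.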