Centre de Calcul Recherche et Technologie with NVIDIA environment TGCC-CCRT/nvidia

Description#

This plugin handles the environment specifics of the cluster at the Très Grand Centre de calcul (TGCC, France), in particular the NVIDIA GPU partitions of the Centre de Calcul Recherche et Technologie (CCRT).

YAML arguments#

The following arguments are used to configure the plugin. pyCIF raises an exception at initialization if a mandatory argument is missing, or if any argument does not match the accepted values or type:

Optional arguments#

python : str, optional, default “python -m mpi4py -rc initialize=False”

the python command used to run sub-instances of pyCIF

python_venv : str, optional

path to the python virtual environment to use

python_module : str, optional

the python module to load, by default python3/3.10.6. This argument cannot be used together with ‘job_env’.

job_env : “list of str”, optional

List of commands to execute at the start of the job script to set up the environment. Using this argument overrides the default module loading. This argument cannot be used together with ‘python_module’. The list of commands should not include any python virtual environment activation command; use the ‘python_venv’ argument for that purpose.

gpu : bool, optional, default False

if True, changes the command used to run parallel programs so that one GPU is allocated

partition : “rome” or “skylake” or “a64fx” or “v100” or “v100l” or “v100l-os” or “hybrid” or “xlarge” or “v100xl”, optional

partition on which to submit the job, used as the -q option for the ccc_msub command

project : str, optional

project on which to submit the job, used as the -A option for the ccc_msub command

filesystem : “Any subset of [‘scratch’, ‘work’, ‘store’], separated by commas or ‘all’”, optional, default “all”

the file system(s) required by the job, used as the -m option for the ccc_msub command

qos : “long” or “normal” or “test”, optional, default “normal”

Quality of Service (QoS) used to submit the job, used as the -Q option for the ccc_msub command

walltime : int, optional, default 7200

maximum walltime of the submitted job, used as the -T option for the ccc_msub command

nodes : int, optional, default 1

number of nodes used by the jobs launched by the CIF, used as the -n option for the ccc_msub command

cores : int, optional, default 1

number of cores used by the jobs launched by the CIF, used as the -c option for the ccc_msub command

submit_msub : bool, optional, default True

Submit the job with ccc_msub. If False, simply run it within the same instance.
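As an illustration of the environment arguments described above, the fragment below sketches a custom environment defined through job_env, combined with python_venv for the virtual environment. The commands listed under job_env and the virtual environment path are hypothetical placeholders, not values prescribed for the cluster. python_module is left out because it cannot be combined with job_env.

```yaml
platform:
  plugin:
    name: TGCC-CCRT
    version: nvidia
    type: platform

  # Custom environment: replaces the default module loading.
  # The commands below are illustrative assumptions; adapt them to your setup.
  job_env:
    - module purge
    - module load python3/3.10.6

  # The virtual environment is set here, not activated in job_env.
  # Hypothetical path.
  python_venv: /path/to/my/venv
```

If the default module (python3/3.10.6) is sufficient, job_env can be dropped entirely, and python_module can be used instead to select another module.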

Requirements#

This plugin requires the following plugins to run properly:

| Requirement name | Requirement type | Explicit definition | Any valid | Default name | Default version |
|---|---|---|---|---|---|
| model | Model | True | True | None | None |

YAML template#

Please find below a template for a YAML configuration:

platform:
  plugin:
    name: TGCC-CCRT
    version: nvidia
    type: platform

  # Optional arguments
  python: XXXXX  # str
  python_venv: XXXXX  # str
  python_module: XXXXX  # str
  job_env: XXXXX  # list of str
  gpu: XXXXX  # bool
  partition: XXXXX  # rome|skylake|a64fx|v100|v100l|v100l-os|hybrid|xlarge|v100xl
  project: XXXXX  # str
  filesystem: XXXXX  # Any subset of ['scratch', 'work', 'store'], separated by commas or 'all'
  qos: XXXXX  # long|normal|test
  walltime: XXXXX  # int
  nodes: XXXXX  # int
  cores: XXXXX  # int
  submit_msub: XXXXX  # bool
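For comparison with the template, here is a minimal sketch of a filled-in configuration for a single-GPU job. The project name and the resource values are assumptions chosen for illustration only. The comments recall which submission option each argument is documented to feed; the exact command line assembled by the plugin is not shown here.

```yaml
platform:
  plugin:
    name: TGCC-CCRT
    version: nvidia
    type: platform

  gpu: True                  # run parallel programs with one GPU allocated
  partition: v100            # -q option
  project: myproject         # -A option (hypothetical project name)
  filesystem: scratch,work   # -m option
  qos: normal                # -Q option
  walltime: 3600             # -T option
  nodes: 1                   # -n option
  cores: 8                   # -c option
  submit_msub: True          # submit as a job rather than run in place
```

Arguments that are omitted (here python, python_venv, python_module and job_env) fall back to their defaults where one is documented.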