Centre de Calcul Recherche et Technologie with NVIDIA environment TGCC-CCRT/nvidia
Description
This plugin handles the specific environment of the cluster at the Très Grand Centre de calcul (France), more specifically the NVIDIA GPU partitions of the Centre de Calcul Recherche et Technologie.
YAML arguments
The following arguments are used to configure the plugin. pyCIF will raise an exception at initialization if mandatory arguments are not specified, or if any argument does not match the accepted values or type:
Optional arguments
- python : str, optional, default “python -m mpi4py -rc initialize=False”
the python command used to run sub-instances of pyCIF
- python_venv : str, optional
path to the python virtual environment to use
- python_module : str, optional
the python module to load, by default python3/3.10.6. This argument cannot be used together with ‘job_env’.
- job_env : “list of str”, optional
List of commands to execute at the start of the job script to set up the environment. Using this argument overrides the default module loading. This argument cannot be used together with ‘python_module’. The list of commands should not include any python virtual environment activation command; please use the ‘python_venv’ argument for that purpose.
- gpu : bool, optional, default False
change the command used to run parallel programs in order to allocate one GPU
- partition : “rome” or “skylake” or “a64fx” or “v100” or “v100l” or “v100l-os” or “hybrid” or “xlarge” or “v100xl”, optional
partition on which to submit the job, used as the -q option for the ccc_msub command
- project : str, optional
project on which to submit the job, used as the -A option for the ccc_msub command
- filesystem : “Any subset of [‘scratch’, ‘work’, ‘store’], separated by commas or ‘all’”, optional, default “all”
the file system(s) required by the job, used as the -m option for the ccc_msub command
- qos : “long” or “normal” or “test”, optional, default “normal”
Quality of Service (QoS) used to submit the job, used as the -Q option for the ccc_msub command
- walltime : int, optional, default 7200
maximum walltime of the submitted job, used as the -T option for the ccc_msub command
- nodes : int, optional, default 1
number of nodes used by the jobs launched by the CIF, used as the -n option for the ccc_msub command
- cores : int, optional, default 1
number of cores used by the jobs launched by the CIF, used as the -c option for the ccc_msub command
- submit_msub : bool, optional, default True
Submit the job with ccc_msub. If false, simply run it within the same instance.
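To make the mapping between YAML arguments and ccc_msub options concrete, here is a minimal sketch that assembles a command line from a configuration dictionary. It is not pyCIF's actual implementation: the helper name build_msub_command, the script name job.sh and the project name myproject are hypothetical, and the sketch assumes each value is passed verbatim to the flag listed above. The ‘gpu’ and ‘submit_msub’ arguments are not handled here.

```python
# Hypothetical helper, not part of pyCIF: maps the plugin's YAML arguments
# to ccc_msub options using the flag letters listed in the argument list.

# Defaults documented above; partition and project have no default.
DEFAULTS = {
    "filesystem": "all",
    "qos": "normal",
    "walltime": 7200,
    "nodes": 1,
    "cores": 1,
}

# (YAML argument, ccc_msub flag) pairs, in the order of the argument list.
FLAGS = [
    ("partition", "-q"),
    ("project", "-A"),
    ("filesystem", "-m"),
    ("qos", "-Q"),
    ("walltime", "-T"),
    ("nodes", "-n"),
    ("cores", "-c"),
]


def build_msub_command(config, script="job.sh"):
    """Return the ccc_msub command as a list of strings.

    Assumption: values are forwarded unchanged to the matching flag.
    """
    settings = {**DEFAULTS, **config}
    command = ["ccc_msub"]
    for key, flag in FLAGS:
        value = settings.get(key)
        if value is None:
            # Argument not set and no default: leave the flag out.
            continue
        command += [flag, str(value)]
    command.append(script)
    return command


# "myproject" is a placeholder project name.
print(" ".join(build_msub_command({"partition": "v100", "project": "myproject"})))
# ccc_msub -q v100 -A myproject -m all -Q normal -T 7200 -n 1 -c 1 job.sh
```

Keeping the flag table as data rather than as a chain of conditionals makes it easy to compare against the argument list when an option changes.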
Requirements
The current plugin requires the following plugins to run properly:
| Requirement name | Requirement type | Explicit definition | Any valid | Default name | Default version |
|---|---|---|---|---|---|
| model | | True | True | None | None |
YAML template
Please find below a template for a YAML configuration:
platform:
  plugin:
    name: TGCC-CCRT
    version: nvidia
    type: platform

  # Optional arguments
  python: XXXXX  # str
  python_venv: XXXXX  # str
  python_module: XXXXX  # str
  job_env: XXXXX  # list of str
  gpu: XXXXX  # bool
  partition: XXXXX  # rome|skylake|a64fx|v100|v100l|v100l-os|hybrid|xlarge|v100xl
  project: XXXXX  # str
  filesystem: XXXXX  # Any subset of ['scratch', 'work', 'store'], separated by commas or 'all'
  qos: XXXXX  # long|normal|test
  walltime: XXXXX  # int
  nodes: XXXXX  # int
  cores: XXXXX  # int
  submit_msub: XXXXX  # bool
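Before handing a filled-in template to pyCIF, the constraints stated in the argument list can be checked with a few lines of Python. The sketch below is a hedged illustration, not pyCIF's own validation: the function name check_platform_config is hypothetical, the configuration is assumed to be the content of the platform block already loaded into a dictionary, and only the rules spelled out on this page are checked (mutually exclusive ‘python_module’ and ‘job_env’, accepted partitions, QoS and file systems, integer resources).

```python
# Hypothetical pre-flight check, not part of pyCIF.
# Accepted values are copied from the argument list above.
PARTITIONS = {
    "rome", "skylake", "a64fx", "v100", "v100l",
    "v100l-os", "hybrid", "xlarge", "v100xl",
}
QOS = {"long", "normal", "test"}
FILESYSTEMS = {"scratch", "work", "store"}


def check_platform_config(config):
    """Return a list of human-readable problems (empty if none found)."""
    errors = []

    # 'python_module' and 'job_env' are documented as mutually exclusive.
    if "python_module" in config and "job_env" in config:
        errors.append("'python_module' and 'job_env' cannot be used together")

    partition = config.get("partition")
    if partition is not None and partition not in PARTITIONS:
        errors.append(f"unknown partition: {partition!r}")

    qos = config.get("qos", "normal")
    if qos not in QOS:
        errors.append(f"unknown qos: {qos!r}")

    # 'filesystem' is either 'all' or a comma-separated subset.
    filesystem = config.get("filesystem", "all")
    if filesystem != "all":
        requested = [name.strip() for name in str(filesystem).split(",")]
        unknown = [name for name in requested if name not in FILESYSTEMS]
        if unknown:
            errors.append(f"unknown file system(s): {unknown}")

    # walltime, nodes and cores must be integers (bool is rejected explicitly
    # because it is a subclass of int in Python).
    for key in ("walltime", "nodes", "cores"):
        value = config.get(key)
        if value is not None and (isinstance(value, bool) or not isinstance(value, int)):
            errors.append(f"'{key}' must be an int, got {value!r}")

    return errors


# Example: both exclusive arguments set, plus an invalid file system.
problems = check_platform_config({
    "python_module": "python3/3.10.6",
    "job_env": ["module purge"],
    "filesystem": "scratch,home",
})
for problem in problems:
    print(problem)
```

Returning a list of messages instead of raising on the first problem lets a user fix every issue in the configuration in one pass.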