About GPU Resources in O2
The first two GPU nodes are now available on O2, providing a total of 4 Tesla M40 and 8 Tesla K80 GPU cards. To list information about all the nodes with GPU resources you can use the command:
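A minimal sketch using `sinfo`, assuming the GPU partition is named `gpu` (as in the submission examples below):

```shell
# List each node in the gpu partition along with its generic
# resources (GRES), which include the GPU type and count:
sinfo --Node --partition=gpu --Format=nodehost,gres
```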
GPU Partition Limits
The amount of GPU resources that can be used at any time on the O2 cluster is measured in GPU hours per user; currently there is an active limit of 72 GPU hours for each user.
This means that at any given time each user can allocate* at most 1 GPU card for 72 hours, 12 GPU cards for 6 hours, or any intermediate combination, for example 6 GPU cards for 12 hours.
The current limit will be increased as we migrate additional GPU nodes from the older cluster to O2.
* as resources allow
How to compile CUDA programs
In most cases a CUDA library and compiler module must be loaded in order to compile CUDA programs. To see which CUDA modules are available use the command module spider cuda, then use the command module load to load the desired version. Currently only the latest version of the CUDA toolkit (version 9) is available.
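A sketch of the workflow, assuming a compiler module must be loaded alongside CUDA (the exact module names and versions shown are examples; check the output of module spider on the cluster):

```shell
# Discover which CUDA modules are available:
module spider cuda

# Load a compiler and the CUDA toolkit (example module names):
module load gcc/6.2.0 cuda/9.0

# Compile a CUDA source file with nvcc:
nvcc -o my_program my_program.cu
```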
How to submit a GPU job
To submit a GPU job on O2 you will need to use the partition gpu and add the flag --gres=gpu:1 to request a GPU resource. The example below shows how to start an interactive bash job requesting 1 CPU core and 1 GPU card:
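A minimal interactive example using srun (the wall time of 1 hour is an arbitrary example value):

```shell
# Interactive bash session: 1 task on the gpu partition with 1 GPU card
srun -n 1 --pty -t 1:00:00 -p gpu --gres=gpu:1 bash
```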
While this other example shows how to submit a batch job requesting 2 GPU cards and 4 CPU cores:
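A sketch of such a batch script, submitted with sbatch; the wall time, output filename, and program name are placeholder examples:

```shell
#!/bin/bash
#SBATCH -p gpu                 # GPU partition
#SBATCH --gres=gpu:2           # request 2 GPU cards
#SBATCH -c 4                   # request 4 CPU cores
#SBATCH -t 2:00:00             # wall time (example value)
#SBATCH -o gpu_job_%j.out      # stdout file, %j expands to the job ID

./my_gpu_program               # placeholder for your GPU executable
```

Save the script (for example as gpu_job.sh) and submit it with sbatch gpu_job.sh.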
It is also possible to request a specific type of GPU card by using the --gres flag. For example --gres=gpu:teslaM40:3 can be used to request 3 Tesla M40 GPU cards. Currently two GPU types are available: teslaM40 and teslaK80.
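Putting that flag into a full command, an interactive request for the Tesla M40 type could look like this (the wall time is again an example value):

```shell
# Interactive session requesting 3 Tesla M40 cards specifically:
srun -n 1 --pty -t 1:00:00 -p gpu --gres=gpu:teslaM40:3 bash
```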