...


Code Block
#!/bin/bash
#-----------------------------------------------------
#SBATCH -p short
#SBATCH -t 1:00:00
#SBATCH --mem=8000
#SBATCH -n 1

module load matlab/2017a
matlab -nodesktop -r "myfunction(my_inputs)"
#-----------------------------------------------------


Another possibility is to use the --wrap flag to pass the MATLAB command directly on the sbatch line.

The equivalent of the above example is

Code Block
rp189@login01:~$ module load matlab/2017a
rp189@login01:~$ sbatch -p short -n 1 -t 1:00:00 --mem=8000 --wrap="matlab -nodesktop -r \"myfunction(my_inputs)\""


where the escape character \ must be used before the internal set of quotation marks.
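The effect of the escaping can be checked locally before submitting: building the command string in the shell shows that the escaped inner quotes survive intact. This is a plain shell illustration; myfunction and my_inputs are the placeholders from the example above.

```shell
# Build the string passed to --wrap; the inner quotes are escaped with \
wrap_cmd="matlab -nodesktop -r \"myfunction(my_inputs)\""

# Printing it shows the exact command that sbatch --wrap would execute
echo "$wrap_cmd"
```

The output is `matlab -nodesktop -r "myfunction(my_inputs)"`, i.e. the inner quotes are preserved so that MATLAB receives the function call as a single -r argument.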


How to propagate MATLAB errors to the SLURM scheduler 

By default, a SLURM job running a MATLAB script will be recorded as "COMPLETED" (or "TIMEOUT") even when the executed MATLAB script fails. This happens because the scheduler executes and tracks the behavior of the command matlab -r "your_code" rather than the outcome of the actual function your_code.

To ensure that the outcome of a MATLAB job is captured by the scheduler, you can use a MATLAB try ... catch ... exit(1) ... end construct, as shown in the example below:

Code Block
% MATLAB wrapper to catch errors and propagate a non-zero exit status
try
    your_code
catch my_error
    disp(my_error)    % print the error before exiting
    exit(1)           % non-zero exit status -> SLURM records the job as FAILED
end
exit

This script runs the function your_code; if no error is detected, the script exits normally and SLURM reports a successfully completed job. If instead your_code fails, the script catches and prints the error message, then terminates MATLAB with a non-zero exit status, which the scheduler records as a failed job.
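The effect of exit(1) on the scheduler can be illustrated without MATLAB: SLURM simply records the exit status of the command it ran. In the sketch below, `sh -c 'exit 1'` stands in for a MATLAB process that called exit(1), and `sh -c 'exit 0'` for one that reached the final exit; these are generic shell commands, not O2-specific ones.

```shell
# Stand-in for a MATLAB process that called exit(1)
status=0
sh -c 'exit 1' || status=$?
echo "failing job exit status: $status"   # non-zero: SLURM would record FAILED

# Stand-in for a MATLAB process that reached the final 'exit'
status=0
sh -c 'exit 0' || status=$?
echo "clean job exit status: $status"     # zero: SLURM would record COMPLETED
```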

Running Parallel MATLAB jobs on the O2 Cluster

It is possible to run parallel MATLAB jobs on the O2 cluster using either the local cluster profile or the custom O2 cluster profile. Note that in order to run in parallel, a MATLAB script must contain parallel commands (for example parfor or parpool).
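As a minimal illustration of what such a parallel command looks like, the sketch below opens a pool of workers and distributes a loop with parfor. The pool size of 4 and the loop body are placeholder choices for the example, not O2 requirements.

```matlab
% Minimal sketch of a parallel MATLAB script (pool size of 4 is a placeholder)
parpool('local', 4);         % start 4 workers with the local cluster profile

results = zeros(1, 100);
parfor i = 1:100             % iterations are distributed across the workers
    results(i) = i^2;
end

delete(gcp('nocreate'));     % shut down the pool
exit
```

When submitted as a SLURM job, the number of requested cores (sbatch -n) should match the pool size so that each worker gets a dedicated core.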

MATLAB Parallel jobs using the default local cluster profile
This method is ideal for parallel jobs that request ~15 cores or fewer.

...

This approach can be used on any of the O2 partitions, with the exception of the mpi partition.

Note 1: Several complex operations in MATLAB are already parallelized (intrinsic parallelization of its libraries). If your script is serial but makes intensive use of these parallelized libraries, you might still want to request at least 2 or 3 cores using this approach in order to retain the associated speedup.

Note 2: Using the local cluster profile to submit multiple parallel jobs containing parpool-based commands is not recommended. MATLAB creates additional files for this type of parallel job using its own job indexing; if two or more such jobs are dispatched at the same time, they might try to read/write the same hidden files, creating a conflict. If you want to run batches of parallel jobs that require a pool of workers, you should use the c.batch approach described in the O2 cluster profile section.

MATLAB Parallel jobs using the custom O2 cluster profile

This method is ideal for parallel jobs that request ~15-20 cores or more.

It is possible to configure MATLAB so that it interacts with the SLURM scheduler. This allows MATLAB to submit parallel jobs to SLURM directly and to leverage CPU and memory resources across different nodes (distributed memory). You can find detailed information on how to set up and use the O2 MATLAB cluster profile here.
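Once the cluster profile is configured, a pool-based job can be submitted from within MATLAB using the c.batch approach mentioned in Note 2 above. The sketch below uses the standard Parallel Computing Toolbox parcluster/batch API; the profile name 'O2cluster' and the function my_parallel_code are placeholders, not the actual O2 names.

```matlab
% Sketch: submit a pool-based job through the SLURM-aware cluster profile
c = parcluster('O2cluster');                    % placeholder profile name

% Run my_parallel_code with a pool of 19 workers; one extra worker runs the
% client process, so 20 cores are used in total
j = c.batch(@my_parallel_code, 0, {}, 'Pool', 19);

wait(j);                                        % optionally block until the job finishes
```

Because each c.batch call is tracked by the scheduler as its own job, this approach avoids the hidden-file conflicts that can occur when multiple parpool-based jobs share the local profile.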

...