Send long-running or parallel jobs to a Slurm workload manager (i.e. cluster) using the slurm_call, slurm_apply, or slurm_map functions.

Job submission

This package includes three core functions used to send computations to a Slurm cluster: 1) slurm_call executes a function using a single set of parameters (passed as a list), 2) slurm_apply evaluates a function in parallel for each row of parameters in a given data frame, and 3) slurm_map evaluates a function in parallel for each element of a list. The functions slurm_apply and slurm_map automatically split the parameter rows or list elements into equal-size chunks, with each chunk processed by a separate cluster node. They use functions from the parallel package to parallelize computations across processors on a given node.
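The chunking idea can be sketched locally in base R. This is only an illustration of splitting parameter rows into near-equal chunks, not the package's actual implementation; the data frame, the n_nodes value, and the variable names are assumptions made for the example.

```r
# Hypothetical illustration: split parameter rows into one chunk per node.
params <- data.frame(x = 1:10, y = seq(0.1, 1, length.out = 10))
n_nodes <- 3  # assumed number of cluster nodes

# Assign each row to a chunk; chunk sizes differ by at most one row.
chunk_id <- sort(rep_len(seq_len(n_nodes), nrow(params)))
chunks <- split(params, chunk_id)

# Number of rows each node would process
chunk_sizes <- sapply(chunks, nrow)
print(chunk_sizes)
```

When the number of rows is not a multiple of the number of nodes, as here, the chunks can only be approximately equal in size.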

The output of slurm_apply, slurm_map, or slurm_call is a slurm_job object that serves as an input to the other functions in the package: print_job_status, cancel_slurm, get_slurm_out and cleanup_files.

Function specification

To be compatible with slurm_apply, a function may accept any number of single-value parameters. The names of these parameters must match the column names of the params data frame supplied. There are no restrictions on the types of parameters passed as a list to slurm_call or slurm_map.
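A quick local check of the naming requirement might look like the sketch below. The function f and the params data frame are made up for illustration, and the row-wise evaluation with mapply only mimics, on one machine, what slurm_apply would do on the cluster.

```r
# Parameter names (par_m, par_sd) must match the data frame's column names.
f <- function(par_m, par_sd) par_m + par_sd
params <- data.frame(par_m = c(0, 1, 2), par_sd = c(1, 2, 3))

# Verify the match locally before submitting anything.
stopifnot(setequal(names(formals(f)), names(params)))

# Evaluate the function once per row as a local sanity check.
local_res <- do.call(mapply, c(list(FUN = f), params))
print(local_res)
```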

If the function passed to slurm_call or slurm_apply requires knowledge of any R objects (data, custom helper functions) besides params, a character vector corresponding to their names should be passed to the optional global_objects argument.
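As a sketch, suppose the submitted function relies on a helper and a constant defined in the calling session. The names rescale and scale_factor are hypothetical. The submission itself is wrapped in if (FALSE) because it needs rslurm and access to a Slurm cluster.

```r
# Hypothetical helper and data that the submitted function depends on.
scale_factor <- 10
rescale <- function(x) x * scale_factor

f <- function(x) rescale(x) + 1
params <- data.frame(x = 1:5)

if (FALSE) {
  # Pass the names of the required objects as a character vector.
  sjob <- slurm_apply(f, params,
                      global_objects = c("rescale", "scale_factor"))
}
```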

When parallelizing a function, any error interrupts all calculations for the current node, so it may be useful to wrap expressions that may generate errors in a try or tryCatch call. This ensures that the computation continues with the next parameter set after reporting the error.
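A minimal sketch of this pattern, using a made-up function risky that fails on negative input, is shown below. Returning NA_real_ on error is one possible choice for marking failed parameter sets, not something the package prescribes.

```r
# Hypothetical function that fails for some inputs.
risky <- function(x) {
  if (x < 0) stop("negative input: ", x)
  sqrt(x)
}

# Wrapper that reports the error and returns NA instead of stopping.
safe <- function(x) {
  tryCatch(risky(x),
           error = function(e) {
             message("Error for x = ", x, ": ", conditionMessage(e))
             NA_real_
           })
}

# The failing second input no longer prevents the third from being evaluated.
out <- sapply(c(4, -1, 9), safe)
print(out)
```

A wrapper like safe, rather than risky itself, would then be the function passed to slurm_apply or slurm_map.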

Output format

The default output format for get_slurm_out (outtype = "raw") is a list where each element is the return value of one function call. If the function passed to slurm_apply produces a vector output, you may use outtype = "table" to collect the output in a single data frame, with one row per function call.
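The relationship between the two formats can be mimicked locally. In the sketch below, raw_out is a hand-written stand-in with placeholder values, not real job output, and the row-binding step is an assumption about how a table could be built from such a list.

```r
# Stand-in for a "raw" result: one named vector per function call.
# The numbers are placeholders, not actual job output.
raw_out <- list(c(s_m = 0.1, s_sd = 1.0),
                c(s_m = 0.2, s_sd = 2.0),
                c(s_m = 0.3, s_sd = 3.0))

# Row-binding the vectors gives one row per call, as in a "table" result.
table_out <- as.data.frame(do.call(rbind, raw_out))
print(table_out)
```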

Slurm configuration

Advanced options for the Slurm workload manager may accompany job submission by slurm_call, slurm_map, and slurm_apply through the optional slurm_options argument. For example, passing list(time = '1:30:00') for this option limits the job to 1 hour and 30 minutes. Some advanced configuration must be set through environment variables. On a multi-cluster head node, for example, the SLURM_CLUSTERS environment variable must be set to direct jobs to a non-default cluster.
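Both mechanisms might be combined as in the sketch below. The cluster name "cluster2" and the submitted function are placeholders. The submission is wrapped in if (FALSE) because it needs rslurm and access to a Slurm cluster.

```r
# Direct jobs to a non-default cluster; "cluster2" is a placeholder name.
Sys.setenv(SLURM_CLUSTERS = "cluster2")

# Limit the job to 1 hour and 30 minutes.
opts <- list(time = "1:30:00")

if (FALSE) {
  # Hypothetical job: sum the integers from 1 to n.
  sjob <- slurm_call(function(n) sum(seq_len(n)),
                     params = list(n = 100),
                     slurm_options = opts)
}
```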

Examples


if (FALSE) {
# Create a data frame of mean/sd values for normal distributions 
pars <- data.frame(par_m = seq(-10, 10, length.out = 1000), 
                   par_sd = seq(0.1, 10, length.out = 1000))
                   
# Create a function to parallelize
ftest <- function(par_m, par_sd) {
 samp <- rnorm(10^7, par_m, par_sd)
 c(s_m = mean(samp), s_sd = sd(samp))
}

sjob1 <- slurm_apply(ftest, pars)
print_job_status(sjob1)
res <- get_slurm_out(sjob1, "table")
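# Sample estimates should match the inputs up to sampling error
all.equal(pars$par_m, res$s_m, tolerance = 0.01)
all.equal(pars$par_sd, res$s_sd, tolerance = 0.01)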
cleanup_files(sjob1)
}