How to execute R scripts on a cluster?

Generality

To run R scripts on a cluster, you first need an account; request one from the cluster administrator. Once your account is created, you can connect to the cluster with the ssh command on Linux, macOS, or Windows (Windows users will need a Unix-like environment such as WSL). The administrator usually provides documentation describing the cluster’s architecture after your account is created. To transfer files and folders between your local machine and the cluster, use the scp command.

On the nuwa cluster, R is already installed, with some packages available out of the box. Additional packages can be installed and managed using conda.

$ source /home/.../.../.../.../conda.sh  # to access conda
$ conda create -n r_myenv                # create an environment (r_myenv)
$ conda activate r_myenv                 # activate the environment

More documentation about environment management is available in the cluster tutorial. After activating your environment, you can run R in interactive mode and install all the packages needed for your job into a personal library with the install.packages() function.

(r_myenv) $ R              # to start an interactive R session

Note that your personal library is usually located at the root of your home directory, which has very limited allocated space. To avoid filling it up with installed packages, do:

$ mkdir ~/work/R          # to create a directory named R on a larger volume
$ ln -s ~/work/R ~/R      # to make a symbolic link to this directory
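The trick can be rehearsed safely before touching your real home directory; a minimal sketch using a throwaway temporary directory (all paths below are stand-ins for your real home and work volumes):

```shell
#!/bin/bash
# Sketch: relocate the R library via a symlink, rehearsed in a sandbox.
set -e
sandbox=$(mktemp -d)                  # stands in for your home directory

mkdir -p "$sandbox/work/R"            # the real library location, on the large volume
ln -s "$sandbox/work/R" "$sandbox/R"  # the symlink R will see at ~/R

# Anything written through the symlink actually lands in work/R:
touch "$sandbox/R/testpkg"
ls "$sandbox/work/R"                  # -> testpkg
```

Once this behaves as expected, the same two commands on the cluster redirect all future install.packages() output to ~/work/R.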

You can run R in interactive mode with the command R, but avoid running heavy computations on the front-end (login) servers; always use sbatch or srun.

Run an R script in batch mode

To run an R script in batch mode, you need to write your R script first and then a bash script. The bash script tells the shell what it should do. It is just a plain text file with the .sh extension containing a series of commands. Any command that normally runs on the command line (ls, mkdir, …) can be placed in a bash script.
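As an illustration, a minimal bash script could look like this (hello.sh and the commands inside are just an example, not part of the workflow above):

```shell
#!/bin/bash
# hello.sh -- a minimal bash script: plain text, one command per line
echo "Hello from $(hostname)"   # print a message with the machine name
mkdir -p results                # any ordinary shell command can go here
ls results
```

Run it with `bash hello.sh`, or make it executable first with `chmod +x hello.sh` and run `./hello.sh`.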

1. Write an R script (e.g. write_mto.R)

# write_mto.R--------------------
# write meteo files for 2D STEP
# yelognisse agbohessou
# 2020-03-20
# Write .mto file in STEP format for each pixel (site) using rstep function: gen_step_mto

rm(list = ls())

# import pkgs
library(rstep)
library(tibble)
library(lubridate)

# import climate dataset
meteo_2019_2021 <- read.csv("~/meteo_sahel_2019_2021.csv")
sum_data <- meteo_2019_2021[-1]

# site altitude (m) and latitude (deg): placeholder values, set them for your own sites
alt <- 300
lat <- 15

# write an mto file for each grid cell (site)
for (isite in unique(sum_data$id)) {
  df <- sum_data[sum_data$id %in% isite, ]
  meteo.y <- df[, c("date", "year", "rain", "RayGlo_MJm", "temp_min", "temp_max",
                    "pvap_2_m_hPa", "Wind_Speed")]
  for (i in 2019:2021) {
    df <- meteo.y[meteo.y$year %in% i, ]
    df <- add_column(df, day = day(df$date), .after = 1)
    rstep::gen_step_mto(workspace = "~/Input_mto/", dataframe = df, isite = isite,
                        year = substr(unique(df$year), 3, 4),
                        alt = alt, lat = lat, hautmes = 2)
  }
}
-------------------------------

2. Write a bash script (e.g. write_mto_bash.sh)

#!/bin/sh
## write_mto_bash.sh---------------
#SBATCH --job-name=write_mto       # job name in the queue
#SBATCH --output="mto_out_%j"      # filename of the standard output
#SBATCH --error="mto_error_%j"     # filename of the error output
#SBATCH --partition=short-28core   # the partition(s) to run in (comma separated)
#SBATCH --nodes=2                  # number of nodes used
#SBATCH --ntasks-per-node=28       # number of CPUs per node
#SBATCH --mem-per-cpu=50M          # memory per CPU core
#SBATCH --time=05:00               # MM:SS time limit (the default is 04:00:00)
#SBATCH --mail-type=BEGIN,END,FAIL # send an email when the job begins, ends, or fails
#SBATCH --mail-user=user_mail

# access to conda
source /../../../../conda.sh

# activate my environment
conda activate r_myenv

echo "BEGIN"
hostname
echo  "#################################"

# the command lines to run on the cluster
Rscript write_mto.R

echo "waiting"
wait
echo "END"
----------------------------------

This job will use 2 nodes with 28 CPUs per node, for 5 minutes, in the short-28core queue, to run the write_mto.R script.

3. Submit the job to the cluster

sbatch write_mto_bash.sh                      # to run the job on the default partition
sbatch -p mpib40_SIRO -N7 write_mto_bash.sh   # to run the job using 7 nodes of the partition mpib40_SIRO

Job management

After launching the job, you can monitor it with these commands:

$ squeue -u username     # to check if the job is running
$ squeue -j jobID        # to check the status of a specific job
$ scancel jobID          # to cancel a running job

If a job hasn’t started yet, you can put it on hold with scontrol hold and release it with scontrol release (the Slurm equivalents of PBS’s qhold and qrls).

$ scontrol hold jobID     # to put a pending job on hold
$ scontrol release jobID  # to release a held job

If the job is already running, use scontrol to suspend and resume it (you may need extra permissions for that; ask your administrator if that’s the case).

$ scontrol suspend jobID  # to suspend the job
$ scontrol resume jobID   # to resume the job

Some basic and useful Linux commands

$ mkdir folder                                # to create a directory
$ touch filename                              # to create a file
$ nano filename                               # to edit a file (Ctrl+X, then Y, to save and exit)
$ cat filename                                # to concatenate and display files. 
$ cp -r ~/folder1/. ~/new_folder1             # to copy a directory with its content from a path to another
$ scp ./main.R user@nuwa:/home/user/          # to copy a file (main.R) to the cluster 
$ scp user@nuwa:/home/user/main.R .           # to copy a file (main.R) back to local 
$ scp -r /c/folder user@nuwa:/home/user/      # to copy a directory (folder) to the cluster
$ scp -r user@nuwa:/home/user/folder .        # to copy directory (folder) and its content back to local
$ rm -rf <directory1> <directory2>            # to remove entire directories 
$ rm -r -- */                                 # to remove all folders in a directory
$ echo */ | wc -w                             # to count the number of folders in a directory
$ ls | wc -l                                  # to count the number of files in a directory
$ ls -Art | tail -n 4                         # to list the 4 most recently modified files in a directory
$ ssh                                         # to connect to a remote host.
$ kill                                        # to terminate a process by PID.
$ ps                                          # to display information about the running processes.
$ top                                         # to display system information and the process list.
$ sudo                                        # to execute a command as the superuser
$ gzip / gunzip                               # to compress and extract gzipped files.
$ chmod                                       # to change file or directory permissions.
$ echo                                        # to display message on the screen.
$ tar                                         # to create or extract tar archives.
$ rm                                          # to remove files and directories.
$ mv                                          # to move or rename files and directories.
$ touch                                       # to create a new file.
$ rmdir                                       # to remove an empty directory.
$ pwd                                         # to print the current working directory.
$ tail -3 file                                #  to get the last 3 lines of the file
$ head -3 file                                #  to get the first 3 lines of the file


# Transferring a large number of small files:
# stream a tar archive through ssh instead of copying the files one by one.
$ tar cz ./source_dir | ssh user@nuwa 'tar xvz -C destination/path'    # from local to nuwa
$ ssh user@nuwa 'tar -C /source/path/ -cz source_dir' | tar -xz        # from nuwa back to local
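Before involving ssh, the same tar pipeline can be rehearsed entirely on the local machine by replacing the ssh hop with a plain pipe (all directory names below are placeholders created in temporary directories):

```shell
#!/bin/bash
# Round-trip a directory through tar, exactly as the ssh transfer would,
# but with both ends on the local machine.
set -e
src=$(mktemp -d); dst=$(mktemp -d)
mkdir "$src/source_dir"
echo "data" > "$src/source_dir/file.txt"

# pack on one side, unpack on the other (the ssh hop replaced by a pipe)
tar -C "$src" -cz source_dir | tar -C "$dst" -xz

cat "$dst/source_dir/file.txt"   # -> data
```

If the round trip preserves your files locally, the ssh versions above will behave the same way, with the network in the middle.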

# Transferring large files using the -C option of scp to first compress the file, send it, and then decompress it
$ scp -C ./file_name.nc user@nuwa:destination/path/
Yélognissè Frédi Agbohessou
Post-doctoral researcher