A Guide to Cluster Computing for Research

11 June 2024

Cluster computing involves using multiple connected computers (nodes) to work together as a single system. This is particularly useful for computationally intensive tasks common in data science and economics research, especially structural estimation.

Most clusters are accessed remotely via Secure Shell (SSH). The general command is:

ssh username@cluster_address
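If you connect often, an entry in your SSH config file saves retyping the address. A minimal sketch, assuming a hypothetical alias mycluster (the hostname, username, and key path are placeholders you would replace with your own):

```
# ~/.ssh/config
Host mycluster
    HostName cluster_address
    User username
    IdentityFile ~/.ssh/id_ed25519
```

With this in place, `ssh mycluster` is equivalent to the full command above.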

Basic Commands:

ls # List files and directories
cd /path/to/directory # Change directory
pwd # Print working directory
mkdir new_directory # Create a new directory
rm file # Remove a file
rm -r directory # Remove a directory and its contents
cp source destination # Copy files
mv old_name new_name # Move or rename files
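Strung together, a first session on the cluster might look like this (the directory and file names are purely illustrative):

```shell
mkdir -p my_project            # Create a project directory
cd my_project                  # Move into it
pwd                            # Confirm where we are
echo "hello" > notes.txt       # Create a small file
cp notes.txt notes_backup.txt  # Copy it
ls                             # List both files
rm notes_backup.txt            # Remove the copy
```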

To copy files between your local machine and the cluster:

# Upload a file
scp local_file username@cluster_address:/remote/path/ 

# Download a file
scp username@cluster_address:/remote/path/file local_destination/

# Upload a directory
scp -r local_directory username@cluster_address:/remote/path/

# Download a directory
scp -r username@cluster_address:/remote/path/directory local_destination/

Many clusters use environment modules to manage software. Common commands:

module avail # List available software
module load software_name # Load a specific software
module list # Show currently loaded modules
module unload software_name # Unload a module

Example:

module load R/4.1.0

Most clusters use a job scheduler such as SLURM or PBS. Here's a general structure for a SLURM job script:

#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --output=output_%j.log
#SBATCH --error=error_%j.log
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=4G

module load R/4.1.0

Rscript my_script.R

Some useful commands:

sbatch job_script.sh # Submit a job
squeue # Check job status
scancel job_id # Cancel a job
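On success, sbatch prints a line like "Submitted batch job 123456", and in scripts it is handy to capture that ID for later squeue or scancel calls. A minimal sketch (the output line is simulated here, since running sbatch requires a live scheduler):

```shell
# Simulated output; on a real cluster you would run: sbatch_output=$(sbatch job_script.sh)
sbatch_output="Submitted batch job 123456"

# Strip everything up to the last space to isolate the job ID
job_id="${sbatch_output##* }"
echo "$job_id"   # prints 123456

# The ID can then be passed back to the scheduler, e.g.:
# squeue -j "$job_id"
# scancel "$job_id"
```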

When using R on a cluster, you can leverage parallel computing:

library(parallel)

# Detect number of cores
num_cores <- detectCores()

# Create a cluster object
cl <- makeCluster(num_cores)

# Run your parallel code
results <- parLapply(cl, 1:100, function(x) {
  x^2  # Replace with your computation
})

# Stop the cluster
stopCluster(cl)
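One caveat: detectCores() reports every core on the node, which can exceed what the scheduler actually granted you. Under SLURM, when the job script requests cores via --cpus-per-task, the allocation is exposed in the SLURM_CPUS_PER_TASK environment variable, so a safer pattern is to read that and fall back to 1 when it is unset, sketched here in shell:

```shell
# Use the core count SLURM granted; default to 1 outside a job
num_cores="${SLURM_CPUS_PER_TASK:-1}"
echo "Using $num_cores core(s)"

# Inside the R script, the same value could be read with:
# num_cores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", "1"))
```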

This is a general guide to get you started; be sure to check your institution's specific documentation for cluster details. Happy clustering!

Copyright © Ornella Darova 2024