Running ADAM on Slurm
Groups with access to an HPC cluster managed by Slurm, with a number of compute nodes with local and/or network-attached storage, can spin up a temporary Spark cluster for use by ADAM.
The full I/O bandwidth benefits of Spark processing are likely best realized through a set of co-located compute/storage nodes. However, depending on your network setup, you may find Spark deployed on HPC to be a workable solution for testing, or even for production at scale, especially for applications that perform multiple in-memory transformations and thus benefit from Spark’s in-memory processing model.
Follow the primary instructions for installing ADAM into $ADAM_HOME. This will most likely be a location on a shared disk accessible to all nodes, but could be a consistent location on each machine.
Start Spark cluster
A Spark cluster can be started as a multi-node job in Slurm by creating a job file run.cmd such as the following:
#!/bin/bash

#SBATCH --partition=multinode
#SBATCH --job-name=spark-multi-node
#SBATCH --exclusive

#Number of separate nodes reserved for Spark cluster
#SBATCH --nodes=2
#SBATCH --cpus-per-task=12

#Number of execution slots
#SBATCH --ntasks=2

#SBATCH --time=05:00:00
#SBATCH --mem=248g

# If your sys admin has installed spark as a module
module load spark

# If Spark is not installed as a module, you will need to specify the absolute path to
# $SPARK_HOME/bin/spark-start where $SPARK_HOME is on shared disk or at a consistent location
start-spark

echo $MASTER

sleep infinity
Submit the job file to Slurm:
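For example, assuming the job file is named run.cmd as above, submission looks like this (sbatch is the standard Slurm submission command; your site may require additional flags):

```shell
# Submit the Spark cluster job; sbatch prints the assigned job ID.
sbatch run.cmd

# Optionally, confirm the job is running and see which nodes it landed on.
squeue -u $USER
```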
This will start a Spark cluster containing two nodes that persists for five hours, unless you kill it sooner. The file slurm.out created in the current directory will contain a line, produced by the echo $MASTER command above, indicating the address of the Spark master to which your application or adam-shell should connect, such as spark://hostnamefromslurmdotout:7077.
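One quick way to pull the master address out of slurm.out is a small helper like the following; this is only a sketch and assumes the master URL has the usual spark://host:port form:

```shell
# spark_master FILE: print the first spark:// URL found in FILE
# (the line written by `echo $MASTER` in run.cmd above).
spark_master() {
  grep -o 'spark://[^[:space:]]*' "$1" | head -n 1
}
```

For example, spark_master slurm.out would print something like spark://hostnamefromslurmdotout:7077.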
Your sys admin will probably prefer that you launch your adam-shell or start an application from a cluster node rather than from the head node you log in to. You may want to do so in an interactive session on a compute node.
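One common way to get such a session is an interactive srun allocation; the partition name below is site-specific and shown only as an example:

```shell
# Request an interactive shell on a compute node; --pty attaches a
# pseudo-terminal so the shell behaves normally.
srun --partition=multinode --pty bash
```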
Start an adam-shell:
$ADAM_HOME/bin/adam-shell --master spark://hostnamefromslurmdotout:7077
Or run a batch job with adam-submit:
$ADAM_HOME/bin/adam-submit --master spark://hostnamefromslurmdotout:7077
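A fuller submission might look like the sketch below. The transformAlignments action and the file names are illustrative only (any ADAM action works here), and the executor settings are ordinary spark-submit options, separated from the ADAM arguments by --; check adam-submit --help for the exact argument conventions of your ADAM version:

```shell
# Convert a BAM file to ADAM's Parquet format on the Slurm-hosted cluster.
# Spark options go before the --, ADAM action and arguments after it.
$ADAM_HOME/bin/adam-submit \
  --master spark://hostnamefromslurmdotout:7077 \
  --num-executors 2 \
  -- \
  transformAlignments sample.bam sample.alignments.adam
```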
You should be able to connect to the Spark Web UI at http://hostnamefromslurmdotout:4040; however, you may need to ask your system administrator to open the required ports.
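If the UI port is reachable only from inside the cluster, an SSH tunnel through the head node is one common workaround; the hostnames and username below are placeholders for your own:

```shell
# Forward local port 4040 to the Spark master's Web UI via the head node,
# then browse to http://localhost:4040 on your workstation.
ssh -L 4040:hostnamefromslurmdotout:4040 you@cluster-head-node
```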