software:specfem [2018/03/05 16:05] wphase
====== SPECFEM3D_GLOBE ======
==== Running SPECFEM3D_GLOBE on the Strasbourg HPC cluster with GNU 4.8 and CUDA 7.5 ====

=== Set up the environment ===
<code>
module purge
module load batch/slurm
module load compilers/cuda-7.5
export CUDA_INC=/usr/local/cuda/cuda-7.5/include
export CUDA_LIB=/usr/local/cuda/cuda-7.5/lib64
export PATH=/rpriv/ipgs/zac/openmpi-1.10.7/bin:$PATH
export LD_LIBRARY_PATH=/rpriv/ipgs/zac/openmpi-1.10.7/lib:$LD_LIBRARY_PATH
</code>
Note that we use the default GNU compiler of the operating system:
<code>
$ gfortran --version
GNU Fortran (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
</code>
=== Compilation ===
Before compilation, make sure that the required modules are loaded and that the CUDA_LIB and CUDA_INC environment variables are set (see the previous section). Create a run directory containing the subdirectories ''DATABASES_MPI'', ''OUTPUT_FILES'', ''bin'' and ''DATA''.
In the directory ''DATA'', create the ''CMTSOLUTION'', ''Par_file'' and ''STATIONS'' files (cf. the SPECFEM3D_GLOBE documentation).
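As a concrete starting point, the run-directory layout described above can be created in one step (''run0001'' is a hypothetical run-directory name):

```shell
#!/bin/bash
# Create a run directory with the subdirectories SPECFEM3D_GLOBE expects
rundir=run0001                # hypothetical run-directory name
mkdir -p $rundir/DATABASES_MPI $rundir/OUTPUT_FILES $rundir/bin $rundir/DATA
# DATA/ must then receive the CMTSOLUTION, Par_file and STATIONS files
```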
<code>
ln -s DATA/topo_bathy $rundir/DATA/topo_bathy
</code>
+ | |||
+ | Below is a script doing all the configuration and compilation: | ||
+ | <code> | ||
+ | #!/bin/bash | ||
+ | |||
+ | # Load modules | ||
+ | module purge | ||
+ | module load batch/slurm | ||
+ | module load compilers/cuda-7.5 | ||
+ | module load mpi/openmpi-basic | ||
+ | export CUDA_LIB=/usr/local/cuda/cuda-7.5/lib64 | ||
+ | export CUDA_INC=/usr/local/cuda/cuda-7.5/include | ||
+ | |||
+ | # source directory | ||
+ | rootdir=/b/home/ipgs/cmorales/specfem3d_globe | ||
+ | |||
+ | # setting up run directory | ||
+ | currentdir=`pwd` | ||
+ | |||
+ | mkdir -p DATABASES_MPI | ||
+ | mkdir -p OUTPUT_FILES | ||
+ | |||
+ | rm -rf DATABASES_MPI/* | ||
+ | rm -rf OUTPUT_FILES/* | ||
+ | |||
+ | # configure and compile in the source directory | ||
+ | cd $rootdir | ||
+ | |||
+ | # configure | ||
+ | ./configure -with-cuda=cuda5 | ||
+ | |||
+ | # compiles for a forward simulation | ||
+ | cp $currentdir/DATA/Par_file DATA/Par_file | ||
+ | make clean | ||
+ | make all | ||
+ | |||
+ | # backup of constants setup | ||
+ | cp setup/* $currentdir/OUTPUT_FILES/ | ||
+ | cp DATA/Par_file $currentdir/OUTPUT_FILES/ | ||
+ | |||
+ | # Copy executables/Model in the current directory | ||
+ | cd $currentdir | ||
+ | |||
+ | # copy executables | ||
+ | mkdir -p bin | ||
+ | cp $rootdir/bin/xmeshfem3D ./bin/ | ||
+ | cp $rootdir/bin/xspecfem3D ./bin/ | ||
+ | |||
+ | # Links data necessary directories | ||
+ | # The example below is for s362ani... this part should be changed if another model is used | ||
+ | cd DATA/ | ||
+ | ln -s $rootdir/DATA/crust2.0 | ||
+ | ln -s $rootdir/DATA/s362ani | ||
+ | ln -s $rootdir/DATA/QRFSI12 | ||
+ | ln -s $rootdir/DATA/topo_bathy | ||
+ | cd ../ | ||
+ | </code> | ||
+ | |||
+ | |||
+ | === Run with CPU === | ||
+ | |||
+ | Example of slurm script (number of CPU cores should be adapted to NPROC_XI and NPROC_ETA: | ||
+ | <code> | ||
#!/bin/bash
#SBATCH -p grant -A g2016a68   # Partition / Account
#SBATCH -n 96                  # Number of CPU cores
#SBATCH --job-name=SPECFEM
#SBATCH -t 23:00:00            # Wall time

# Load modules
module purge
module load batch/slurm
module load mpi/openmpi-basic

echo Master on host `hostname`
echo Time is `date`

# Start time
begin=`date +"%s"`

# Run parameters from the Par_file
BASEMPIDIR=`grep LOCAL_PATH DATA/Par_file | cut -d = -f 2`
NPROC_XI=`grep NPROC_XI DATA/Par_file | cut -d = -f 2`
NPROC_ETA=`grep NPROC_ETA DATA/Par_file | cut -d = -f 2`
NCHUNKS=`grep NCHUNKS DATA/Par_file | cut -d = -f 2`
numcpus=$(( $NCHUNKS * $NPROC_XI * $NPROC_ETA ))

# numcpus should be consistent with the -n option above

mkdir -p OUTPUT_FILES
# back up the files used for this simulation
cp DATA/Par_file OUTPUT_FILES/
cp DATA/STATIONS OUTPUT_FILES/
cp DATA/CMTSOLUTION OUTPUT_FILES/

##
## mesh generation
##
sleep 2

echo
echo `date`
echo "starting MPI mesher on $numcpus processors"
echo

mpirun -np $numcpus bin/xmeshfem3D

echo " mesher done: `date`"
echo

# back up the important files addressing.txt and list*.txt
cp OUTPUT_FILES/*.txt $BASEMPIDIR/

##
## forward simulation
##

# set up addressing
#cp $BASEMPIDIR/addr*.txt OUTPUT_FILES/
#cp $BASEMPIDIR/list*.txt OUTPUT_FILES/

sleep 2

echo
echo `date`
echo starting run in current directory $PWD
echo

mpirun -np $numcpus bin/xspecfem3D
/bin/rm -rf DATABASES_MPI

echo "finished successfully"
echo `date`

# Print time after running
echo Time is `date`
echo Walltime : $(expr `date +"%s"` - $begin)          # Seconds
echo CPUtime : $(squeue -j $SLURM_JOBID -o "%M" -h)    # HH:MM:SS
</code>
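The ''grep''/''cut'' parsing used in the script above can be checked on its own; here is a minimal sketch with a hypothetical ''Par_file'' fragment (the parameter values are examples, not the ones of a real run):

```shell
#!/bin/bash
# Hypothetical Par_file fragment with the parameters read by the job script
cat > Par_file <<'EOF'
NCHUNKS                         = 6
NPROC_XI                        = 4
NPROC_ETA                       = 4
LOCAL_PATH                      = ./DATABASES_MPI
EOF

# Same parsing as in the Slurm script above
NPROC_XI=`grep NPROC_XI Par_file | cut -d = -f 2`
NPROC_ETA=`grep NPROC_ETA Par_file | cut -d = -f 2`
NCHUNKS=`grep NCHUNKS Par_file | cut -d = -f 2`
numcpus=$(( $NCHUNKS * $NPROC_XI * $NPROC_ETA ))
echo "numcpus=$numcpus"
```

The resulting value (here 6 x 4 x 4 = 96) must match the core count requested with ''#SBATCH -n''.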
+ | |||
+ | |||
+ | |||
+ | === Run with GPU === | ||
+ | |||
+ | Example of slurm script (number of nodes should be adapted to the number of GPU per nodes and to NPROC_XI and NPROC_ETA: | ||
+ | <code> | ||
#!/bin/bash
#SBATCH -p pri2015gpu -A eost
#SBATCH -N 3-3                 # Will use 3 nodes
#SBATCH --tasks-per-node 8     # 8 tasks per node
#SBATCH --gres=gpu:8           # We only want nodes with 8 GPUs
#SBATCH --job-name=SPECFEM
#SBATCH -t 12:00:00            # Wall time
#SBATCH --cpu_bind=verbose

# Load modules
module purge
module load batch/slurm
module load compilers/cuda-7.5
module load mpi/openmpi-basic

# ID of each GPU (should be adapted if using a different number of GPUs)
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

echo Master on host `hostname`
echo Time is `date`

# Start time
begin=`date +"%s"`

# Run parameters from the Par_file
BASEMPIDIR=`grep LOCAL_PATH DATA/Par_file | cut -d = -f 2`
NPROC_XI=`grep NPROC_XI DATA/Par_file | cut -d = -f 2`
NPROC_ETA=`grep NPROC_ETA DATA/Par_file | cut -d = -f 2`
NCHUNKS=`grep NCHUNKS DATA/Par_file | cut -d = -f 2`
numgpus=$(( $NCHUNKS * $NPROC_XI * $NPROC_ETA ))

mkdir -p OUTPUT_FILES

# back up the files used for this simulation
cp -pf DATA/Par_file OUTPUT_FILES/
cp -pf DATA/STATIONS OUTPUT_FILES/
cp -pf DATA/CMTSOLUTION OUTPUT_FILES/

##
## mesh generation
##
sleep 2

echo
echo `date`
echo "starting MPI mesher on $numgpus processors"
echo

mpirun -np $numgpus bin/xmeshfem3D

echo " mesher done: `date`"
echo

# back up the important files addressing.txt and list*.txt
cp OUTPUT_FILES/*.txt $BASEMPIDIR/

##
## forward simulation
##

# set up addressing
#cp $BASEMPIDIR/addr*.txt OUTPUT_FILES/
#cp $BASEMPIDIR/list*.txt OUTPUT_FILES/

sleep 2

echo
echo `date`
echo starting run in current directory $PWD
echo

mpirun -np $numgpus bin/xspecfem3D
/bin/rm -rf DATABASES_MPI

echo "finished successfully"
echo `date`

# Print time after running
echo Time is `date`
echo Walltime : $(expr `date +"%s"` - $begin)          # Seconds
echo CPUtime : $(squeue -j $SLURM_JOBID -o "%M" -h)    # HH:MM:SS
</code>
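For GPU runs, the same product NCHUNKS x NPROC_XI x NPROC_ETA also fixes the number of nodes to request; a small sketch, with hypothetical parameter values chosen to match the 3-node, 8-GPU example above:

```shell
#!/bin/bash
# Hypothetical values taken from DATA/Par_file
NCHUNKS=6
NPROC_XI=2
NPROC_ETA=2
gpus_per_node=8                                  # matches --gres=gpu:8

numgpus=$(( NCHUNKS * NPROC_XI * NPROC_ETA ))    # one GPU per MPI task
nodes=$(( (numgpus + gpus_per_node - 1) / gpus_per_node ))   # ceiling division
echo "request $nodes nodes for $numgpus MPI tasks"
```

With these values, 24 MPI tasks need 3 nodes, consistent with ''#SBATCH -N 3-3'' and ''--tasks-per-node 8''.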
+ | |||
+ | |||
+ | |||
+ | ==== Running simulations in parallel ==== | ||
+ | |||
+ | Some instructions to use custom scripts enabling parallel SEM simulations on the HPC cluster | ||
+ | |||
=== Preparing the input files ===

First, create an event list "Events.txt" with 3 columns:
  * 1st column: event_id (will also be the name of the run directory)
  * 2nd column: path to the ''CMTSOLUTION'' file for this event
  * 3rd column: path to the ''STATIONS'' file for this event (can be the same for all events)
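For illustration, a minimal ''Events.txt'' could look like this (hypothetical event IDs and paths, assuming whitespace-separated columns), together with the kind of loop a driver script might use to read it:

```shell
#!/bin/bash
# A hypothetical Events.txt with the three-column format described above
cat > Events.txt <<'EOF'
ev001 /path/to/CMTSOLUTION_ev001 /path/to/STATIONS
ev002 /path/to/CMTSOLUTION_ev002 /path/to/STATIONS
EOF

# Read the three columns, one event per line
while read -r event_id cmtfile stafile; do
  echo "event=$event_id cmt=$cmtfile stations=$stafile"
done < Events.txt
```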
+ | |||
+ | Then you must setup a ''Par_file'' (be careful to use a version of ''Par_file'' that is compatible with your SEM version) | ||
+ | |||
+ | Finally, you must setup hostfiles named ''nodelistN'' files where N=0,...,Np-1 (Np, the number of parallel SEM simulations). These files must specify host names and number of slots per node. Here is an example: | ||
+ | <code> | ||
+ | $ cat nodelist0 | ||
+ | hpc-n443 slots=8 | ||
+ | hpc-n444 slots=8 | ||
+ | hpc-n445 slots=8 | ||
+ | </code> | ||
+ | (see ''/b/home/eost/zac/jobs/specfem/parallelSEM/nodelist0'') | ||
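Rather than writing these hostfiles by hand, they can be generated from a list of reserved nodes; a minimal sketch, assuming hypothetical host names, 8 slots per node, and an even split of the nodes between the runs:

```shell
#!/bin/bash
# Generate one nodelist file per parallel simulation (hypothetical host names)
hosts=(hpc-n443 hpc-n444 hpc-n445 hpc-n446 hpc-n447 hpc-n448)
Np=2                                   # number of parallel SEM simulations
nodes_per_run=$(( ${#hosts[@]} / Np )) # assumes an even split

for (( n=0; n<Np; n++ )); do
  : > nodelist$n                       # truncate/create the hostfile
  for (( i=0; i<nodes_per_run; i++ )); do
    echo "${hosts[$(( n*nodes_per_run + i ))]} slots=8" >> nodelist$n
  done
done
```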
+ | |||
+ | === Running the simulations in parallel === | ||
+ | |||
+ | Parallel SEM simulations are handled using 3 scripts: | ||
+ | * ''parallelSEM.sh'': is the main script, that compiles the code and run the simulations | ||
+ | * ''run_gpu_nodelist.sh'' is the script used to run the mesher and solver | ||
+ | * ''sleep.slurm'' is a script to reserve the GPU nodes | ||
+ | All these scripts are available in ''/b/home/eost/zac/jobs/specfem/parallelSEM'' | ||
+ | |||
+ | Before running your job, make sure that the input parameters in ''parallelSEM.sh'' are consistent with the input parameters stated above (see ''INPUT PARAMS'' in the main script). Specifically: | ||
+ | * ''SPECFEMDIR'': path to SPECFEM3D_GLOBE directory | ||
+ | * ''Par_file'': path to the Par_file used in simulations | ||
+ | * ''Nparallel'': Number of SEM simulations in parallel (make sure enough GPUs are available) | ||
+ | * ''event_list'': List of events with the format given above | ||
+ | |||

Then run your simulations:
<code>
./parallelSEM.sh
</code>
The script will make sure that the GPU nodes are available before launching SPECFEM3D_GLOBE.