Load modules and declare environment variables as follows:
module purge
module load batch/slurm
module load compilers/cuda-7.5
export CUDA_INC=/usr/local/cuda/cuda-7.5/include
export CUDA_LIB=/usr/local/cuda/cuda-7.5/lib64
export PATH=/rpriv/ipgs/zac/openmpi-1.10.7/bin:$PATH
export LD_LIBRARY_PATH=/rpriv/ipgs/zac/openmpi-1.10.7/lib:$LD_LIBRARY_PATH
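A quick sanity check of the environment (not part of the original instructions, just a suggestion) can be done with:

module list                 # should list batch/slurm and compilers/cuda-7.5
which mpirun                # should point to /rpriv/ipgs/zac/openmpi-1.10.7/bin/mpirun
nvcc --version              # CUDA compiler provided by the cuda-7.5 module
echo $CUDA_INC $CUDA_LIB    # both variables should be defined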
Note that we use the default GNU compiler of the operating system:
$ gfortran --version
GNU Fortran (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
Before compilation, make sure that the required modules are loaded and that the CUDA_LIB and CUDA_INC environment variables are declared (see the previous section). Create a run directory containing the subdirectories DATABASES_MPI, OUTPUT_FILES, bin and DATA. In the DATA directory, create the CMTSOLUTION, Par_file and STATIONS files (cf. the SPECFEM3D_GLOBE documentation).
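A minimal sketch of these steps (the run directory path below is only an example):

rundir=/path/to/my_run   # example run directory
mkdir -p $rundir/DATABASES_MPI $rundir/OUTPUT_FILES $rundir/bin $rundir/DATA
# then place CMTSOLUTION, Par_file and STATIONS in $rundir/DATA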
Change directory to the main specfem3D_globe source directory and copy the Par_file of the run directory into its DATA directory. You can then run configure:
./configure -with-cuda=cuda5
(the cuda5 option will enable more recent GPU compute capabilities).
Then compile the code:
make clean
make all
Copy the binaries to the run directory and back up the setup files:
cp bin/xmeshfem3D $rundir/bin
cp bin/xspecfem3D $rundir/bin
cp setup/* $rundir/OUTPUT_FILES/
cp DATA/Par_file $rundir/OUTPUT_FILES/
Link the model directories. For example, with s362ani:
ln -s DATA/crust2.0 $rundir/DATA/crust2.0
ln -s DATA/s362ani $rundir/DATA/s362ani
ln -s DATA/QRFSI12 $rundir/DATA/QRFSI12
ln -s DATA/topo_bathy $rundir/DATA/topo_bathy
Below is a script that performs all of the configuration and compilation steps:
#!/bin/bash

# Load modules
module purge
module load batch/slurm
module load compilers/cuda-7.5
module load mpi/openmpi-basic
export CUDA_LIB=/usr/local/cuda/cuda-7.5/lib64
export CUDA_INC=/usr/local/cuda/cuda-7.5/include

# source directory
rootdir=/b/home/ipgs/cmorales/specfem3d_globe

# setting up run directory
currentdir=`pwd`

mkdir -p DATABASES_MPI
mkdir -p OUTPUT_FILES

rm -rf DATABASES_MPI/*
rm -rf OUTPUT_FILES/*

# configure and compile in the source directory
cd $rootdir

# configure
./configure -with-cuda=cuda5

# compiles for a forward simulation
cp $currentdir/DATA/Par_file DATA/Par_file
make clean
make all

# backup of constants setup
cp setup/* $currentdir/OUTPUT_FILES/
cp DATA/Par_file $currentdir/OUTPUT_FILES/

# copy executables/model into the current directory
cd $currentdir

# copy executables
mkdir -p bin
cp $rootdir/bin/xmeshfem3D ./bin/
cp $rootdir/bin/xspecfem3D ./bin/

# link the necessary data directories
# The example below is for s362ani... this part should be changed if another model is used
cd DATA/
ln -s $rootdir/DATA/crust2.0
ln -s $rootdir/DATA/s362ani
ln -s $rootdir/DATA/QRFSI12
ln -s $rootdir/DATA/topo_bathy
cd ../
Example of a SLURM script for a CPU run (the number of CPU cores must be adapted to NCHUNKS, NPROC_XI and NPROC_ETA: for instance, NCHUNKS = 6 with NPROC_XI = NPROC_ETA = 4 requires 6 * 4 * 4 = 96 cores, which matches the -n 96 option below):
#!/bin/bash
#SBATCH -p grant -A g2016a68   # Partition / Account
#SBATCH -n 96                  # Number of CPU cores
#SBATCH --job-name=SPECFEM
#SBATCH -t 23:00:00            # Wall time

# Load modules
module purge
module load batch/slurm
module load mpi/openmpi-basic
#
echo Master on host `hostname`
echo Time is `date`

# Start time
begin=`date +"%s"`

# Run
BASEMPIDIR=`grep LOCAL_PATH DATA/Par_file | cut -d = -f 2 `
NPROC_XI=`grep NPROC_XI DATA/Par_file | cut -d = -f 2 `
NPROC_ETA=`grep NPROC_ETA DATA/Par_file | cut -d = -f 2`
NCHUNKS=`grep NCHUNKS DATA/Par_file | cut -d = -f 2 `
numcpus=$(( $NCHUNKS * $NPROC_XI * $NPROC_ETA ))   # numcpus should be consistent with the -n option

mkdir -p OUTPUT_FILES

# backup files used for this simulation
cp DATA/Par_file OUTPUT_FILES/
cp DATA/STATIONS OUTPUT_FILES/
cp DATA/CMTSOLUTION OUTPUT_FILES/

##
## mesh generation
##
sleep 2
echo
echo `date`
echo "starting MPI mesher on $numcpus processors"
echo
mpirun -np $numcpus bin/xmeshfem3D
echo " mesher done: `date`"
echo

# backup important files addressing.txt and list*.txt
cp OUTPUT_FILES/*.txt $BASEMPIDIR/

##
## forward simulation
##
# set up addressing
#cp $BASEMPIDIR/addr*.txt OUTPUT_FILES/
#cp $BASEMPIDIR/list*.txt OUTPUT_FILES/

sleep 2
echo
echo `date`
echo starting run in current directory $PWD
echo
mpirun -np $numcpus bin/xspecfem3D

/bin/rm -rf DATABASES_MPI

echo "finished successfully"
echo `date`

# Print time after running
echo Time is `date`
echo Walltime : $(expr `date +"%s"` - $begin)         # Seconds
echo CPUtime : $(squeue -j $SLURM_JOBID -o "%M" -h)   # HH:MM:SS
Example of a SLURM script for a GPU run (the number of nodes must be adapted to the number of GPUs per node and to NPROC_XI and NPROC_ETA: for instance, NCHUNKS = 6 with NPROC_XI = NPROC_ETA = 2 requires 24 MPI tasks, i.e. 3 nodes with 8 GPUs each as requested below):
#!/bin/bash
#SBATCH -p pri2015gpu -A eost
#SBATCH -N 3-3               # Will use 3 nodes
#SBATCH --tasks-per-node 8   # 8 tasks per node
#SBATCH --gres=gpu:8         # We only want nodes with 8 GPUs
#SBATCH --job-name=SPECFEM
#SBATCH -t 12:00:00          # Wall time
#SBATCH --cpu_bind=verbose

# Load modules
module purge
module load batch/slurm
module load compilers/cuda-7.5
module load mpi/openmpi-basic

# ID of each GPU (should be adapted if using a different number of GPUs)
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

echo Master on host `hostname`
echo Time is `date`

# Start time
begin=`date +"%s"`

# Run
BASEMPIDIR=`grep LOCAL_PATH DATA/Par_file | cut -d = -f 2 `
NPROC_XI=`grep NPROC_XI DATA/Par_file | cut -d = -f 2 `
NPROC_ETA=`grep NPROC_ETA DATA/Par_file | cut -d = -f 2`
NCHUNKS=`grep NCHUNKS DATA/Par_file | cut -d = -f 2 `
numgpus=$(( $NCHUNKS * $NPROC_XI * $NPROC_ETA ))

mkdir -p OUTPUT_FILES

# backup files used for this simulation
cp -pf DATA/Par_file OUTPUT_FILES/
cp -pf DATA/STATIONS OUTPUT_FILES/
cp -pf DATA/CMTSOLUTION OUTPUT_FILES/

##
## mesh generation
##
sleep 2
echo
echo `date`
echo "starting MPI mesher on $numgpus processors"
echo
mpirun -np $numgpus bin/xmeshfem3D
echo " mesher done: `date`"
echo

# backup important files addressing.txt and list*.txt
cp OUTPUT_FILES/*.txt $BASEMPIDIR/

##
## forward simulation
##
# set up addressing
#cp $BASEMPIDIR/addr*.txt OUTPUT_FILES/
#cp $BASEMPIDIR/list*.txt OUTPUT_FILES/

sleep 2
echo
echo `date`
echo starting run in current directory $PWD
echo
mpirun -np $numgpus bin/xspecfem3D

/bin/rm -rf DATABASES_MPI

echo "finished successfully"
echo `date`

# Print time after running
echo Time is `date`
echo Walltime : $(expr `date +"%s"` - $begin)         # Seconds
echo CPUtime : $(squeue -j $SLURM_JOBID -o "%M" -h)   # HH:MM:SS
To launch a batch of SEM runs:
First, create an event list “Events.txt” with 3 columns (an example is given after this list):
- 1st column: event_id (will also be the name of the run directory)
- 2nd column: path to the CMTSOLUTION file for this event
- 3rd column: path to the STATIONS file for this event (can be the same for all events)
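As an illustration only, an Events.txt file could look like the following (the event IDs and paths are hypothetical):

201501010000A   /home/user/events/201501010000A/CMTSOLUTION   /home/user/STATIONS
201502020000B   /home/user/events/201502020000B/CMTSOLUTION   /home/user/STATIONS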
Then set up a Par_file as described above (be careful to use a Par_file version that is compatible with your SEM version).
The SPECFEM runs will be handled using 3 scripts:
- parallelSEM.sh: the main script
- run_gpu_nodelist.sh: the script used to run the mesher and solver
- sleep.slurm: a script to reserve the GPU nodes
Before running it, make sure the input parameters in parallelSEM.sh are consistent with the input parameters stated above (see INPUT PARAMS in the main script). Then run:
./parallelSEM.sh
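The parallelSEM.sh script itself is site-specific and is not reproduced here. Purely as an illustration of the general idea, a driver reading Events.txt could look like the sketch below; the template_run directory, the run_one_event.slurm job script name and the overall structure are assumptions, not the actual script:

#!/bin/bash
# Illustrative sketch only -- NOT the actual parallelSEM.sh.
# Assumes Events.txt lines of the form: event_id  CMTSOLUTION_path  STATIONS_path
template_run=/path/to/template_run   # hypothetical run directory prepared as described above

while read -r event_id cmt_path stations_path; do
    [ -z "$event_id" ] && continue          # skip empty lines
    mkdir -p "$event_id"
    cp -a "$template_run"/. "$event_id"/    # copy bin/, DATA/ and the model links
    cp "$cmt_path" "$event_id"/DATA/CMTSOLUTION
    cp "$stations_path" "$event_id"/DATA/STATIONS
    # in the real setup, run_gpu_nodelist.sh and sleep.slurm handle the GPU nodes;
    # here we simply submit one hypothetical job script per event
    ( cd "$event_id" && sbatch ../run_one_event.slurm )
done < Events.txt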