
Cloudy on Thor User Guide
MPI: Message Passing Interface
Thor is an MPI-enabled computational cluster owned and administered by the Department of Computer Science in the College of Engineering and Applied Sciences at WMU. It is available to faculty and students for research and development purposes. It provides roughly 16 research nodes, 2 student nodes, and a few nodes dedicated to specific projects; more specifics can be found on the lab's home page. Access to Thor is over SSH. An account must first be established by a system administrator; most likely, one's account name will be one's WMU network login.
HPCS Homepage: https://cs.wmich.edu/~hpcs/
Thor's hostname: thor.cs.wmich.edu
SysAdmin: position currently unfilled or unknown
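For example, to connect from a terminal (jdoe is a hypothetical login used only for illustration):
ssh jdoe@thor.cs.wmich.edu   (replace jdoe with your WMU network login)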
To use some software on Thor (notably, the compilers), one must be aware that available libraries and tools are provided through a modules system. One must load the module that contains the program one wants. The pertinent commands are:
module avail (show available modules)
module list (show currently loaded modules)
module load <module> (load a module)
For use in this guide, one needs to load the gcc module and an MPI library with the following commands. If these exact versions are no longer available, use module avail to find compatible ones.
module load gcc/6.3.0
module load openmpi/gnu/1.7.3
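To confirm the toolchain is ready, one can check that the modules are loaded and that the MPI compiler wrappers resolve (the wrapper names assume the OpenMPI module above):
module list          # gcc and openmpi should both appear
which mpicc mpicxx   # compiler wrappers provided by the OpenMPI module
mpicxx --version     # should report the gcc/g++ version just loaded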
To utilize MPI functionality, Cloudy must be compiled with MPI support. Cloudy attempts to make this simple: it ships a set of makefiles targeting different compilers and platforms, available under the subdirectories of the c13.03/source directory. In this case, sys_mpi_gcc should be used. Provided the gcc and MPI modules are loaded in a Thor environment, the following sequence should compile Cloudy with MPI support.
(From the Cloudy root directory, which was c13.03/ at the time of this writing.)
cd source/sys_mpi_gcc
make
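To check the build, a quick serial smoke test can be run from the same directory. This sketch assumes the makefile produces an executable named cloudy.exe, as the c13.03 sys_* makefiles did; adjust the name if the directory shows something different.
ls -l cloudy.exe                 # the freshly built executable
printf 'test\n' > smoke.in       # Cloudy's minimal built-in test model
./cloudy.exe < smoke.in > smoke.out
tail -n 3 smoke.out              # look for a line reporting that Cloudy exited OK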
The executable file should now be available. MPI programs are not intended to be run directly; rather, they are submitted to the scheduler for processing. Thor uses the Torque scheduler. Some pertinent commands are:
qstat (show currently running jobs)
qnodes (show information about available nodes -- quite verbose)
qsub <script name> (submit a script to the scheduler)
A trick to display only one user's running jobs is to pipe qstat to a grep of the username:
qstat|grep gjf5176
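Alternatively, if the installed qstat supports the -u option (standard in Torque, but worth confirming on Thor), the same filtering can be done directly:
qstat -u gjf5176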
More information about Torque and the available nodes is available on the HPCS website.
To properly submit jobs to Torque, one uses a Torque script. An example script is provided with this guide. The options appearing after #PBS on header lines are communicated to the scheduler; this is how specific resources are requested. The scheduler will wait to start the job if the requested resources are not available; if the system does not have enough resources free, the job can sit in the queue until running jobs reach the end of their wall time and release them. This script is tuned for the running conditions on Thor circa Summer 2015. It requests 64 processors, which translates to 64 concurrent Cloudy simulations under c13.03's parallel-processing paradigm.
#PBS -l mem=128GB (amount of RAM per node)
#PBS -l nodes=4:ppn=16:research (number of nodes, processors per node, node type)
#PBS -l walltime=150:00:00 (maximum time for job to complete)
#PBS -N .25_181515_cldn23 (name of job)
#PBS -j oe (join stderr into the stdout file so the job writes a single output file; this is the default in the scripts provided by the lab)
Two variables are provided for easy reuse of the script. Take note when editing RUN_FILE: Cloudy's MPI option takes the filename without the ".in" suffix, whereas the full filename ending in ".in" must be provided when running Cloudy in serial mode. In this case, the file Cloudy will look for is "mpi_grid.in". RUN_DIR is assumed to exist under one's home directory. The path ~/cloudy/cloudy_mpi will also need to be replaced with the desired path of the Cloudy MPI executable file.
RUN_FILE="mpi_grid"
RUN_DIR="cloudy/runs/4thdex_181515models_23cldn"
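For reference, here is a minimal sketch of what such a script could look like, assembled from the directives and variables shown above. The mpirun line, the -r option, and the job name are illustrative assumptions; the cloudy_grid.pbs distributed with this guide and Cloudy's own documentation remain the authoritative references.
#!/bin/bash
#PBS -l mem=128GB
#PBS -l nodes=4:ppn=16:research
#PBS -l walltime=150:00:00
#PBS -N example_grid
#PBS -j oe

RUN_FILE="mpi_grid"                               # Cloudy will read mpi_grid.in
RUN_DIR="cloudy/runs/4thdex_181515models_23cldn"  # assumed to exist under $HOME

cd "$HOME/$RUN_DIR"
# Depending on how OpenMPI was built, mpirun may detect the Torque node list
# automatically; otherwise add -machinefile $PBS_NODEFILE.
mpirun -np 64 ~/cloudy/cloudy_mpi -r "$RUN_FILE"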
To execute a Cloudy MPI run, place a copy of the Torque script along with a Cloudy input script in a directory. Update the variables in the Torque script so they point to the Cloudy input script. Then execute:
qsub cloudy_grid.pbs
If all goes well, one should see the new job listed (queued or running) after executing
qstat
The directory itself should fill up with input and output files for each point in the grid. At the end of the run, Cloudy collects these into a single output file and cleans up the intermediate files.
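A quick way to confirm a run finished cleanly (the output name follows from RUN_FILE above; Cloudy normally closes its output with a "Cloudy exited OK" line, but verify against the version in use):
tail mpi_grid.out                     # end of the collected grid output
grep "Cloudy exited OK" mpi_grid.out  # look for the normal closing line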
Voila!
Some useful commands:
qnodes | grep -B2 research | less (page through information about the research nodes)
qstat