Difference between revisions of "NAS"
|  (document controlling MPI process placement) | Conrad54418 (talk | contribs)   (→Running TotalView on NAS) | ||
| (14 intermediate revisions by 2 users not shown) | |||
Latest revision as of 12:39, 2 May 2024
Wiki for information related to the NASA Advanced Supercomputing (NAS) facility.
Overview
| Key | Value | Notes |
|---|---|---|
| Machines | Pleiades | Compute |
| | Lou | Storage and Analysis |
| | Electra | Compute |
| | Endeavour | Compute |
| | Merope | Compute |
| | Aitken | Compute |
| Job Submission System | PBS | |
| Facility Documentation | Support Knowledgebase | |
How-To's
How-To's in Separate Wiki's
Backup Data from Scratch Directories
This is done simply by copying data from the /nobackup/$USER directories to your home directory on Lou (lfe). The /nobackup/$USER directories are mounted onto lfe, so transfers should be done on lfe.
It is recommended to mirror the directory structure of your /nobackup/$USER directory on lfe so that the data can easily be restored to its original state. This is especially important if you use symlinks (as they are path dependent and will break if either the source file or the symlink itself is not in the correct location).
This can be done with scp, but it is recommended to use NASA's in-house utility shiftc. shiftc automatically performs parallel file transfers and data integrity checks and repairs, and offers syncing features similar to rsync.
Commands:
jrwrigh7@lfe7: shiftc -r -d --sync /nobackup/jrwrigh7/models/STGFlatPlate/STFM_Tet_dz4-10_dx15 .
This will copy the directory STFM_Tet_dz4-10_dx15 to the current location (.). The flags do the following:
- -r: Recursively copy files from the source
- -d: Create required directories that don't already exist. Equivalent to the -p flag for mkdir
- --sync: Only copy over "new" files, where "new" means any change to the modification time or file size.
  - If a file exists on destination (.), but not source (STFM_Tet_dz4-10_dx15), it will not be copied back to source nor will it be deleted to match the state of source.
Once this command is submitted, the transfer process will be backgrounded. Progress can be viewed by running shiftc --monitor. Additionally, you will receive an email when the transfer job is completed.
jrwrigh7@lfe7: shiftc --stop --id [shiftc job ID]
This will stop the given shiftc job. The [shiftc job ID] is the same number that appears beside the output of shiftc --monitor. 
More documentation for shiftc can be found in its man page (man shiftc) and on NAS's documentation website.
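As a rough sketch of the mirroring recommendation above, the helper below works out which directory under your Lou home corresponds to the parent of a given /nobackup path, so that the copied directory ends up at the same relative location. The function name mirror_dest and the example home directory /u/jrwrigh7 are assumptions made for illustration; the snippet only prints the shiftc command rather than running it.

```shell
# mirror_dest SRC HOME_DIR
# Print the directory under HOME_DIR that mirrors the parent of SRC,
# where SRC lives somewhere below /nobackup/<user>/.
# (Hypothetical helper; it does not call shiftc itself.)
mirror_dest() {
    src=$1
    home_dir=$2
    # The first three '/'-separated fields are "", "nobackup", "<user>".
    user_root=$(printf '%s\n' "$src" | cut -d/ -f1-3)
    # Path of SRC relative to /nobackup/<user>.
    rel=${src#"$user_root"/}
    printf '%s/%s\n' "$home_dir" "$(dirname "$rel")"
}

# Dry run: show the command that would be issued on lfe.
src=/nobackup/jrwrigh7/models/STGFlatPlate/STFM_Tet_dz4-10_dx15
dest=$(mirror_dest "$src" /u/jrwrigh7)
echo "shiftc -r -d --sync $src $dest"
```

Because -d creates missing directories, the destination does not need to exist beforehand. Remove the echo only after checking that the printed paths are what you expect.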
Control MPI Rank Placement
Rank 1 Solo Node
To make the rank 1 MPI process take a node on its own, put this in the PBS directives:
#PBS -l select=1:mpiprocs=1:model=sky_ele+1:mpiprocs=40:model=sky_ele
This will request 2 nodes: One will have the rank 1 process all by itself, and the other will have 40 MPI Processes (for all 40 CPU cores available on sky_ele nodes). 
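When the select statement mixes chunks like this, the total rank count passed to mpiexec has to match the sum over all chunks (41 in the example above). The sketch below adds it up from the select string. The function name total_ranks is hypothetical, and treating a chunk without mpiprocs as one rank is an assumption.

```shell
# total_ranks SELECT_STRING
# Sum (chunk count x mpiprocs) over every '+'-separated chunk.
# Assumption: a chunk that does not set mpiprocs contributes 1 rank per chunk.
total_ranks() {
    printf '%s\n' "$1" | tr '+' '\n' | awk -F: '
        {
            count = $1          # leading number of each chunk
            procs = 1
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^mpiprocs=/) {
                    split($i, kv, "=")
                    procs = kv[2]
                }
            }
            total += count * procs
        }
        END { print total + 0 }'
}

total_ranks "1:mpiprocs=1:model=sky_ele+1:mpiprocs=40:model=sky_ele"   # prints 41
```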
Distribute Non-First Rank MPI Processes
For controlling the placement of non-first rank MPI processes, use the mbind.x utility.
For example, if we have requested 4 nodes and want 10 MPI processes per node, the mpiexec command needs to be modified to the following:
mpiexec -np 40 /u/scicon/tools/bin/mbind.x -n10 [executable]
Note that mbind.x is also socket aware, so it will distribute processes evenly between nodes and between the CPUs in each node (NAS nodes have 2 CPUs per node).
For more information on mbind.x, see its help flag (mbind.x -help) or NAS's documentation website (https://www.nas.nasa.gov/hecc/support/kb/using-the-mbind-tool-for-pinning_288.html).
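To keep the -np and -n values consistent with each other, the command line can be generated from the node count and the desired processes per node. This is only an illustrative sketch: mbind_cmd and ./my_solver are made-up names, and the function prints the command instead of executing it.

```shell
# mbind_cmd NODES PER_NODE EXECUTABLE
# Print an mpiexec line whose total rank count is NODES * PER_NODE.
# (Hypothetical helper; it prints the command and does not run it.)
mbind_cmd() {
    nodes=$1
    per_node=$2
    exe=$3
    printf 'mpiexec -np %s /u/scicon/tools/bin/mbind.x -n%s %s\n' \
        "$((nodes * per_node))" "$per_node" "$exe"
}

mbind_cmd 4 10 ./my_solver
```

For 4 nodes with 10 processes each this reproduces the -np 40 and -n10 values used in the example above.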
Common commands
- node_stats.sh: Displays how many nodes are available or actively running jobs
- tracejobssh: Helps to answer "Why isn't my job running?". Part of the PHASTA utilities git repo (https://github.com/PHASTA/utilities).
See Priority "Score" in Queue
To see your priority "score" in PBS, use qstat -W o=+pri, which adds the "Priority" column to the output of qstat.
Priority Scoring (as of 2021-01-22)
- Job priority score grows by 1 every 12 hours
- We are capped at a max score of 20 per job
  - Note that other users/groups using NAS may start with higher priority and grow higher than 20
  - The result is that it's quite difficult to get large jobs running
- If you don't have any jobs running, you get an additional +10 to the score
  - This score bump is removed as soon as you have a running job
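The rules above can be turned into a rough estimate of a job's score. The sketch below makes two assumptions that the list does not state: the score starts at 0 when the job is queued, and the cap of 20 applies before the +10 bonus is added. The function name priority_estimate is hypothetical, and the real value should always be checked with qstat.

```shell
# priority_estimate HOURS_QUEUED HAS_RUNNING_JOB
# HOURS_QUEUED:    whole hours the job has waited in the queue
# HAS_RUNNING_JOB: "yes" or "no"
# Assumptions: score starts at 0, and the cap of 20 is applied before the bonus.
priority_estimate() {
    hours_queued=$1
    has_running_job=$2
    score=$((hours_queued / 12))        # +1 for every full 12 hours
    if [ "$score" -gt 20 ]; then
        score=20                        # per-job cap
    fi
    if [ "$has_running_job" = "no" ]; then
        score=$((score + 10))           # bonus while nothing of yours is running
    fi
    echo "$score"
}

priority_estimate 36 no    # prints 13
```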
 
Compiling
- You will generally want to use module load mpi-hpe/mpt comp-intel for compiling
- Sometimes, mpi{cc,cxx,f90} will not pick the Intel compilers by default. You can check this by running mpi{cc,cxx,f90} --version to verify the compiler it links to.
  - To fix this, you can set export MPICC_CC=icc MPICXX_CXX=icpc MPIF90_F90=ifort to force it to use the Intel compilers
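The version check can be scripted so that all three wrappers are tested at once. This is a minimal sketch: is_intel_banner is a hypothetical helper, and the patterns it looks for are assumptions about what the Intel compilers print for --version (for example a first line beginning with "icc (ICC)" or a copyright line mentioning Intel).

```shell
# is_intel_banner TEXT
# Succeeds if TEXT looks like the --version banner of an Intel compiler.
# The patterns are assumptions about the banner format.
is_intel_banner() {
    printf '%s\n' "$1" | grep -Eqi 'intel|\(icc\)|\(ifort\)'
}

for wrapper in mpicc mpicxx mpif90; do
    if ! command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper: not on PATH (load the MPI module first)"
    elif is_intel_banner "$("$wrapper" --version 2>&1)"; then
        echo "$wrapper: uses an Intel compiler"
    else
        echo "$wrapper: not Intel; consider setting MPICC_CC, MPICXX_CXX and MPIF90_F90"
    fi
done
```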
Running TotalView on NAS
Using TotalView on NAS starts with adding a configuration file which will enable port forwarding on the viz nodes. On the viz nodes, run these commands:
cd ~/.ssh/
cp ~kjansen/sshconfig config
Now, log in to NAS sfe with X forwarding:
ssh -X [nas username here]@sfe6.nas.nasa.gov
You'll need to enter your password and NAS secure passcode. Next, connect to NAS pfe, also with X forwarding:
ssh -X pfe
You'll need to again enter a NAS secure passcode. The next step is optional. If you want to use the older TotalView interface that looks like the one on the viz nodes, run this command:
echo false > ~/.totalview/.tvnewui
Now start an interactive job on NAS pfe with the following command:
qsub -X -I -q devel -lselect=1:ncpus=8:model=sky_ele,walltime=2:00:00
The number of CPUs (8) should match the number of processes for the case you'll be debugging. Now, in the interactive session, load the environment that matches the build of your code. In my case it was:
module load mpi-hpe/mpt
module load comp-intel/2020.4.304
Note that the build of PHASTA on NAS that you want to debug must be built as the debug version. This will allow you to place breakpoints in the code using TotalView. PHASTA can be built as the debug version by setting the "-DCMAKE_BUILD_TYPE=Debug \" flag in the cmake file. Now, navigate to the directory in nobackup where you have your case set up to run. Define the path to the PHASTA build by executing this command:
export PHASTA_CONFIG=[path to PHASTA build]
Now load the TotalView module:
module load totalview/2023.4.16
You may need to manually remove the "doubleRun-check" folder from the "procs_case" folder before running PHASTA. Now, to run PHASTA and debug with TotalView, execute this command:
totalview mpiexec_mpt.real -a -np 8 $PHASTA_CONFIG/bin/phastaC.exe
A TotalView feature that is useful while running on NAS is "rescan libraries", which can be found in the "File" drop-down menu. Selecting this option after recompiling the PHASTA code lets you restart the job in TotalView without having to close and reopen the application, which saves time.
With that, you should be all set! For more information about using TotalView on NAS you can visit this website: https://www.nas.nasa.gov/hecc/support/kb/totalview_95.html
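Since the ncpus value in the qsub request and the -np value in the totalview command must agree, one way to avoid a mismatch is to generate both from a single number. The sketch below only prints the two commands from this walkthrough with the process count substituted; debug_commands is a hypothetical name, and $PHASTA_CONFIG is deliberately left unexpanded so that it is resolved later in the interactive session.

```shell
# debug_commands NPROCS
# Print the qsub and totalview commands from this section with a
# matching process count. (Hypothetical helper; nothing is executed.)
debug_commands() {
    np=$1
    printf 'qsub -X -I -q devel -lselect=1:ncpus=%s:model=sky_ele,walltime=2:00:00\n' "$np"
    # Single quotes keep $PHASTA_CONFIG literal in the printed command.
    printf 'totalview mpiexec_mpt.real -a -np %s $PHASTA_CONFIG/bin/phastaC.exe\n' "$np"
}

debug_commands 8
```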
