Differences Between Argo and Hopper
Argo and Hopper share many similarities, but there are also important differences that users must be aware of.
Similarities between Argo and Hopper
Shared Storage
Both the Argo and Hopper clusters share the same filesystems, which allows users to work seamlessly across the two clusters without having to transfer files between them.
- `/home` - is mounted on both and subject to the same quota limits
- `/scratch` - is mounted on both and subject to the same purging policies
- `/projects` - is mounted on both
SLURM for Job Scheduling
Both the Argo and Hopper clusters use SLURM for job scheduling.
Modules for Software Provisioning
Both clusters use modules to give users dynamic access to the software installed on them. The Argo cluster uses Environment Modules, whereas the Hopper cluster uses Lmod.
Containers
We will provide Singularity containers of many applications as an alternative or complement to applications you might have been running natively.
You will also be able to use Podman to build Docker containers or convert them to Singularity containers to run on Hopper.
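As an illustration, a typical build-and-convert workflow might look like the sketch below; the `myapp` image name and tag are placeholders, and the exact steps may differ depending on the Podman and Singularity versions installed on the cluster:

```shell
# Hypothetical sketch: build an image with Podman, then convert it to a
# Singularity image (SIF) to run on Hopper. "myapp" is a placeholder name.
podman build -t myapp:latest .
podman save -o myapp.tar localhost/myapp:latest

# Convert the saved Docker archive into a Singularity image and run it
singularity build myapp.sif docker-archive://myapp.tar
singularity run myapp.sif
```

This requires the container tools to be available on the node where you run it; check `module avail` for the relevant modules first.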
Differences between Argo and Hopper
Hardware
| | Argo | Hopper |
|---|---|---|
| CPUs | Intel E5-2650 v4 | Dell PowerEdge R640 |
| | Intel Skylake | Intel(R) Xeon(R) Gold 6240R |
| | | AMD MILAN |
| GPUs | NVIDIA K80 | NVIDIA A100-80GB |
| | NVIDIA V100 | NVIDIA A100-40GB |
Software
| | Argo | Hopper |
|---|---|---|
| OS | RHEL 7 | RHEL 8 |
Interactive Sessions
Argo
Most interactive/graphical computing on Argo was done using user-managed applications such as Jupyter Notebooks, MATLAB, and R with port forwarding and/or X11 forwarding.
Hopper
Open OnDemand (OOD) is installed on Hopper to let users run applications such as Jupyter Notebooks, MATLAB, Mathematica, and Python from a web interface. To access the OOD web server, point your browser to https://ondemand.orc.gmu.edu.
SLURM Settings
Partition Names
The partition names and defaults on the Hopper cluster are different from the Argo cluster. Please see the table below for partition equivalence.
| Argo Partition | Time limit | Allowed QoS | | Hopper Partition | Time limit | Allowed QoS |
|---|---|---|---|---|---|---|
| gpuq | 5-00:00:00 | All | | gpuq | 5-00:00:00 | gpu |
| all-LoPri | 5-00:00:00 | All | | normal | 7-00:00:00 | All |
| all-HiPri | 12:00:00 | All | | bigmem | 7-00:00:00 | All |
| bigmem-HiPri | 12:00:00 | All | | gpuq-contrib | 5-00:00:00 | hantil, ksun |
| bigmem-LoPri | 5-00:00:00 | All | | contrib* | 7-00:00:00 | qtong, normal |
| all-long | 10-00:00:00 | All | | interactive | 12:00:00 | interactive |
| bigmem-long | 10-00:00:00 | All | | debug | 1:00:00 | All |
| contrib | 7-00:00:00 | contrib | | | | |
The default partition on the Argo cluster is `all-HiPri`, with a 12-hour runtime limit, while on the Hopper cluster the default partition is `normal`, with a 7-day runtime limit. Please note the different runtime limits for the other partitions in the table above.
Note: Being a contributor allows users to submit to the contrib partition at full priority. On the Hopper cluster, non-contributors can also submit jobs to the contrib partition, but they need to be aware that their jobs can be preempted by a contributor's jobs.
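The time limits in the table use Slurm's `D-HH:MM:SS` notation (days, then hours, minutes, seconds). As a quick sanity check, this small helper (our own sketch, not a site-provided tool) converts such a limit to whole hours:

```shell
# Convert a Slurm time limit (D-HH:MM:SS or HH:MM:SS) to whole hours.
# Illustrative only; it ignores the minutes and seconds fields.
slurm_hours() {
  local t=$1 d=0
  case "$t" in
    *-*) d=${t%%-*}; t=${t#*-} ;;   # split off the day count, if any
  esac
  local h=${t%%:*}
  echo $(( 10#$d * 24 + 10#$h ))
}

slurm_hours 7-00:00:00   # -> 168 (the normal partition limit on Hopper)
slurm_hours 12:00:00     # -> 12  (the all-HiPri limit on Argo)
```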
The `interactive` and `debug` partitions on the Hopper cluster have no equivalents on the Argo cluster. You can use the `sinfo` command to get information about the partitions on Hopper.
- `debug` - short test jobs can be submitted to the `debug` partition for a quick turnaround.
- `normal` - this is the default partition.
- `interactive` - jobs submitted via Open OnDemand (OOD) with `--qos=interactive` run in this partition.
- `gpuq` - GPU jobs need to be submitted to the `gpuq` partition with `--qos=gpu`.
- `contrib` - contributors to our condo model can submit to the `contrib` partition with `--qos=<group_account_name>`. Non-contributors can submit jobs to the `contrib` partition with `--qos=normal`, but their jobs can be preempted by ones from contributors.
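Putting the partition and QoS settings together, a minimal Hopper batch script might look like the sketch below. The job name, resource amounts, and the `./my_application` executable are placeholders; adjust them to your own job:

```shell
#!/bin/bash
#SBATCH --job-name=gpu-test          # placeholder job name
#SBATCH --partition=gpuq             # GPU partition on Hopper
#SBATCH --qos=gpu                    # QoS required by the gpuq partition
#SBATCH --gres=gpu:1                 # request a single GPU
#SBATCH --time=0-01:00:00            # 1 hour, well under the 5-day gpuq limit

# Load the default toolchain before running (module names from this page)
module load gnu9/9.3.0 openmpi4/4.0.4

srun ./my_application                # placeholder executable
```

Submit it with `sbatch <script_name>` and check its state with `squeue`.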
QoS (Quality of Service)
Argo
```
$ sacctmgr list qos format=Name,Priority,MaxWall,MaxTres,MaxTRESPU
      Name   Priority     MaxWall       MaxTRES     MaxTRESPU
---------- ---------- ----------- ------------- -------------
    normal          0              cpu=1800      cpu=1024,gre+
    cdsqos          0              cpu=280       cpu=280,gres+
   testqos          0              cpu=500       cpu=350,gres+
   contrib        100              cpu=28        cpu=28,gres/+
```
Hopper
```
$ sacctmgr list qos format=Name,Priority,MaxWall,MaxTres,MaxTRESPU
      Name   Priority     MaxWall       MaxTRES     MaxTRESPU
---------- ---------- ----------- ------------- -------------
    normal          0
     qtong          1
interacti+          0
  orc-test          0              cpu=3000,gre+
       gpu          0              gres/gpu=20
  amd-test          0
  gpu-test          0              cpu=224,gres+ cpu=224,gres+
    hantil          1
      ksun          1
     class          0
```
Preemption
Preemption is enforced on the `contrib` partition on both Argo and Hopper.
Modules
Even though both clusters use modules to provision software, the Environment Modules used on Argo and the Lmod system used on Hopper have some key differences. For more information on modules, check our page on Environment modules.
Argo
Argo uses a *flat* module scheme that displays all modules available on the system at all times. When you search for a package via `module avail <package_name>`, you will see modules matching `<package_name>` regardless of their dependencies.
Hopper
Hopper uses a *hierarchical* module scheme that displays only the modules compatible with the compiler and MPI library you have loaded at a given time, to avoid incompatibilities. Therefore, searching for a package via `module avail <package_name>` will not necessarily show you all available versions of that package. A more comprehensive way of searching for packages is the `module spider <package_name>` command, which reports all available packages matching `<package_name>`.
Note: On Hopper, you cannot load more than one version of the same application. For example, you cannot load python/3.6.8 and python/3.8.6 at the same time; loading one will automatically unload the other. In the rare cases where it is absolutely necessary to load more than one compiler or MPI library at a time, you can set `export LMOD_EXPERT=1` to enable that feature.
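If you do need expert mode regularly, the setting is a single environment variable; a sketch of what you might add to your startup file (the module names are illustrative, taken from elsewhere on this page):

```shell
# Startup-file sketch: enable Lmod expert mode so conflicting modules
# (e.g. two compilers) can stay loaded side by side. Use with care:
# mixing toolchains is a common source of hard-to-debug build errors.
export LMOD_EXPERT=1

# With expert mode on, loading intel/2020.2 would no longer
# auto-unload gnu9/9.3.0 (and vice versa).
```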
When you list the available modules on Hopper, you will see applications grouped as follows:
- `Core` - essential system modules
- `Independent` - packages without a particular compiler or MPI library dependence. These are often applications packaged as pre-compiled static binaries
- `<COMPILER>` - packages built using the given `<COMPILER>`
- `<COMPILER>_<MPI-library>` - packages built using the given `<COMPILER>` and `<MPI-library>`
The default compiler and MPI library on Hopper are GNU/9.3.0 and OpenMPI/4.0.4. Therefore, you will see these groups of modules when executing `module avail`:
- Global Aliases
- GNU-9.3.0_OpenMPI-4.0.4
- GNU-9.3.0
- Independent
- Core
Base/Default Modules
Argo
No modules are loaded by default at login on Argo unless you explicitly load them via your startup shell script.
Hopper
Since modules available to you depend on the compiler and MPI library loaded in your environment, Hopper loads a set of default modules including a default compiler (gnu9/9.3.0) and MPI library (openmpi4/4.0.4).
Warning: If you load modules or set other environment variables in your startup scripts on Argo, you will likely get errors and warning messages when logging into Hopper, because those modules do not exist on Hopper under the same names.
To avoid these issues, you can wrap some of the logic in your startup scripts to behave differently based on the cluster. Such logic would look like this:
```shell
# load the proper set of modules based on the cluster
export CLUSTER=$(sacctmgr -n show cluster format=Cluster | xargs)
export CNODE=$(hostname -s)
if [ "${CLUSTER}" == "argo" ]; then
    module load <ARGO_MODULE_NAME...>
    source <ARGO_FILE...>
    export ARGO_ENVIRONMENT=...
    ...
elif [ "${CLUSTER}" == "hopper" ]; then
    module load <HOPPER_MODULE_NAME...>
    ...
fi
```
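If you prefer not to call `sacctmgr` from a startup script, a hypothetical alternative is to key off the node's hostname prefix. The `hop`/`argo` prefixes below are assumptions; verify them against the actual node names before relying on this:

```shell
# Hypothetical helper: derive the cluster name from a short hostname.
# The prefixes are assumptions; adjust them to your site's naming scheme.
detect_cluster() {
  case "$1" in
    hop*)  echo "hopper" ;;
    argo*) echo "argo" ;;
    *)     echo "unknown" ;;
  esac
}

# Example: decide based on the current node
detect_cluster "$(hostname -s)"
```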
Module naming
Argo
Modules are generally named `<package_name>/<package_version>` on Argo.
Hopper
Depending on the source of the package, Lmod modules on Hopper can have longer names and aliases.
- Spack-built packages are named `<package_name>/<package_version>-<two-character-hash>`
- Some packages have useful aliases such as `<package_name>/<package_version>-<two-character-hash> (mixed-precision)`
- Some important packages, such as compilers, MPI, and math libraries, have global aliases that appear at the top when you execute `module avail`
Searching for modules
Note: On Hopper, `module spider` searches the whole module tree to find matching modules, whereas `module avail` only searches modules built with your currently loaded compiler and MPI library.
How you search for modules on Hopper is very different from Argo. We will use the package `nwchem`, which has versions built with different compilers, to demonstrate this key difference.
Argo
You can easily see that there are two versions of `nwchem`.

```
$ module avail nwchem
------------------------ /cm/shared/modulefiles ----------
nwchem/intel/6.8.1 nwchem/intel/7.0.2
```
Hopper
You initially see only one version of `nwchem`, built with the GNU/9.3.0 compiler and OpenMPI/4.0.4 MPI library.

```
$ module avail nwchem
-------------------------------- Global Aliases --------------------------------
   compiler/gnu/10.3.0     -> gnu10/10.3.0-ya
   compiler/gnu/9.3.0      -> gnu9/9.3.0
   compiler/intel/2020.2   -> intel/2020.2
   compiler/intel/2022.0.2 -> compiler/2022.0.2
   math/intel-mkl/2020.2   -> mkl/2020.2
   math/intel-mkl/2022.0.2 -> mkl/2022.0.2
   math/openblas/0.3.20    -> openblas/0.3.20-iq
   math/openblas/0.3.7     -> openblas/0.3.7
   mpi/intel-mpi/2020.2    -> impi/2020.2
   mpi/intel-mpi/2021.5.1  -> mpi/2021.5.1
   mpi/openmpi/4.0.4       -> openmpi4/4.0.4
   mpi/openmpi/4.1.2       -> openmpi/4.1.2-4a
   openmpi4/4.1.2          -> openmpi/4.1.2-4a

--------------------------- GNU-9.3.0_OpenMPI-4.0.4 ----------------------------
   nwchem/7.0.2-m4

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".
```
Other versions exist, but they are hidden because they were built with a different compiler or MPI library. This is where `module spider` comes in.
```
$ module spider nwchem

----------------------------------------------------------------------------
  nwchem:
----------------------------------------------------------------------------
     Versions:
        nwchem/6.8.1-ip
        nwchem/7.0.2-m4
        nwchem/7.0.2-mr

----------------------------------------------------------------------------
  For detailed information about a specific "nwchem" package (including how
  to load the modules) use the module's full name.
  Note that names that have a trailing (E) are extensions provided by other
  modules.
  For example:

     $ module spider nwchem/7.0.2-mr
----------------------------------------------------------------------------
```
You can now see that there are three versions of `nwchem` on Hopper.
If you try to load one of these modules directly, you may get an error message like this:

```
$ module load nwchem/6.8.1-ip
Lmod has detected the following error: These module(s) or extension(s) exist
but cannot be loaded as requested: "nwchem/6.8.1-ip"
   Try: "module spider nwchem/6.8.1-ip" to see how to load the module(s).
```
To see how to load any one of them, run `module spider` on that particular version.

```
$ module spider nwchem/6.8.1-ip

----------------------------------------------------------------------------
  nwchem: nwchem/6.8.1-ip
----------------------------------------------------------------------------
    You will need to load all module(s) on any one of the lines below
    before the "nwchem/6.8.1-ip" module is available to load.

      hosts/hopper intel/2020.2 impi/2020.2

    Help:
      High-performance computational chemistry software
```
The output above tells us that to load this module, we need the compiler and MPI library it was built with, namely `intel/2020.2` and `impi/2020.2`.
```
$ module load intel/2020.2 impi/2020.2

Lmod is automatically replacing "gnu9/9.3.0" with "intel/2020.2".
Lmod is automatically replacing "openmpi4/4.0.4" with "impi/2020.2".

$ module load nwchem/6.8.1-ip
$ module list

Currently Loaded Modules:
  1) use.own     3) prun/2.0      5) intel/2020.2
  2) autotools   4) hosts/hopper  6) impi/2020.2
```
Basic Lmod Usage
The table below summarizes the most commonly used Lmod commands. Please note that you can use `ml` as an alias or shortcut for `module`.
| Module Command | Description |
|---|---|
| `ml avail` | List available modules |
| `ml list` | Show modules currently loaded |
| `ml load/add package` | Load a selected module* |
| `ml +package` | Load a selected module* |
| `ml unload/rm package` | Unload a previously loaded module |
| `ml -package` | Unload a previously loaded module |
| `ml swap package1 package2` | Unload package1 and load package2 |
| `ml purge` | Unload all loaded modules |
| `ml reset` | Reset loaded modules to system defaults |
| `ml display/show package` | Display the contents of a selected module |
| `ml spider` | List all modules (not just available ones) |
| `ml spider package` | Display the description of a selected module |
| `ml keyword key` | Search for available modules by keyword |
| `ml help`, `module help` | Display help and usage information for modules |
| `ml use path` | Add path to MODULEPATH (the module search path) |
| `ml unuse path` | Remove path from MODULEPATH (the module search path) |
Note: We have enabled the autoswap feature in Lmod, so loading a package while a conflicting package is loaded will automatically swap the modules. For example, trying to load `python/3.8.6` while `python/3.7.6` is loaded will automatically swap `python/3.7.6` for `python/3.8.6`. Without the autoswap feature, you would have to manually unload `python/3.7.6` and then load `python/3.8.6`.
Click on the following clip to get the basic look and feel of Lmod in Hopper.