Running Stata

STATA version 17.0 is available on Hopper. The licence allows for 15 users and 12 cores per run. There are two ways to run STATA on Hopper: through the Open OnDemand web server, or from a shell session on the command line, either interactively or with a SLURM script.

Using STATA from Open OnDemand

To run STATA from Open OnDemand (OOD), log in to the OOD dashboard from a web browser and select the STATA app. After you configure the session time, the STATA GUI should start.

Command Line Interactive Session

1- After logging into Hopper from the shell, navigate to where your files are stored under /home, /scratch, or /projects and start an interactive session with salloc:

salloc --nodes=1 --ntasks-per-node=12 --mem=50GB --time=0-1:00:00

The above command allocates 12 cores on a single CPU node and 50GB of memory for 1 hour.

2- Load the stata module:

module load stata

3- Start STATA with

stata

and this should give you the STATA prompt:

[user@hopper1 ~]$ ml load stata
[user@hopper1 ~]$ stata

  ___  ____  ____  ____  ____ ®
 /__    /   ____/   /   ____/      17.0
___/   /   /___/   /   /___/       BE—Basic Edition

 Statistics and Data Science       Copyright 1985-2021 StataCorp LLC
                                   StataCorp
                                   4905 Lakeway Drive
                                   College Station, Texas 77845 USA
                                   800-STATA-PC        https://www.stata.com
                                   979-696-4600        stata@stata.com

Stata license: 15-user network, expiring 14 Sep 2022
  Licensed to: George Mason University
               Fairfax VA

Notes:
      1. Unicode is supported; see help unicode_advice.

._

Batch mode

To run STATA in batch mode, you need to create do-files, which contain the series of commands you would otherwise type interactively. With a do-file (filename.do) in hand, you can run it from the shell with:

stata -b do filename

This tells Stata to run in batch mode and execute the commands in filename.do. Output is saved automatically to the log file filename.log in the current directory.
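
As a minimal sketch of the batch workflow described above, the shell snippet below writes a small do-file and runs it in batch mode. The file name example.do and its contents are hypothetical; sysuse auto loads a sample dataset that ships with Stata. The run step is skipped when no stata executable is on the PATH, for example when the module has not been loaded.

```shell
# Write a small, hypothetical do-file (example.do) with a few commands.
cat > example.do <<'EOF'
* example.do: placeholder analysis on a dataset shipped with Stata
sysuse auto, clear
summarize price mpg
regress price mpg weight
EOF

# Run it in batch mode only if Stata is available in this session.
if command -v stata >/dev/null 2>&1; then
    stata -b do example.do
    # Show the end of the log that batch mode writes next to the do-file.
    tail -n 20 example.log
else
    echo "stata not found; run 'module load stata' first"
fi
```

After a successful run, example.log in the current directory holds the commands together with their output.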

Once your jobs are done, you can exit from the compute node back to the head node by typing:

exit

For longer, more intensive jobs, use SLURM scripts to run STATA. The interactive sessions described above are mainly useful for debugging. Once your do-files are ready for longer runs, create a SLURM script. An example for a run on a single processor, run_stata.slurm, is shown below:

#!/bin/bash
#SBATCH   --partition=normal             # submit to the normal (default) partition
#SBATCH   --job-name=stata-test          # name the job
#SBATCH   --output=stata-test-%j.out     # write stdout to the named file
#SBATCH   --error=stata-test-%j.err      # write stderr to the named file
#SBATCH   --time=0-02:00:00              # run for a maximum of 2 hours
#SBATCH   --nodes=1                      # request 1 node
#SBATCH   --ntasks-per-node=1            # request 1 core per node
#SBATCH   --mem-per-cpu=5GB              # request 5GB RAM per core

#load modules with  
module load stata  

#run stata
stata -b do filename.do

To run across multiple processors (up to a maximum of 12), increase the number of tasks set in the script above and use the stata-mp executable:

#!/bin/bash
#SBATCH   --partition=normal                 # submit to the normal (default) partition
#SBATCH   --job-name=stata-parallel          # name the job
#SBATCH   --output=stata-parallel-%j.out     # write stdout to the named file
#SBATCH   --error=stata-parallel-%j.err      # write stderr to the named file
#SBATCH   --time=0-02:00:00                  # run for a maximum of 2 hours
#SBATCH   --nodes=1                          # request 1 node
#SBATCH   --ntasks-per-node=12               # request 12 cores per node
#SBATCH   --mem-per-cpu=5GB                  # request 5GB RAM per core


#load modules with  
module load stata  

#run stata
stata-mp -b do filename.do

This will run on the 'normal' partition, which allows up to 5 days of run time on available CPUs.
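
With stata-mp, a do-file can tell Stata how many cores to use with 'set processors'. The sketch below shows one way to keep that number in step with the SLURM request. The shell function write_mp_dofile is hypothetical: it writes a do-file whose 'set processors' value is taken from the SLURM_NTASKS environment variable, which is assumed to be set inside the job, and falls back to 1 elsewhere. The file name mp_example.do and the analysis commands are placeholders.

```shell
# Hypothetical helper: write a do-file whose core count follows the SLURM
# allocation. SLURM_NTASKS is assumed to be set inside a job; outside a
# job the value falls back to 1.
write_mp_dofile() {
    outfile="$1"
    ncores="${SLURM_NTASKS:-1}"
    cat > "$outfile" <<EOF
* Generated do-file (placeholder analysis)
set processors ${ncores}
display c(processors)
sysuse auto, clear
regress price mpg weight
EOF
}

write_mp_dofile mp_example.do

# Run with Stata/MP only if it is available in this session.
if command -v stata-mp >/dev/null 2>&1; then
    stata-mp -b do mp_example.do
else
    echo "stata-mp not found; load the stata module first"
fi
```

Generating the do-file from the job environment avoids hard-coding a core count that can drift out of step with the --ntasks-per-node value in the script.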

You submit your SLURM script with

sbatch run_stata.slurm

Once submitted, you may log off from your terminal, and log back in later to retrieve the output.

Memory Considerations for STATA Jobs

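Stata must have enough memory available to hold your data. In Stata 11 and earlier this was set with the command 'set memory #', where # is the amount of memory to allocate (in kilobytes unless a unit suffix is given), for example:

        set memory 20000

or

        set memory 20m

Both of these commands allocate roughly 20 megabytes of memory. Stata 12 and later, including the version 17.0 installed on Hopper, manage memory automatically and ignore 'set memory', so the limit that matters is the memory you request for the job.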

There are two important considerations when deciding how much memory to allocate.

1- Make sure that you allocate more memory than the size of the file that you are using. A good rule of thumb for large files is to allocate roughly 50% more memory than the size of your file. For example, if your file is 100GB, set the memory to 150GB.
2- Make sure the memory you request, either with the 'mem' parameter in your SLURM script or in the Open OnDemand form when starting the STATA app, is at least the amount Stata needs. For larger data files, it is recommended to run on the 'bigmem' partition.
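
The 50% rule of thumb above can be turned into a quick calculation. The sketch below defines a hypothetical shell function, suggest_mem_gb, that reads a file's size and prints 1.5 times that size rounded up to whole gigabytes (with a minimum of 1). It assumes GNU stat, as found on typical Linux clusters, and the file name in the usage comment is a placeholder.

```shell
# Hypothetical helper: suggest a memory request (in GB) of roughly
# 1.5 x the data file size, rounded up, with a minimum of 1.
# Assumes GNU stat (-c %s prints the file size in bytes).
suggest_mem_gb() {
    bytes=$(stat -c %s "$1") || return 1
    # Multiply by 3/2, then round up to the next whole GiB.
    gib=$(( (bytes * 3 / 2 + 1073741823) / 1073741824 ))
    if [ "$gib" -lt 1 ]; then
        gib=1
    fi
    echo "$gib"
}

# Usage (mydata.dta is a placeholder file name):
#   suggest_mem_gb mydata.dta
```

The printed number can then be used as the memory value in the SLURM script or the Open OnDemand form.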