3. Storage

Users have different storage requirements, and the NYU HPC environment provides four storage systems to meet them.

Caution

Backing up data is each user's own responsibility. If you delete something accidentally, it unfortunately cannot be recovered, although you could try looking in the Trash Bin.

3.1. File Systems

The NYU HPC clusters are served by a General Parallel File System (GPFS) cluster.

The NYU HPC team supports data storage, transfer, and archival needs on the HPC clusters, as well as collaborative research services like the Research Project Space (RPS).

Space            Environment    Space Purpose                Backed Up /           Quota
                 Variable                                    Flushed               (Disk Space / # of Files)

/home            $HOME          Personal user home space;    YES / NO              50 GB / 30 K
                                best for small files

/scratch         $SCRATCH       Best for large files         NO / Files not        5 TB / 1 M
                                                             accessed for 60 days

/archive         $ARCHIVE       Long-term storage            YES / NO              2 TB / 20 K

HPC Research     NA             Shared disk space for        YES / NO              Payment based:
Project Space                   research projects                                  TB-year / inodes-year

Caution

Running jobs from /home is not recommended: the SSDs backing /home are not designed for heavy job I/O and will wear out quickly.

Warning

Pay attention to the data retention / flush cycles listed in the table above: files in /scratch that have not been accessed for 60 days are removed automatically.
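
A hedged sketch for spotting files at risk: the find command below lists files in your scratch space not accessed for more than 60 days (the window from the table above; check the current flush policy for your cluster):

find /scratch/$USER -type f -atime +60   # candidates for the next flush cycle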

In short, you should:

  • Put all your job data in /scratch/<NetID> and run your jobs from there.

  • Keep only a small persistent fraction in /home/<NetID> (e.g., source code, executables).

  • For long-term storage, move data to /archive/<NetID>; see the sketch after this list.
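
A minimal sketch of this workflow, assuming a Slurm cluster and that your cluster username ($USER) matches your NetID; myproject and run.sh are hypothetical names:

cd /scratch/$USER                  # run jobs from scratch, not from /home
cp -r $HOME/myproject .            # source code and executables live in $HOME
cd myproject
sbatch run.sh                      # submit the job script
# Bundle final results before archiving: /archive has a low file-count
# quota, so one tar file is better than thousands of small files.
tar -czf results.tar.gz results/
mv results.tar.gz /archive/$USER/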

Users should clean up their storage regularly.

Tip

You can access any file system using its environment variable or its absolute path; for example, the following two commands are equivalent:

cd $HOME
cd /home/$USER
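
To see the absolute path each variable expands to on your cluster, print them:

echo "home: $HOME, scratch: $SCRATCH, archive: $ARCHIVE"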

3.2. Quota Utilization

Users can check their storage utilization by running:

myquota

Example output:

                    DISK SPACE                # FILES (1000's)
filesystem       size      quota            number      quota
            --------------------------   --------------------------
/home         131KB     20GB   ( 0%)           0        150 ( 0%)
/scratch      220GB     5242GB ( 4%)           4        500 ( 1%)
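
If myquota shows you near a limit, standard tools can help locate what is consuming it; a hedged sketch, assuming your files live under /scratch/$USER:

du -sh /scratch/$USER/*              # disk usage per top-level directory
find /scratch/$USER -type f | wc -l  # approximate total file count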

3.3. Data Transfer

The recommended way to transfer data to/from the HPC systems is via the Open OnDemand (OOD) HPC web interface.

  • Connect to the VPN (if you are not on the NYU network).

  • Log on to the OOD website: https://ood.hpc.nyu.edu.

  • Go to the Files tab at the top.

  • From there you can transfer files as you wish; a command-line alternative is sketched below.

[Figure: the OOD Files tab]
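
For large transfers, command-line tools such as rsync over SSH also work. A hedged sketch run from your local machine; replace <NetID> with your NetID and <login-node> with your cluster's login hostname from the NYU HPC docs:

rsync -avP mydata/ <NetID>@<login-node>:/scratch/<NetID>/mydata/   # -a archive mode, -v verbose, -P progress + resume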

Tip

If you use VS Code with a remote session on the cluster, you can transfer files by copying and pasting directly to/from its Explorer tab.

3.4. HPC Storage Status

On the Dashboard you can find current utilization and allocation data for the following file system mounts:

  • GPFS file system: /home, /scratch, /archive

  • VAST file system: /vast

  • HDFS file system of the Hadoop cluster Peel

Dashboard: Storage NYU HPC

Note

You need to be on the NYU network or connected to the NYU VPN.