Special Filesystems
Finding the right File System
If your jobs have a significant I/O component, we recommend contacting your consultant via support@nhr.zib.de for advice on the right file system for your workload.
Local disk I/O
Some compute nodes have local SSD or NVMe storage, available under $LOCAL_TMPDIR
An empty directory is created at job start, and all data in it is deleted after the job finishes. Local data cannot be shared across nodes.
This is the best-performing file system for data that does not need to be shared.
| Partition | Local Storage |
|---|---|
| cpu-genoa | 3.8 TB |
| cpu-clx:large | 2 TB |
| cpu-clx:huge | 2 TB |
| cpu-clx:ssd | 2 TB |
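As a minimal sketch of the typical usage pattern, the following Slurm batch script stages input onto the local disk, runs there, and copies results back. The partition name is taken from the table above; my_app, input.dat, and output.dat are placeholders for your own program and files.

```bash
#!/bin/bash
#SBATCH --partition=cpu-genoa   # any partition with local storage, see table above
#SBATCH --nodes=1
#SBATCH --time=01:00:00

# Stage input onto the node-local disk (created empty at job start).
cp "$WORK/input.dat" "$LOCAL_TMPDIR/"

# Run with the working directory on the fast local disk.
cd "$LOCAL_TMPDIR"
srun "$WORK/my_app" input.dat

# Copy results back to the global file system; $LOCAL_TMPDIR is
# cleaned automatically after the job finishes.
cp output.dat "$WORK/"
```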
Local disks are also available on all login nodes under /local. Files there are removed after 30 days.
Depending on your I/O pattern, it is typically much faster to collect/pack data node-locally on $LOCAL_TMPDIR first and then copy it to $WORK over the network in a second step, as sketched below. In contrast, if all MPI tasks of a node communicate individually with the remote storage servers, this can become a bottleneck.
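A hedged sketch of this two-step pattern inside a job script; the results/ directory and the archive name are placeholders:

```bash
# Step 1: pack many small files into a single archive on the local disk.
cd "$LOCAL_TMPDIR"
tar -czf results.tar.gz results/

# Step 2: one large sequential copy over the network to the global storage,
# instead of many small individual writes to the remote storage servers.
cp results.tar.gz "$WORK/results-$SLURM_JOB_ID.tar.gz"
```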
The environment variable $LOCAL_TMPDIR exists on all compute nodes. On compute nodes without SSD/NVMe (see the table above), $LOCAL_TMPDIR points to the /tmpfs filesystem. In this case, its capacity is capped at around 50% of the total RAM; check with: df -h $LOCAL_TMPDIR
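If your job needs a certain amount of local scratch space, you can also check the available capacity at runtime. A small sketch using GNU df; the 100 GB threshold is an arbitrary example value:

```bash
# Available space on $LOCAL_TMPDIR in whole gigabytes.
avail_gb=$(df --output=avail -BG "$LOCAL_TMPDIR" | tail -n 1 | tr -dc '0-9')
if [ "$avail_gb" -lt 100 ]; then
    echo "Only ${avail_gb}G free on $LOCAL_TMPDIR, staging to $WORK instead" >&2
fi
```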
Please note that after your Slurm job finishes, all data on $LOCAL_TMPDIR
will be removed (cleaned for the next user), so you need to copy any data you want to keep to another location before the job ends.
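One way to reduce the risk of losing results is a shell trap that copies them back whenever the batch script exits, even after a failed run. A sketch, assuming results are written to $LOCAL_TMPDIR/results (my_app is again a placeholder):

```bash
# Copy results back on any script exit, including after a failed run.
save_results() {
    cp -r "$LOCAL_TMPDIR/results" "$WORK/results-$SLURM_JOB_ID"
}
trap save_results EXIT

cd "$LOCAL_TMPDIR"
srun "$WORK/my_app"
```

Note that a hard kill at the job time limit can still bypass the trap, so copying results explicitly as soon as they are complete remains the safer option.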
An example of how to make use of local I/O is given under Ex. moving local data parallel to program execution.
Global I/O
Global I/O is shared I/O that can be accessed from multiple nodes at the same time and remains persistent after the job ends. See File Systems.