File Systems

NHR@ZIB operates three central storage systems with their global file systems:

File System | Capacity | Storage Technology and Function
HOME, WORK | 20 PiB | IBM Storage Scale parallel file system (formerly GPFS) with 1 PiB NVMe SSD cache on first write; includes centrally managed software and the module system in /sw
PERM | multiple petabytes | Tape archive with additional hard-disk caches

The system has additional storage options for high IO demands:

  • All nodes of the partition standard96:ssd have local SSDs for temporary data at $LOCAL_TMPDIR (2 TB per node). For more details refer to Special Filesystems.
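A job script can pick up the node-local SSD through the environment variable named above. The following is a minimal sketch, assuming that LOCAL_TMPDIR is set inside jobs on these nodes; the fallback to the system temporary directory and the helper names job_tmpdir and write_scratch_file are illustrative assumptions, not part of the site setup.

```python
import os
import tempfile
from pathlib import Path


def job_tmpdir() -> Path:
    """Return the node-local temporary directory, or a fallback if it is unset."""
    # LOCAL_TMPDIR is the variable named in the text above. Falling back to the
    # system temporary directory is an assumption for nodes without local SSDs.
    return Path(os.environ.get("LOCAL_TMPDIR") or tempfile.gettempdir())


def write_scratch_file(name: str, data: bytes) -> Path:
    """Write temporary data below the job's temporary directory."""
    target = job_tmpdir() / name
    target.write_bytes(data)
    return target
```

Data written this way is temporary; results that should be kept need to be copied to WORK before the job ends.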

LIFETIME

In general, we store all data for one extra year after the end of a test account or project. Unless extended, the standard term of a test account or project is one year.

HOME

Each user holds one HOME directory:

  • directory HOME=/home/${USER}
  • for a higher number of files
    • configuration files
    • source code and executables
  • limited quota
  • snapshots available

We take daily snapshots of the filesystem, which can be used to restore a former state of a file or directory. These snapshots can be accessed through the path /home/.snapshots or /sw/.snapshots.
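Locating older copies of a file can be sketched as follows. The layout below the snapshot path is an assumption here: the sketch supposes that each snapshot is a subdirectory that mirrors the tree of the file system. The function name snapshot_copies is hypothetical.

```python
from pathlib import Path


def snapshot_copies(path: str, fs_root: str = "/home",
                    snapshot_dir: str = ".snapshots") -> list[Path]:
    """List copies of `path` found in the snapshots of the file system `fs_root`."""
    root = Path(fs_root)
    # Raises ValueError if `path` does not lie below `fs_root`.
    relative = Path(path).relative_to(root)
    snapshots = root / snapshot_dir
    if not snapshots.is_dir():
        return []
    # Assumption: every snapshot is a subdirectory mirroring the tree below fs_root.
    candidates = (snap / relative for snap in sorted(snapshots.iterdir()))
    return [c for c in candidates if c.exists()]
```

A file found this way can be restored by copying it back to its original location.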

WORK

The 'Storage Scale' based work filesystem /scratch is the main work filesystem for the systems. Each user can distribute data to different directories.

  • parallel input/output for production jobs
    • moderate number of files
    • transient nature of data
  • no backup, no disaster recovery
  • available directories
    • WORK=/scratch/usr/${USER}, for user data
    • project directory /scratch/projects/<projectID>, to collect and share project data (please remember: there is no backup of the WORK file system), see also hints on disk quota

We provide no backup of this filesystem. The storage system of Lise provided around 185 GiB/s streaming bandwidth during the acceptance test. With higher occupancy, the effective (write) streaming bandwidth is reduced.

The storage system is hard-disk based with an NVMe cache on first write.

DEPRECATED: If you access a large file (1 GiB or more) from multiple nodes in parallel, please consider activating striping of the file with the Lustre command lfs setstripe. Striping can be set for a single file or for a whole directory. Changes apply only to new files, so applying a new striping to an existing file requires a file copy. Choose a sensible stripe_count (recommendation: up to 8 on Lise) and a stripe_size that is a multiple of the RAID6 full-stripe size and matches the IO sizes of your job.

A general recommendation for network filesystems is to keep the number of metadata operations as low as possible. This includes opening and closing files as well as checks for file existence or changes. These operations often become a bottleneck for the IO of your job, and on large clusters they can easily overload the file servers.
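As an illustration of this recommendation, the sketch below collects file sizes in a single directory pass instead of calling existence and size checks separately for every path. It is one possible pattern, not a site-specific requirement; the function name file_sizes is hypothetical.

```python
import os


def file_sizes(directory: str) -> dict[str, int]:
    """Return {file name: size in bytes} for the regular files in `directory`."""
    sizes = {}
    # A single scandir pass reads the directory once. This avoids repeated
    # os.path.exists() and os.path.getsize() calls on the same paths in a loop.
    with os.scandir(directory) as entries:
        for entry in entries:
            # is_file() can usually use the type information from the listing;
            # stat() results are cached on the entry object after the first call.
            if entry.is_file(follow_symlinks=False):
                sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes
```

Keeping the result in memory and reusing it, rather than querying the filesystem again in every loop iteration, follows the same idea.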


PERM, tape archive

The magnetic tape archive provides additional storage for inactive data to free up space on the WORK or HOME filesystem. It is directly accessible on the login nodes.

  • directory /perm/${USER}
  • secure file system location on magnetic tapes
  • not a solution for long-term data archiving
  • no guarantee of the 10-year retention period required by the rules for good scientific practice

For reasons of efficiency and performance, small files and/or complex directory structures should not be transferred to the archive directly. Please aggregate your data into compressed tarballs or other archive containers with a maximum size of 5.5 TiB before copying your data to the archive.
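The aggregation step can be sketched as follows: files are grouped into batches by their input size, and each batch is written to one gzip-compressed tarball. This is a minimal sketch under stated assumptions. The function names, the archive naming scheme, and the choice to budget by uncompressed input size are illustrative; the size of the resulting archive can differ from the input size because of compression and tar overhead.

```python
import os
import tarfile
from pathlib import Path

MAX_ARCHIVE_BYTES = int(5.5 * 1024**4)  # 5.5 TiB limit stated above


def plan_batches(paths, max_bytes=MAX_ARCHIVE_BYTES):
    """Group files into batches whose summed input size stays within max_bytes."""
    batches, current, current_size = [], [], 0
    for path in paths:
        size = os.path.getsize(path)
        # Start a new batch when adding this file would exceed the budget.
        # A single file larger than max_bytes still ends up in a batch of its own.
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches


def write_archives(paths, out_dir, prefix="archive", max_bytes=MAX_ARCHIVE_BYTES):
    """Write one gzip-compressed tarball per batch and return the archive paths."""
    out_dir = Path(out_dir)
    archives = []
    for index, batch in enumerate(plan_batches(paths, max_bytes)):
        archive = out_dir / f"{prefix}_{index:04d}.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            for path in batch:
                tar.add(path)  # tarfile strips a leading "/" from member names
        archives.append(archive)
    return archives
```

The archives would typically be created on WORK first and then copied to /perm/${USER} in a separate step.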
