NHR provides tailored WORK file systems for improved IO throughput of IO-intense job workloads.
...
Default Lustre (WORK)
WORK is the default shared file system for all jobs and can be accessed using the $WORK variable. WORK is accessible for all users and consists of 8 Metadata Targets (MDTs) with NVMe SSDs and 28 Object Storage Targets (OSTs) on Lise and 96 OSTs on Emmy, both using classical hard drives.
Access: $WORK
Size: 8 PiB quoted
Special File System Types
Lustre with striping (WORK)
Some workloads benefit from striping, where files are split transparently across a number of OSTs: from 2 up to 28 OSTs on Lise and up to 96 OSTs on Emmy.
Large shared-file IO patterns in particular benefit from striping. Up to 28 OSTs can be used on Lise, but up to 8 OSTs are recommended. We have preconfigured a progressive file layout (PFL), which sets the striping automatically based on the file size.
Access: create a new directory in $WORK and set lfs setstripe -c <stripsize> <dir>
Size: 8 PiB, like WORK
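The access step above can be sketched as a small shell helper. The function names `run` and `make_striped_dir`, the `DRY_RUN` switch, and the example directory are illustrative assumptions and not part of the NHR setup; only `lfs setstripe -c` comes from the text. With `DRY_RUN=1` the commands are printed instead of executed, so the sketch can be tried on a machine without a Lustre client.

```shell
#!/bin/sh
# Sketch: create a directory whose new files are striped over a fixed
# number of OSTs. run, make_striped_dir and DRY_RUN are hypothetical names.

# run <command...>: execute the command, or only print it when DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# make_striped_dir <dir> <stripe_count>
make_striped_dir() {
    dir=$1
    count=$2
    run mkdir -p "$dir" || return 1
    # -c sets the stripe count, i.e. the number of OSTs each new file uses
    run lfs setstripe -c "$count" "$dir" || return 1
    # -d reports the default layout of the directory itself
    run lfs getstripe -d "$dir"
}
```

A call such as `make_striped_dir "$WORK/striped_output" 8` would follow the recommendation of up to 8 OSTs on Lise. Files created in the directory afterwards inherit the layout; files that already exist keep their old one.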
Local SSDs
Some compute nodes are installed with local SSD storage of up to 2 TB on Lise and 480 GB or 1 TB (depending on the node) on Emmy.
Info |
---|
Data on local SSDs cannot be shared across nodes and will be deleted after the job is finished. |
For unshared, single-node local IO this is the best-performing file system to use.
Lise: SSD | Lise: CAS |
---|
Access | via |
partition: |
using | via |
using $LOCAL_TMPDIR
via queue medium96partition: using | ||
Type and size | Intel NVMe SSD DC P4511 (2 TB) | Intel NVMe SSD DC P4511 (2 TB) using Intel Optane SSD DC P4801X (200 GB) as write-through cache |
FastIO
...
WORK is extended with 4 additional OSTs that use NVMe SSDs and are integrated into WORK.
Access: ask support@hlrn.de for access
Size: 55 TiB - quoted
IME - Emmy only
---
Access: IME Burst Buffer, File System Cache
Size:
...
to accelerate heavy (random) IO demands. To accelerate specific IO demands further, striping across up to these 4 OSTs is available.
Access: create a new directory in $WORK and set lfs setstripe -p flash <dir>
Size: 55 TiB - quoted
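A minimal sketch of the FastIO access step, assuming the pool is named `flash` as in the command above. The helper names `run` and `make_flash_dir` and the `DRY_RUN` switch are hypothetical. Combining `-p flash` with a stripe count via `-c` is an assumption based on the remark that striping across up to these 4 OSTs is available.

```shell
#!/bin/sh
# Sketch: place a new directory on the NVMe-backed "flash" pool.
# run, make_flash_dir and DRY_RUN are hypothetical names.

# run <command...>: execute the command, or only print it when DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# make_flash_dir <dir> [stripe_count]
make_flash_dir() {
    dir=$1
    count=${2:-}
    run mkdir -p "$dir" || return 1
    if [ -n "$count" ]; then
        # assumption: stripe new files over <count> OSTs of the flash pool
        run lfs setstripe -p flash -c "$count" "$dir" || return 1
    else
        # as in the text: assign the pool, keep the default stripe settings
        run lfs setstripe -p flash "$dir" || return 1
    fi
    # the reported default layout of the directory includes the pool name
    run lfs getstripe -d "$dir"
}
```

For example, `make_flash_dir "$WORK/fastio_case"` only assigns the pool, while `make_flash_dir "$WORK/fastio_case" 4` additionally requests striping over all 4 NVMe OSTs.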
Finding the right File System
If your jobs have a significant IO part, we recommend asking your consultant via support@nhr.zib.de to recommend the right file system for you.
Local IO
If you have a significant amount of node-local IO that does not need to be accessed after the job ends and is smaller than 2 TB on Lise and 400 GB on Emmy, we recommend using $LOCAL_TMPDIR. Depending on your IO pattern, this may accelerate IO by up to 100%.
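Because data in $LOCAL_TMPDIR is deleted when the job finishes, a job typically copies its input to the local SSD first and copies results back before it exits. The following is a sketch of that pattern under stated assumptions: the function names `stage_in` and `stage_out` and the `input/` and `output/` subdirectory layout are hypothetical choices, not an NHR convention.

```shell
#!/bin/sh
# Sketch: stage data through node-local scratch space.
# stage_in, stage_out and the input/ and output/ layout are hypothetical.

# stage_in <source_dir> <scratch_dir>
# Copy the contents of <source_dir> into <scratch_dir>/input.
stage_in() {
    mkdir -p "$2/input" && cp -r "$1/." "$2/input/"
}

# stage_out <scratch_dir> <dest_dir>
# Copy the contents of <scratch_dir>/output into <dest_dir>.
stage_out() {
    mkdir -p "$2" && cp -r "$1/output/." "$2/"
}
```

Inside a job script this could be used as `stage_in "$WORK/my_case/input" "$LOCAL_TMPDIR"`, followed by the application run that writes to `$LOCAL_TMPDIR/output`, and finally `stage_out "$LOCAL_TMPDIR" "$WORK/my_case/results"`. The paths under `$WORK` are placeholders.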
...
Global IO is defined as shared IO that can be accessed from multiple nodes at the same time and persists after the job ends.
Random IO on small files in particular can be accelerated by up to 200% using FastIO on Lise or IME on Emmy.
INTERNAL
Recommendation Matrix:
The maximum IO performance gain compared to $WORK is given in brackets.
...
FastIO stripe=4 (+200%) or
$WORK stripe=4-8 (+200%)
...
FastIO stripe=4 (+80%)
$WORK stripe=4-8 (+70%)
...
FastIO stripe=4 (+120%)
$WORK stripe=MAX (+90%)
...
FastIO stripe=4 (+200%) or
$WORK stripe=MAX (+150%)
...
FastIO stripe=4 (+100%)
FastIO (+50%)
Proposed announcement
...