Problem

A job that has to stage a large set of new input files at runtime (8 GB in 650 files) fails with error messages like "invalid file format". Inspecting the files afterwards does not reveal any errors; the input files are intact.

Code block
cp repository/* input_area
mpirun ...

This appears to be a Lustre cache-related problem: the parallel processes start up faster than Lustre can synchronise the newly copied files across all nodes.

Solution

Add some delay after copying large file sets:

Code block
cp repository/* input_area
sleep 20
mpirun ...
sleep 20

Alternatively, the tool nocache serves as a workaround for this issue (thanks John):

Code block
nocache cp repository/* input_area
mpirun ...
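
The delay-based workaround can be wrapped in a small helper so that the copy, a basic sanity check, and the pause always happen together. The following is only a minimal sketch: the function name stage_inputs and its arguments are hypothetical, the 20-second default simply mirrors the delay used above, and the file-count check only reflects what the node running the script sees, so it does not replace the delay.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): stage inputs, then pause.
# Usage: stage_inputs SRC_DIR DST_DIR [DELAY_SECONDS]
stage_inputs() {
    src=$1
    dst=$2
    delay=${3:-20}   # assumed default, taken from the 20 s delay above

    # Copy the input files; stop if cp reports an error.
    cp "$src"/* "$dst"/ || return 1

    # Cheap sanity check: the destination must hold at least as many
    # regular files as the source. This is the local node's view only.
    n_src=$(find "$src" -maxdepth 1 -type f | wc -l)
    n_dst=$(find "$dst" -maxdepth 1 -type f | wc -l)
    if [ "$n_dst" -lt "$n_src" ]; then
        echo "staging incomplete: $n_dst of $n_src files present" >&2
        return 1
    fi

    # Give the file system time to settle before the parallel launch.
    sleep "$delay"
}
```

In a job script this could be called as stage_inputs repository input_area 20 && mpirun ... so that the parallel run only starts when staging succeeded. The right delay depends on the system and the size of the file set.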
