User Manual

    Too many open files

    Feb. 16, 2022


    When using srun --propagate

    Since the Slurm upgrade to version 21, a process started with "srun" using the "--propagate" option can fail with "Too many open files".

    Slurm version 21 will run the compute process with a hard open file limit (RLIMIT_NOFILE) of only 4096.
    See also https://github.com/SchedMD/slurm/commit/18b2f4fff3f8fd5773ab14ec631bbd5f2995fa6e
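To see which limits a process actually gets, the soft and hard values of RLIMIT_NOFILE can be printed with the shell builtin "ulimit". The following is a minimal sketch, assuming a bash shell; the variable names are illustrative only. Running the same two "ulimit" calls inside a job step (see the comment at the end) should show the limit the compute process really runs with.

```shell
#!/bin/bash
# Print the soft and hard open-file limits (RLIMIT_NOFILE) of this shell.
soft_nofile=$(ulimit -S -n)
hard_nofile=$(ulimit -H -n)

echo "soft NOFILE limit: ${soft_nofile}"
echo "hard NOFILE limit: ${hard_nofile}"

# To compare with the limits inside a job step, the same check could be
# launched through srun (requires a Slurm allocation, so left commented out):
#   srun --propagate=STACK bash -c 'ulimit -S -n; ulimit -H -n'
```

Each value is either a number or the word "unlimited".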


    Solution

    Add NOFILE to --propagate. See also man 1 srun.

    Example:

    $ srun --propagate=STACK,NOFILE ...

    instead of

    $ srun --propagate=STACK ...
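If the propagate list is kept in a variable inside a job script, NOFILE can be appended in one place. The helper below is only a sketch: the function name "ensure_nofile" and the variable "PROPAGATE" are hypothetical and not part of Slurm. It assumes the value is either a comma-separated list of limit names or ALL, and it does not handle NONE.

```shell
#!/bin/bash
# ensure_nofile LIST
# Hypothetical helper: prints LIST with NOFILE appended unless LIST
# already contains NOFILE or ALL. An empty LIST yields just NOFILE.
ensure_nofile() {
    case ",$1," in
        *,NOFILE,*|*,ALL,*) printf '%s\n' "$1" ;;        # already covered
        ,,)                 printf 'NOFILE\n' ;;          # empty list
        *)                  printf '%s,NOFILE\n' "$1" ;;  # append NOFILE
    esac
}

PROPAGATE="STACK"   # assumed existing setting in the job script
echo "srun --propagate=$(ensure_nofile "$PROPAGATE") ..."

# In a real job script the echo would be replaced by the srun call itself:
#   srun --propagate="$(ensure_nofile "$PROPAGATE")" ...
```

Whether the propagated limit is high enough still depends on the open-file limit of the shell that submits the job.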






