PVC GPU Programming and Usage

Programs for Intel GPUs can be written using open, cross-platform industry standards such as:

  1. OpenMP offloading in C/C++ and Fortran, i.e. with the help of compiler directives and a supporting compiler, e.g. Intel’s icpx (C/C++ compiler) and ifx (Fortran compiler), which are available via the intel/... environment modules. A minimal sketch follows this list.
    Intel provides an OpenMP Offloading Tuning Guide.

  2. SYCL for C++, i.e. in a data-parallel and explicit fashion (similar to CUDA, but open and cross-platform). Unlike OpenMP, SYCL provides more control over the code, data movements, allocations, and so forth that are actually executed on the GPU. The SYCL implementation for Intel GPUs is provided by oneAPI. On the PVC partition, SYCL code can be compiled with the icpx compiler from the intel/... environment module(s). Migration of existing CUDA codes can be assisted by the DPC++ Compatibility Tool; the dpct binary is available via the intel/... environment modules. Note that SYCL codes can also be executed on Nvidia (and AMD) GPUs. See SYCL for more details and code examples; a short sketch also follows this list.
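
As an illustration of the directive-based approach, the following is a minimal OpenMP offload sketch in C++. It is an assumed example (file and variable names are made up, nothing here is taken from this page); it offloads a simple SAXPY loop to the GPU via a target directive.

    // saxpy_omp.cpp -- minimal OpenMP offload sketch (assumed example)
    #include <cstdio>
    #include <vector>

    int main() {
        const int n = 1 << 20;
        const float a = 2.0f;
        std::vector<float> x(n, 1.0f), y(n, 3.0f);
        float *px = x.data(), *py = y.data();

        // Offload the loop to the GPU; the map clauses copy the arrays to and from the device.
        #pragma omp target teams distribute parallel for map(to: px[0:n]) map(tofrom: py[0:n])
        for (int i = 0; i < n; ++i)
            py[i] = a * px[i] + py[i];

        std::printf("y[0] = %f\n", y[0]);  // expected: 5.0
        return 0;
    }

A possible compile line, assuming an intel/... module is loaded and following Intel's general icpx usage for GPU offload: icpx -fiopenmp -fopenmp-targets=spir64 saxpy_omp.cpp -o saxpy_omp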
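
The explicit, data-parallel SYCL style can be sketched as follows; again an assumed example (using unified shared memory to keep it short), not code taken from this page.

    // saxpy_sycl.cpp -- minimal SYCL sketch (assumed example)
    #include <sycl/sycl.hpp>
    #include <cstdio>

    int main() {
        const int n = 1 << 20;
        const float a = 2.0f;
        sycl::queue q{sycl::gpu_selector_v};  // select a GPU device

        // Unified shared memory keeps the example short; buffers/accessors work as well.
        float *x = sycl::malloc_shared<float>(n, q);
        float *y = sycl::malloc_shared<float>(n, q);
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 3.0f; }

        // Explicitly launch an n-element kernel on the device and wait for completion.
        q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
            y[i] = a * x[i] + y[i];
        }).wait();

        std::printf("y[0] = %f\n", y[0]);  // expected: 5.0
        sycl::free(x, q);
        sycl::free(y, q);
        return 0;
    }

A possible compile line with the oneAPI compiler: icpx -fsycl saxpy_sycl.cpp -o saxpy_sycl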

Please refer to the oneAPI Programming Guide for further details about programming. Contact NHR@ZIB support for assistance with getting your application to run on Intel GPUs.

...

  • xpu-smi (available without any environment module being loaded) lists the GPUs available from the driver along with selected metrics/properties.

  • sycl-ls [--verbose] (available in intel/... environment modules) shows GPUs and their properties as available to applications.

  • nvtop (available in nvtop environment module) shows GPU compute and memory usage.

MPI Usage

For the PVC partition, Intel MPI is the preferred GPU-aware MPI implementation. Load an impi environment module to make it available.

To enable GPU support, set the environment variable I_MPI_OFFLOAD to "1" (in your job script). If you use GPUs on multiple nodes, it is strongly recommended to use the psm3 libfabric provider (FI_PROVIDER=psm3).

Depending on your application’s needs, set I_MPI_OFFLOAD_CELL to either tile or device to assign each MPI rank either a tile or the whole GPU device.

It is recommended to check the pinning by setting I_MPI_DEBUG to (at least) 3 and I_MPI_OFFLOAD_PRINT_TOPOLOGY to 1.

Refer to the Intel MPI documentation on GPU support for further information.
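
To make the GPU-aware part concrete, the following sketch passes SYCL device allocations directly to an MPI call. It is an assumed example, not taken from this page: it presumes that I_MPI_OFFLOAD=1 is set at run time and that an Intel MPI plus oneAPI toolchain is used for compilation (e.g. an MPI compiler wrapper around icpx with -fsycl).

    // ring_gpu_mpi.cpp -- sketch of GPU-aware MPI with SYCL device memory (assumed example)
    // Requires I_MPI_OFFLOAD=1 at run time so that Intel MPI accepts device pointers.
    #include <mpi.h>
    #include <sycl/sycl.hpp>
    #include <cstdio>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        sycl::queue q{sycl::gpu_selector_v};
        const int n = 1024;
        float *sendbuf = sycl::malloc_device<float>(n, q);
        float *recvbuf = sycl::malloc_device<float>(n, q);

        // Fill the send buffer on the device with this rank's number.
        q.fill(sendbuf, static_cast<float>(rank), n).wait();

        // Exchange with neighbours in a ring; the device buffers are passed to MPI directly.
        int right = (rank + 1) % size, left = (rank - 1 + size) % size;
        MPI_Sendrecv(sendbuf, n, MPI_FLOAT, right, 0,
                     recvbuf, n, MPI_FLOAT, left, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        // Copy one element back to the host just to check the result.
        float first = 0.0f;
        q.memcpy(&first, recvbuf, sizeof(float)).wait();
        std::printf("rank %d received %f from rank %d\n", rank, first, left);

        sycl::free(sendbuf, q);
        sycl::free(recvbuf, q);
        MPI_Finalize();
        return 0;
    }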

Example Job Script:

...


...


...


...