Architecture-specific compilation advice, along with MPI and OpenMP recipes, can be found here:
Workflow CPU CLX and Examples and Recipes (mostly CLX-related)
Workflow CPU Genoa and Examples CPU Genoa
CUDA and OpenMP for GPU A100 (also includes MPI)
PVC GPU Programming and PVC MPI Usage