HIP Target and Platform
Introduce HIP as a dedicated code generation platform.
`pystencils.codegen`

- Add `Target.HIP`
- Change `Target.GPU` to alias `Target.CurrentGPU`
- Auto-detect `Target.CUDA` or `Target.HIP` on GPU systems when using `Target.CurrentGPU`
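The alias-and-resolve behavior described above could look roughly like this. This is a hypothetical sketch, not the actual pystencils API: the enum values, the `resolve_target` function, and the crude cupy-based detection helper are all assumptions for illustration.

```python
from enum import Enum


class Target(Enum):
    """Hypothetical stand-in for the pystencils target enum."""
    CPU = 0
    CUDA = 1
    HIP = 2
    CurrentGPU = 3
    GPU = 3  # enum alias: Target.GPU is Target.CurrentGPU


def _detect_gpu_runtime() -> Target:
    """Crude stand-in for runtime detection: assume CUDA if cupy is
    importable, otherwise fall back to HIP on a ROCm system."""
    try:
        import cupy  # noqa: F401
        return Target.CUDA
    except ImportError:
        return Target.HIP


def resolve_target(target: Target) -> Target:
    """Replace the CurrentGPU placeholder with a concrete GPU target."""
    if target is Target.CurrentGPU:
        return _detect_gpu_runtime()
    return target
```

Because `GPU` shares the value of `CurrentGPU`, Python's enum machinery makes it a true alias, so existing code comparing against `Target.GPU` keeps working.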
`pystencils.backend`

- Move all common functionality of CUDA and HIP to the `GenericGpu` platform class
- Introduce `HipPlatform`, inheriting from `GenericGpu`
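The platform split could be sketched as follows. The class contents here are illustrative assumptions (the real `GenericGpu` holds code-generation logic, not these exact members); only the inheritance relationship is taken from the description above.

```python
class GenericGpu:
    """Hypothetical base platform holding functionality shared by
    CUDA and HIP code generation."""

    #: header the generated kernel must include; backends override this
    required_header: str = ""

    def thread_index(self, dim: int) -> str:
        # the global thread-index expression is textually identical
        # in CUDA C++ and HIP C++, so it can live in the shared base
        c = "xyz"[dim]
        return f"blockIdx.{c} * blockDim.{c} + threadIdx.{c}"


class CudaPlatform(GenericGpu):
    required_header = "cuda_runtime.h"


class HipPlatform(GenericGpu):
    required_header = "hip/hip_runtime.h"
```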
`pystencils.jit`

- Adapt the `CupyJit` to compile only CUDA kernels on NVidia platforms, and only HIP kernels on ROCm platforms
- While HIP can technically act as a shallow wrapper around CUDA on NVidia systems, this does not make sense here, since a) we gain portability through code generation anyway, b) `hipcc` will just call `nvcc` on NVidia systems anyway, and c) cupy needs to be built against the entire ROCm software stack to use HIP.
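The resulting one-to-one mapping between the cupy build and the kernel language could be sketched like this. The function name and the `runtime_is_hip` flag are hypothetical; in practice the flag would be derived from the cupy runtime, not passed in by the caller.

```python
def select_kernel_language(runtime_is_hip: bool) -> str:
    """Pick exactly one kernel language per cupy build: HIP on ROCm
    builds, CUDA on NVidia builds -- never HIP as a CUDA wrapper."""
    return "hip" if runtime_is_hip else "cuda"
```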
Code Adaptations
- Use `Target.CurrentGPU` throughout the test suite where it is required
- Use `Target.is_gpu()` to detect GPU targets in all places where targets were previously checked only against `Target.CUDA`
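A minimal sketch of why the predicate is preferable to an equality check (again with a hypothetical stand-in enum, not the real pystencils `Target`):

```python
from enum import Enum


class Target(Enum):
    """Hypothetical stand-in for the pystencils target enum."""
    CPU = 0
    CUDA = 1
    HIP = 2

    def is_gpu(self) -> bool:
        # one predicate covers all GPU backends, so call sites that used
        # `target == Target.CUDA` no longer silently miss HIP
        return self in (Target.CUDA, Target.HIP)
```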
Documentation
- Adapt installation guide and GPU codegen guide to reflect the new target
- Adapt backend GPU codegen docs
- Extend contrib guide with info on GPU development
On CI Testing
Since cupy supports only the combinations NVidia + CUDA and ROCm + HIP, we currently cannot test HIP code generation in CI, as we have no GitLab runners with AMD GPUs.
Rationale
Separate modelling of CUDA and HIP in the code generator will be necessary to capture architectural differences in !438. It also turns out to be very important for the new waLBerla code generator (see https://i10git.cs.fau.de/da15siwa/sfg-walberla): using the target is the easiest way to distinguish between CUDA and HIP during GPU code generation.
Edited by Frederik Hennig