Optimization for GPU block size determination

Richard Angersbach requested to merge rangersbach/cuda_blocksizes into v2.0-dev

This MR optimizes GPU block sizes so that they are always multiples of the hardware's warp (CUDA) or wavefront (HIP) size.

In summary, this MR

  • removes BasicOption GpuOptions.omit_range_check
  • removes BasicOption GpuOptions.block_size
  • introduces BasicOption GpuOptions.warp_size and implements a function for determining its default value
  • introduces BasicOption assume_warp_aligned_block_size, which assures the compiler that block sizes are aligned to the warp size
  • adds new GpuOptions to the data flow of GpuIndexing
  • adds an algorithm for fitting the block size to the iteration space and the warp size
  • adds fit_block_size and trim_block_size member functions to DynamicBlockSizeLaunchConfiguration for computing block sizes based on a user-defined initial block size and the iteration space
  • for assumed alignment: rounds block sizes to multiples of the warp size when the iteration space is unknown at generation time
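
To illustrate the idea behind trimming and fitting, here is a minimal sketch in plain Python. It is not the code from this MR: the free functions `round_up`, `trim_block_size` and `fit_block_size` are hypothetical stand-ins, the default warp size of 32 is an assumption (HIP wavefronts may be 64), and the fitting rule shown (grow the x dimension until the total thread count is a multiple of the warp size) is only one plausible strategy. Hardware limits on threads per block are not checked.

```python
from math import gcd, prod


def round_up(n: int, multiple: int) -> int:
    """Round n up to the next multiple of `multiple`."""
    return ((n + multiple - 1) // multiple) * multiple


def trim_block_size(block, iter_space):
    """Clamp each block dimension to the extent of the iteration space."""
    return tuple(min(b, s) for b, s in zip(block, iter_space))


def fit_block_size(block, iter_space, warp_size=32):
    """Trim the block, then grow x so the thread count is warp-aligned.

    Hypothetical strategy for illustration only.
    """
    trimmed = trim_block_size(block, iter_space)
    rest = prod(trimmed[1:])  # threads contributed by the y and z dimensions
    # x must be a multiple of `step` for x * rest to be a multiple of warp_size
    step = warp_size // gcd(rest, warp_size)
    return (round_up(trimmed[0], step), *trimmed[1:])


if __name__ == "__main__":
    fitted = fit_block_size((256, 2, 1), (40, 2, 1))
    print(fitted, prod(fitted) % 32 == 0)
```

When the iteration space is not known at generation time, only the `round_up` step applies, which corresponds to the assumed-alignment case in the last bullet.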
Edited by Richard Angersbach