Hi there, I am trying to launch a TVM-compiled CUDA kernel with different sets of launch parameters (e.g. grid/block size), but doing so directly produces incorrect results. Is it possible to support this kind of “elastic kernel” implementation? Thanks
Is it possible to extend a compiled CUDA kernel to support launching with dynamic block/grid sizes?
@junrushao We had a discussion earlier about binding symbolic-extent loops to physical threads, and there should be no technical blockers.
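For context, the usual reason a fixed-config kernel breaks under different launch parameters is that the generated code bakes in assumptions like "grid exactly covers the data". A minimal hand-written sketch of the launch-config-independent style (the classic grid-stride loop, shown here as a generic CUDA pattern, not TVM's actual codegen output; `add_one` and its shapes are hypothetical):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Grid-stride loop: each thread processes elements i, i + stride, i + 2*stride, ...
// so correctness does not depend on gridDim/blockDim covering n exactly.
// The same compiled kernel can therefore be launched with any grid/block size.
__global__ void add_one(const float* in, float* out, int n) {
    int stride = gridDim.x * blockDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride) {
        out[i] = in[i] + 1.0f;
    }
}

int main() {
    const int n = 1 << 20;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = float(i);

    // Two different launch configurations; both produce the same result.
    add_one<<<256, 128>>>(in, out, n);
    cudaDeviceSynchronize();
    add_one<<<32, 256>>>(in, out, n);
    cudaDeviceSynchronize();

    printf("out[42] = %f\n", out[42]);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Supporting this in TVM would presumably mean generating loops of this shape (with symbolic extents bound to the thread/block indices) rather than assuming a fixed thread count at compile time.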
I’ll try creating a PR to fix it.