How to limit VGPRs and SGPRs for ROCm?
t-vi
November 14, 2019, 10:38pm
2
It’s been a while, but when you set the maximum workgroup size, LLVM limits the number of registers so that the given maximum workgroup size can be accommodated. This PR sets this for the AMDGPU codegen:
master
← t-vi:rocm_workgroup_size
opened 09:06PM - 14 Nov 19 UTC
When we did not set the workgroup size, LLVM will use too many registers for kernel launches with many threads. This resulted in "invalid ISA" errors. Here we set the maximum workgroup size to the maximum threads per block from the device API.
One might later look into allowing configurations with fewer threads at runtime to use more registers.
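To make the trade-off concrete, here is a rough back-of-the-envelope sketch of why a larger maximum workgroup size leaves fewer VGPRs per thread. The hardware constants are assumptions for a GCN-like part (64-wide wavefronts, 4 SIMDs per compute unit, 256 VGPRs per lane per SIMD), and the function name is made up for illustration. It is not what LLVM or TVM actually computes; the real limit also depends on the target and on allocation granularity. In LLVM's AMDGPU backend the workgroup bound itself is expressed through the `amdgpu-flat-work-group-size` function attribute.

```python
# Rough register budget estimate for a GCN-like GPU.
# All constants are illustrative assumptions, not values read from TVM or LLVM.
WAVE_SIZE = 64              # threads per wavefront (assumed)
SIMDS_PER_CU = 4            # SIMD units per compute unit (assumed)
VGPRS_PER_SIMD_LANE = 256   # VGPRs per lane shared by the waves resident on one SIMD (assumed)


def max_vgprs_per_thread(max_workgroup_size: int) -> int:
    """Hypothetical helper: upper bound on VGPRs per thread so that one
    workgroup of the given size still fits on a single compute unit."""
    if max_workgroup_size <= 0:
        raise ValueError("workgroup size must be positive")
    # Number of wavefronts needed for the workgroup, rounded up.
    waves = -(-max_workgroup_size // WAVE_SIZE)
    # The wavefronts of one workgroup are spread over the SIMDs of one CU.
    waves_per_simd = -(-waves // SIMDS_PER_CU)
    # Wavefronts on the same SIMD share its register file.
    return VGPRS_PER_SIMD_LANE // waves_per_simd


if __name__ == "__main__":
    for size in (256, 512, 1024):
        print(size, max_vgprs_per_thread(size))
```

Under these assumptions a kernel that may be launched with 1024 threads per block gets a quarter of the VGPR budget of one limited to 256, which is why declaring a smaller maximum would let the compiler use more registers.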
@t-vi Thanks, so what is the user interface in TVM to limit them?
t-vi
January 1, 2020, 2:04pm
4
Last time I looked, it was not configurable; instead, the codegen queries the device properties (the maximum threads per block) and uses that value.
Best regards
Thomas