Hi,
We are currently tuning some models on a g4 instance (T4 GPU) and noticed that the following setting improves performance (credit to @trevor-m):
from tvm.autotvm.measure.measure_methods import set_cuda_target_arch
set_cuda_target_arch('sm_75')
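For context, `sm_75` is the NVIDIA architecture name for compute capability 7.5, which is what the T4 reports. Below is a minimal sketch of how that string can be derived from a compute capability such as "7.5". Note that `arch_from_compute_version` is a hypothetical helper written for illustration only; it is not part of TVM.

```python
def arch_from_compute_version(compute_version: str) -> str:
    """Hypothetical helper (not a TVM API): map '7.5' to 'sm_75'."""
    # Split "major.minor" and join the two numbers after the "sm_" prefix.
    major, minor = compute_version.split(".")
    return f"sm_{int(major)}{int(minor)}"


# The T4 has compute capability 7.5.
print(arch_from_compute_version("7.5"))
```

The resulting string has the same form as the argument passed to `set_cuda_target_arch` above.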
It seems it would always be better to enable this setting by default, but currently we have to set it manually, so I have some questions:
- Why don’t we enable it by default?
- If we set this flag for AutoTVM, how should we reproduce the result when building the model after tuning?
Thanks in advance.