With PR https://github.com/dmlc/tvm/pull/4056 merged, we should have better float16 support for CUDA. It works well with the NVCC compiler, but compilation fails under NVRTC. The compile error is:
TVMError: Check failed: compile_res == NVRTC_SUCCESS (6 vs. 0) :
default_program(12): error: class "__half_raw" has no suitable copy constructor
default_program(12): error: class "__half_raw" has no suitable copy constructor
default_program(16): error: class "__half_raw" has no suitable copy constructor
default_program(16): error: class "__half_raw" has no suitable copy constructor
default_program(20): error: class "__half_raw" has no suitable copy constructor
default_program(20): error: class "__half_raw" has no suitable copy constructor
6 errors detected in the compilation of "default_program".
@xyzhou Could you please take a look at it? Thank you!