With PR https://github.com/dmlc/tvm/pull/4056 merged, we should have better float16 support for CUDA. It works well with the NVCC compiler, but it seems to have problems under NVRTC. The compile error is:
TVMError: Check failed: compile_res == NVRTC_SUCCESS (6 vs. 0) :
default_program(12): error: class "__half_raw" has no suitable copy constructor
default_program(12): error: class "__half_raw" has no suitable copy constructor
default_program(16): error: class "__half_raw" has no suitable copy constructor
default_program(16): error: class "__half_raw" has no suitable copy constructor
default_program(20): error: class "__half_raw" has no suitable copy constructor
default_program(20): error: class "__half_raw" has no suitable copy constructor
6 errors detected in the compilation of "default_program".
@xyzhou Could you please take a look at it? Thank you!
On a separate note, this PR seems to fix some problems that occurred on Windows. I’m sorry, but I don’t really understand the purpose of that PR, and I have no Windows machine to test a fix on. Maybe we should wait for the PR author’s response.