How to get the host code, or the blockIdx/threadIdx extents, of a GPU op

I'm currently trying to integrate TVM into TensorFlow as a custom op, but I have run into several obstacles. I have considered two integration approaches:

1. Call export_library from a Python script and load the result from C++, as in the examples under tvm/apps/howto_deploy/ or the tftvm project.
2. Export the CUDA code and compile it with nvcc.

For approach 1, there are C++ ABI incompatibility problems: TensorFlow is built with ABI=0, while TVM is built with ABI=1. Even if I compile TVM with ABI=0, the LLVM it depends on is ABI=1, so I would have to recompile LLVM as well … Another solution might be to recompile TF with ABI=1, but then I would need to maintain and release TF myself.

For approach 2, does anyone know how to export the host code, given that TVM can already export the CUDA code? Based on a previous post, it seems TVM can't export host code. If so, is it possible to get the launch parameters, including the blockIdx and threadIdx extents, so that I can write the host code manually by simply passing those parameters and the CUDA kernel arguments at the kernel launch?

Or do you know of any better solutions? Thank you very much.
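To make approach 2 concrete, here is a minimal sketch of the kind of host wrapper I have in mind. It is plain Python string generation and does not call any TVM API. The helper `make_host_launcher`, the kernel name `my_kernel`, the argument types, and the grid/block sizes are all hypothetical placeholders; where the real values would come from is exactly the open question.

```python
def make_host_launcher(kernel_name, arg_types, grid, block):
    """Return CUDA C++ source for a host wrapper that launches `kernel_name`.

    Hypothetical helper: `grid` and `block` are (x, y, z) tuples that are
    assumed to be known already (e.g. the blockIdx/threadIdx extents).
    """
    # Build the wrapper's parameter list and the matching call arguments.
    params = ", ".join(f"{t} arg{i}" for i, t in enumerate(arg_types))
    call_args = ", ".join(f"arg{i}" for i in range(len(arg_types)))
    sep = ", " if params else ""
    # The generated wrapper must be compiled by nvcc in the same
    # translation unit as the exported kernel (or after its declaration).
    return (
        f'extern "C" void launch_{kernel_name}({params}{sep}cudaStream_t stream) {{\n'
        f"  dim3 grid({grid[0]}, {grid[1]}, {grid[2]});\n"
        f"  dim3 block({block[0]}, {block[1]}, {block[2]});\n"
        f"  {kernel_name}<<<grid, block, 0, stream>>>({call_args});\n"
        f"}}\n"
    )


# Placeholder values for illustration only.
src = make_host_launcher(
    "my_kernel", ["float*", "float*", "float*"], (4, 1, 1), (256, 1, 1)
)
print(src)
```

The generated text could then be appended to the exported CUDA source and compiled with nvcc, so the TensorFlow custom op only needs to call `launch_my_kernel` with device pointers and a stream.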
