If you are using the relay.build() → graph_executor.GraphModule path, the main point, as far as I remember, is that you need to pass a multi-target dict as the target argument of relay.build() and pass a list of devices when constructing the GraphModule, like this:
lib = relay.build(relay_mod, target={"cpu": "llvm", "gpu": "cuda"}, params=params)
m = graph_executor.GraphModule(lib["default"](tvm.cpu(), tvm.gpu()))