How to do heterogeneous execution on cpu and gpu?

Hello,

I have read some posts on the forum, but I am still confused about this.

  1. If I want to use Relay to build a simple network and heterogeneously execute some ops on GPU and others on CPU, there seem to be two different ways:
  • One is through relay.annotation.on_device, relay.device_copy, and relay.transform.RewriteAnnotatedOps. After that, the Relay graph is rewritten and I can call relay.build. But my TVM version is 0.8, and this does not seem to work. Is my usage wrong? I'm not sure how to do this in the current version.
  • The other way is part of BYOC, but I just want heterogeneous execution on GPU and CPU, so it doesn't seem to be needed?
  2. I want to check what difference heterogeneous execution makes in the graph JSON. I have read some of the code for JSONReader and graph_executor. My guess is that with heterogeneous execution the JSON will contain some tvm_op nodes whose func_name is "__copy" to copy data between devices, and device_index will denote which device each node should execute on. Is my guess correct?
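For reference, here is a minimal sketch of the annotation-based flow from the first option, roughly as it appeared around TVM 0.8. The exact names and signatures (e.g. the fallback device type passed to RewriteAnnotatedOps, and the target dict keys) may differ in your version, so treat this as an untested illustration rather than a verified recipe:

```python
import tvm
from tvm import relay

x = relay.var("x", shape=(1, 10))
y = relay.var("y", shape=(1, 10))

# Annotate the add to run on the GPU; leave the multiply on the default device.
add = relay.annotation.on_device(relay.add(x, y), tvm.cuda(0))
mul = relay.multiply(add, relay.const(2.0))

mod = tvm.IRModule.from_expr(relay.Function([x, y], mul))

# RewriteAnnotatedOps inserts device_copy nodes at device boundaries;
# the argument is the fallback device type (1 == CPU).
mod = relay.transform.RewriteAnnotatedOps(1)(mod)

# Heterogeneous build with one target per device.
lib = relay.build(mod, target={"cpu": "llvm", "gpu": "cuda"})
```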

I’m new to TVM, any help or suggestions are massively appreciated!

Hi~ Can this unit test case help you?

If you are using the relay.build() + graph_executor.GraphModule path, the key point I remember is that you should pass a multi-target dict as the target argument of build, and pass a device list into GraphModule, like:

lib = relay.build(relay_mod, target={"cpu": "llvm", "gpu": "cuda"}, params=params)
m = graph_executor.GraphModule(lib["default"](tvm.cpu(), tvm.gpu()))