Which compiler pass embeds "device_copy" calls between modules?

Hi,
I have a question about runtime module execution with respect to memory management. Consider a scenario where:

Module1 (CPU)  -----> Module2 (GPU)  -------> Module3 (CPU)

are executed in sequence. Let's say I have built the model with:

with tvm.transform.PassContext(opt_level=2):
    graph, lib, params = relay.build(mod, target="rocm", target_host="llvm", params=None)

and then ran the runtime module using:

lib.export_library("libx.so", fcompile=False)
lib = runtime.load_module("libx.so")
rt_mod = graph_runtime.create(graph, lib, tvm.device("rocm", 0))
# rt_mod = debug_runtime.create(graph, lib, tvm.device("rocm", 0))
execute_rt_mod(rt_mod)

What I need to understand is where exactly the device_copy calls are embedded between Module1 (CPU) and Module2 (GPU), and similarly between Module2 (GPU) and Module3 (CPU). Thanks.

Hi,

I’m not that familiar with TVM, but I took a quick look at the code.

I think “CopyDataFromTo” is what actually performs the data copy.

In the AOT/Graph executor, “CopyDataFromTo” is called if the name of the function is “__copy”.

The “__copy” function is generated by codegen if DeviceCopyProps are defined.

It seems that DeviceCopyProps are defined automatically if the op is device_copy.

The device_copy op can also be injected through the Relay PlanDevices transform pass.

The PlanDevices transform pass seems to be one of the default passes in Relay optimize.

Hope this reply helps.


Hi @gangmul12, thanks for your reply, it's helpful. After a brief code review, I figured out that PlanDevices is the lowest-level pass that does device planning and inserts device_copy calls into the graph. In detail, the Check(config) pass is responsible for creating a new copy node and inserting it.
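
For anyone else reading, here is a rough sketch of the general idea behind this kind of device planning. It is a toy model in plain Python, not TVM's actual implementation: the Node class, the insert_device_copies function and the "device_copy(src->dst)" naming are all hypothetical and only illustrate inserting a copy node wherever a producer and its consumer are planned on different devices.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """One operator in a toy linear pipeline (hypothetical, not a TVM class)."""
    name: str
    device: str  # device the operator is planned to run on, e.g. "cpu" or "gpu"


def insert_device_copies(nodes: List[Node]) -> List[Node]:
    """Insert a copy node between consecutive operators on different devices."""
    planned: List[Node] = []
    for node in nodes:
        if planned and planned[-1].device != node.device:
            src = planned[-1].device
            # Assumption of this toy model: the copy node is attributed to
            # the destination device.
            planned.append(Node(f"device_copy({src}->{node.device})", node.device))
        planned.append(node)
    return planned


# The CPU -> GPU -> CPU scenario from the question.
pipeline = [Node("module1", "cpu"), Node("module2", "gpu"), Node("module3", "cpu")]
planned_names = [n.name for n in insert_device_copies(pipeline)]
print(planned_names)
```

With the CPU -> GPU -> CPU pipeline from the question, this toy pass ends up with two copy nodes, one at each device boundary. The real pass works on the Relay graph rather than a flat list, so treat this only as a mental model.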