Asynchronous RPC in TVM

Hi,

I’m trying to search for schedules on a device through RPC. However, network communication has become the bottleneck of this process. If I compile and run the module directly on the target machine, it takes about 1s to complete, while with RPC the round trip takes 15s. The module and parameters are small. Is there any way to alleviate this issue, such as asynchronous RPC?

If the module and parameters are small, are you spending bandwidth retrieving the computed results?

The time for uploading and retrieving is similar. The major issue for me is the network. If I use an EC2 machine as host and another EC2 machine as target, the total latency of RPC is acceptable.

Thank you for your help!

Do you have an idea of what the relative costs of the following stages are here?
|<-compile for target->||<-upload binary, data to target->||<-time evaluator call->||<-retrieve result->|
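
One way to get that breakdown is to wrap each of the four stages in a small timer. This is a minimal sketch using only the Python standard library; `PhaseTimer` and `run_once` are hypothetical helpers, not part of TVM, and the four callables stand in for whatever compile, upload, evaluate and retrieve calls you already have:

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock seconds per named stage (hypothetical helper, not a TVM API)."""

    def __init__(self):
        self.seconds = {}

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the stage raised an exception.
            elapsed = time.perf_counter() - start
            self.seconds[name] = self.seconds.get(name, 0.0) + elapsed

    def shares(self):
        """Return each stage's fraction of the total measured time."""
        total = sum(self.seconds.values())
        if total == 0:
            return {name: 0.0 for name in self.seconds}
        return {name: secs / total for name, secs in self.seconds.items()}


def run_once(timer, compile_fn, upload_fn, evaluate_fn, retrieve_fn):
    """Run the four stages in order, timing each; the callables are supplied by the caller."""
    with timer.phase("compile"):
        compile_fn()
    with timer.phase("upload"):
        upload_fn()
    with timer.phase("evaluate"):
        evaluate_fn()
    with timer.phase("retrieve"):
        retrieve_fn()
    return timer.shares()
```

Calling `run_once` for a handful of configurations and printing `timer.seconds` should show whether the 15s is dominated by upload, by the evaluator call, or by reading results back.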

When tuning, we often skip copying the input data to the target and reading the results back, which reduces network overhead, provided we are confident that the search space doesn’t contain many incorrect configurations.
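
A rough sketch of what skipping the copies could look like. The helper `time_without_copies` is hypothetical. It assumes `remote` is an RPC session (for example from `tvm.rpc.connect`) onto which the library has already been uploaded, and that `alloc_empty` is a device allocator such as `tvm.nd.empty`, whose exact module path may differ between TVM versions. The arguments are allocated uninitialized on the device, so only the timing numbers should need to travel back over the network:

```python
def time_without_copies(remote, lib_name, arg_shapes, alloc_empty,
                        dtype="float32", number=10):
    """Time a remote function using device-side buffers only.

    Assumptions (not from the thread): `remote` behaves like a TVM RPC
    session, `lib_name` was already uploaded with `remote.upload`, and
    `alloc_empty(shape, dtype, device)` allocates uninitialized memory on
    the device, e.g. `tvm.nd.empty`.
    """
    rlib = remote.load_module(lib_name)
    dev = remote.cpu(0)
    # Allocate directly on the target: no host-to-device copy of inputs.
    args = [alloc_empty(shape, dtype, dev) for shape in arg_shapes]
    # The time evaluator runs the function `number` times on the target.
    timer = rlib.time_evaluator(rlib.entry_name, dev, number=number)
    # Only the timing summary is returned; the output buffers are never read.
    return timer(*args).mean
```

With TVM installed, a call might look like `time_without_copies(remote, "lib.tar", [(1024,), (1024,)], tvm.nd.empty)` after `remote.upload("lib.tar")`; the file name and shapes are placeholders. Because the outputs are never inspected, an incorrect configuration would go unnoticed, which is why this only makes sense under the condition mentioned above.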