Modifying Relay graphs to use multiple devices?

I’m wondering about partitioning a graph across multiple GPUs, either homogeneously (batch splitting) or heterogeneously (different parts of the network on different devices). It looks like the VM currently can’t manage memory across multiple devices:

Although I believe Relay’s memory management operations have attributes for a device ID, they don’t appear to be used right now.

I’m wondering if there are any plans for this, or if other people would find this functionality useful.
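
To make the two partitioning styles concrete, here is a rough sketch in plain Python. It does not use any TVM or Relay APIs; `split_batch`, `assign_layers`, and the layer names are hypothetical and only illustrate the idea of sharding a batch across devices versus mapping parts of the network to device IDs.

```python
from typing import Dict, List, Sequence


def split_batch(batch: Sequence, num_devices: int) -> List[list]:
    """Homogeneous case: cut one batch into near-equal contiguous shards,
    one shard per device (hypothetical helper, not a TVM API)."""
    base, extra = divmod(len(batch), num_devices)
    shards, start = [], 0
    for dev in range(num_devices):
        # The first `extra` devices each take one additional element.
        size = base + (1 if dev < extra else 0)
        shards.append(list(batch[start:start + size]))
        start += size
    return shards


def assign_layers(layers: Sequence[str], num_devices: int) -> Dict[str, int]:
    """Heterogeneous case: map contiguous runs of layers to device ids
    (hypothetical helper, not a TVM API)."""
    per_dev = -(-len(layers) // num_devices)  # ceiling division
    return {name: i // per_dev for i, name in enumerate(layers)}


shards = split_batch(list(range(10)), 4)
placement = assign_layers(["conv1", "conv2", "dense1", "dense2"], 2)
```

In the heterogeneous case, the resulting device IDs are the kind of information I would expect the memory management operations to carry, with a copy inserted wherever consecutive layers land on different devices.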

Perhaps this PR could serve as a starting point for the heterogeneous use case on the Relay VM: