@popojames Right now we only optimize at the “operator” level (that is, after operator fusion). As we begin expanding optimization toward the subgraph level, we may need some way to account for memory copy time. However, as @tkonolige mentioned, this is somewhat difficult to capture.