One thing I’ve thought about is asynchronous execution support in Relax. I don’t know whether this is already planned as part of either the heterogeneous execution or DistIR work, but I wanted to mention it in the discussion.
Even though we already have async support in TIR, async support at the graph level could open up a lot of optimization opportunities, though of course it would need to be planned out carefully.