thanks for the writeup @wrongtest! a couple points I am more curious about:
could you say more here? is this a Relay-level thing or a TIR thing? presuming you’ve implemented this as a pass, how do you plan to ensure that the Relay-level pass makes the same scheduling decision as the TIR pass?
it seems like this could either be integrated into ci-cpu or built as a separate ci- image, so long as the binaries are publicly available. do you have an estimate of the size of the docker image? also, just for my curiosity, would you be able to share a rough timeline of when you’d like to land this?