Do we know what the asymptotic complexity of ScheduleOps is in terms of the number of TE compute ops (N)? I was hoping it would be O(N), but some experiments show the execution time growing roughly as O(N^2), which becomes very problematic for larger TE graphs. MakeBoundCheck appears to be one of the offenders here, but I think there are others too. I'd also be interested to know whether this is something that might be improved with TensorIR.
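In case it helps anyone reproduce this, here is a minimal sketch of how a growth rate like this can be estimated: time the step at several graph sizes and fit the slope of log(time) against log(N). It is not my exact measurement script. `quadratic_stand_in` is a hypothetical placeholder workload so the snippet runs without TVM; in practice it would be replaced by a function that builds a TE graph with N compute ops and runs scheduling on it. The names `estimate_exponent` and `time_call` are also just illustrative.

```python
import math
import time


def estimate_exponent(sizes, times):
    """Least-squares slope of log(time) vs log(size).

    A slope near 1 suggests O(N) growth, near 2 suggests O(N^2).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def time_call(fn, n, repeats=3):
    """Best-of-`repeats` wall-clock time of fn(n), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(n)
        best = min(best, time.perf_counter() - start)
    return best


def quadratic_stand_in(n):
    # Hypothetical placeholder, NOT a TVM call: deliberately quadratic work
    # so the script is runnable on its own. Swap in a function that creates
    # a schedule over n TE compute ops and lowers it.
    total = 0
    for i in range(n):
        for j in range(n):
            total += i ^ j
    return total


if __name__ == "__main__":
    sizes = [100, 200, 400, 800]
    times = [time_call(quadratic_stand_in, n) for n in sizes]
    print("estimated exponent:", estimate_exponent(sizes, times))
```

The fitted exponent is only a rough indicator, since constant overheads at small N pull the slope down, so it is worth using a range of sizes where the scheduling time dominates.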
Thanks