Thanks for the feedback! Tiling the output computations + `compute_at` is actually exactly what I’ve been doing to prototype this, and you’re right that for a sufficiently large tile the recompute isn’t particularly bad. I think the rolling buffers aren’t immediately essential, but they would be a very beneficial future optimization.
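For reference, a back-of-the-envelope halo model of why the recompute stays acceptable at larger tiles (this is my own simplification assuming a cascade of k x k convolutions with no striding, not code from the prototype):

```python
def recompute_overhead(tile, depth, k=3):
    """Ratio of elements actually computed per output tile, relative to
    the tile itself, when a cascade of `depth` k x k convolutions is
    evaluated tile-by-tile. Each stage grows the required input region
    (the halo) by (k - 1) in each spatial dimension."""
    halo = depth * (k - 1)
    return (tile + halo) ** 2 / tile ** 2
```

With a depth-5 cascade of 3x3 convolutions, a 32x32 tile recomputes roughly 1.7x the work, while an 8x8 tile recomputes over 5x, which is why the tile size has to be large enough before cascading pays off.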
In our testing/prototyping we have found profitable cascades of 5+ ops, particularly in mobilenet-style architectures and super-resolution networks. Determining whether continuing a cascade is profitable would be one of the jobs of the cascading algorithm.
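One toy version of that profitability test, comparing the halo recompute cost of adding a stage against the DRAM traffic a fused intermediate avoids; the function, its parameters, and the roofline-style `machine_balance` argument are all illustrative and not the actual algorithm:

```python
def extend_cascade_is_profitable(tile, depth, k, flops_per_elem,
                                 elem_bytes, machine_balance):
    """Heuristic: is fusing one more k x k stage into a depth-`depth`
    cascade a win for a given output tile size?

    Extra cost: the halo grows by (k - 1), so more elements are
    recomputed per tile. Saving: the new intermediate's tile stays
    on-chip instead of round-tripping DRAM. `machine_balance` is the
    assumed sustained FLOPs per byte of DRAM bandwidth."""
    halo_now = depth * (k - 1)
    halo_next = (depth + 1) * (k - 1)
    extra_elems = (tile + halo_next) ** 2 - (tile + halo_now) ** 2
    extra_flops = extra_elems * flops_per_elem
    bytes_saved = tile * tile * elem_bytes
    # Profitable if the extra compute time is cheaper than the
    # memory time saved.
    return extra_flops <= bytes_saved * machine_balance
```

A real cost model would also have to account for on-chip buffer capacity and stride/pooling stages that shrink the halo, but even this crude check captures why small tiles or compute-bound targets cut a cascade short.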
My major concern with integrating this is that convolution-type operations always end up on their own in primitive functions. For my experiments I’m currently lowering the whole graph to a single TE, but this will not work alongside the current TOPI integration, which expects ‘master ops’ to determine the schedule. In essence, I would like to do hierarchical scheduling: first of the cascades, and second of the ops themselves.