Does TVM flush the cache between fused groups?

Hi all,

Does TVM flush the cache between the execution of fused groups, and if so, how?

I ran a model in two variants: in one, the tensor passed between the two fused groups is int8, and in the other it is int16. When profiling, both variants have the same execution time. I expected the int8 variant to be faster, since loading and storing the tensor moves half as many bytes as with int16.
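To make the expectation concrete, here is a rough back-of-the-envelope sketch of the tensor footprint for each dtype, compared against a cache size. The shape and the cache size are hypothetical values picked for illustration (neither is given in this thread), so substitute the real ones for your model and CPU.

```python
from math import prod

# Hypothetical values for illustration only; replace with your own.
SHAPE = (1, 64, 56, 56)             # assumed shape of the tensor passed between the groups
L2_CACHE_BYTES = 1 * 1024 * 1024    # assumed 1 MiB L2 cache
ITEMSIZE = {"int8": 1, "int16": 2}  # bytes per element


def tensor_bytes(shape, dtype):
    """Size of a dense tensor in bytes."""
    return prod(shape) * ITEMSIZE[dtype]


def fits_in_cache(shape, dtype, cache_bytes=L2_CACHE_BYTES):
    """True if the whole tensor could sit in a cache of the given size."""
    return tensor_bytes(shape, dtype) <= cache_bytes


for dtype in ("int8", "int16"):
    print(dtype, tensor_bytes(SHAPE, dtype), fits_in_cache(SHAPE, dtype))
```

With these assumed numbers both variants fit in the cache, so halving the element size would not necessarily show up as a difference in load/store time unless the cache is flushed between the groups.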

thanks.

I don’t think inserting a stop_fusion op flushes the cache between the ops.

I would like to know how the cache is handled in TVM, and which source file is responsible for that.

thanks.

TVM doesn’t clear the cache when executing the model, so the tensor written by one fused group may still be in cache when the next group reads it. During auto-tuning on CPU, there is an option to flush the cache: https://github.com/apache/tvm/blob/main/python/tvm/meta_schedule/testing/tune_onnx.py#L155
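If you want to measure the cold-cache cost yourself, one generic approach is to overwrite a scratch buffer larger than the last-level cache before each timed run. The sketch below is plain Python and is not TVM's implementation; `flush_cache` and `time_with_flush` are hypothetical helper names, and the 32 MiB buffer and 64-byte cache line are assumptions about the CPU. It is best-effort only, since the interpreter itself touches memory between the flush and the timed call.

```python
import time


def flush_cache(scratch, line_bytes=64):
    """Best-effort eviction: write one byte in every assumed cache line."""
    for i in range(0, len(scratch), line_bytes):
        scratch[i] = (scratch[i] + 1) & 0xFF


def time_with_flush(fn, repeats=5, flush_bytes=32 * 1024 * 1024):
    """Time fn() several times, touching a large scratch buffer before each run."""
    scratch = bytearray(flush_bytes)  # should exceed the last-level cache size
    timings = []
    for _ in range(repeats):
        flush_cache(scratch)
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings
```

Here `fn` would be a zero-argument callable that runs the part of the model you want to time, for example a lambda wrapping the run call of your compiled module. Comparing these timings against runs without the flush gives a rough idea of how much the warm cache hides the int8 versus int16 difference.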