When I want to integrate an accelerator, is there a way to determine the tiling automatically? My accelerator has an on-chip memory, and I want to maximize its utilization by splitting the problem into tiles that fit in it.
tile but I am not sure which one is better suited to my use case. Also, I am not exactly sure how they differ, because I thought that
define_split is the way to handle tiling.
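
To make concrete what I mean by "tiles that fit", here is a rough sketch in plain Python. It does not use any TVM API. It assumes a matrix multiply C[M, N] = A[M, K] * B[K, N] where one tile each of A, B, and C must sit in the on-chip memory at the same time. The function names, the footprint model, and the 16 KiB capacity are all made up for illustration; a real accelerator may need double buffering or alignment padding that this ignores.

```python
from itertools import product


def divisors(n):
    """Return all positive divisors of n, so tiles divide the extent evenly."""
    return [d for d in range(1, n + 1) if n % d == 0]


def tile_footprint_bytes(tm, tn, tk, elem_bytes):
    """Bytes needed on chip for one A tile (tm x tk), one B tile (tk x tn)
    and one C tile (tm x tn). Hypothetical model: no double buffering."""
    return (tm * tk + tk * tn + tm * tn) * elem_bytes


def best_tiling(M, N, K, capacity_bytes, elem_bytes=4):
    """Brute-force search over evenly dividing tile sizes.

    Returns (footprint_bytes, (tm, tn, tk)) for the tiling with the largest
    footprint that still fits, or None if not even a 1x1x1 tile fits.
    """
    best = None
    for tm, tn, tk in product(divisors(M), divisors(N), divisors(K)):
        size = tile_footprint_bytes(tm, tn, tk, elem_bytes)
        if size > capacity_bytes:
            continue  # this tile does not fit in the on-chip memory
        if best is None or size > best[0]:
            best = (size, (tm, tn, tk))
    return best


if __name__ == "__main__":
    # Assumed example: 64x64x64 float32 matmul, 16 KiB of on-chip memory.
    print(best_tiling(64, 64, 64, 16 * 1024))
```

Several tilings can tie for the largest footprint, and this sketch simply keeps the first one it finds. Footprint is also only a proxy for utilization; it says nothing about data reuse or transfer cost, which is part of why I am asking whether the framework can search this for me.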