Hello,
when I want to integrate an accelerator, is there a way to automatically determine tiling? My accelerator has an on-chip memory and I want to maximize utilization by splitting the problem into tiles that fit in there.
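To make the goal concrete, here is a rough sketch of what I mean by "tiles that fit". It is plain Python and does not use any TVM API. It assumes a matrix multiply of shape M x N x K, that one tile of each operand plus the output tile must be resident on-chip at the same time, and that tile sizes evenly divide the problem. The names best_tiling, tile_footprint and capacity_bytes are made up for illustration.

```python
from itertools import product


def divisors(n):
    """All positive integers that divide n evenly."""
    return [d for d in range(1, n + 1) if n % d == 0]


def tile_footprint(tm, tn, tk, elem_bytes):
    """Bytes needed to hold one A tile (tm x tk), one B tile (tk x tn)
    and one C tile (tm x tn) at the same time (assumed memory model)."""
    return (tm * tk + tk * tn + tm * tn) * elem_bytes


def best_tiling(M, N, K, capacity_bytes, elem_bytes=4):
    """Brute-force search for the tile shape with the largest footprint
    that still fits in capacity_bytes. Returns (bytes, (tm, tn, tk)),
    or None if no tile fits."""
    best = None
    for tm, tn, tk in product(divisors(M), divisors(N), divisors(K)):
        size = tile_footprint(tm, tn, tk, elem_bytes)
        if size <= capacity_bytes and (best is None or size > best[0]):
            best = (size, (tm, tn, tk))
    return best


if __name__ == "__main__":
    # Hypothetical 16 KiB on-chip buffer, float32 operands.
    print(best_tiling(64, 64, 64, capacity_bytes=16 * 1024))
```

Ideally something like this search would happen automatically, instead of me picking the factors by hand.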
I found define_split and tile, but I am not sure which one is better suited for my use-case. Also, I am not exactly sure how they are different because I thought that define_split is the way to handle tiling.