How to perform automatic tiling?

Hello,

when integrating an accelerator, is there a way to determine the tiling automatically? My accelerator has an on-chip memory, and I want to maximize its utilization by splitting the problem into tiles that fit into it.
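
To make the goal concrete, here is a rough sketch in plain Python of the selection I would like to have done automatically. It assumes a matrix multiply as the workload and a very simple footprint model (one tile each of A, B and C resident at the same time). The names divisors, tile_footprint and best_tiles, the 16 KiB capacity and the 4-byte element size are all made up for illustration; none of this is TVM API.

```python
from itertools import product


def divisors(n):
    """Return all positive divisors of n in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def tile_footprint(tm, tn, tk, elem_bytes):
    """Bytes needed on chip for one tile of A (tm x tk), B (tk x tn) and C (tm x tn).

    Assumption: all three tiles are resident at once, no double buffering.
    """
    return (tm * tk + tk * tn + tm * tn) * elem_bytes


def best_tiles(m, n, k, capacity_bytes, elem_bytes=4):
    """Pick a tile shape (tm, tn, tk) that fits and uses the most on-chip memory.

    Only tile sizes that divide the problem size evenly are considered.
    Returns ((tm, tn, tk), footprint_in_bytes).
    """
    best = None
    for tm, tn, tk in product(divisors(m), divisors(n), divisors(k)):
        size = tile_footprint(tm, tn, tk, elem_bytes)
        if size > capacity_bytes:
            continue  # tile does not fit into the on-chip memory
        if best is None or size > best[0]:
            best = (size, (tm, tn, tk))
    if best is None:
        raise ValueError("no tile shape fits into the given capacity")
    return best[1], best[0]


if __name__ == "__main__":
    tiles, used = best_tiles(64, 64, 64, capacity_bytes=16 * 1024)
    print(tiles, used, used / (16 * 1024))
```

Several tile shapes can reach the same footprint, and this sketch simply keeps the first one it finds. In practice I would also care about data reuse and not only about how full the memory is.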

I found define_split and tile, but I am not sure which one is better suited to my use case. I am also unsure how they differ, because I thought define_split was the way to handle tiling.