How to tile buffers to fit into on-chip memory?

Hi, I'm running into the same problem and don’t know how to make the local buffer size smaller. Do you have any updates?