I would advise you to take a look at the cache_read and cache_write methods of the schedule. In my particular case (I don't know if this is exactly what you are looking for), I use these two methods to generate read and write stages, and then tag those stages with specific pragmas. Then I have two separate passes, one for loads and one for stores, that replace the loops tagged with these pragmas with my own load and store macros.
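
To make the replacement step a bit more concrete, here is a toy sketch in plain Python of what such a pass roughly does. It does not use TVM's actual IR or pass infrastructure: the Loop, Stmt and MacroCall classes, the pragma names my_load / my_store and the macro names MY_LOAD / MY_STORE are all hypothetical and only there for illustration. In a real flow the tagged loops would come from the stages created by cache_read and cache_write.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Stmt:
    """Leaf statement, kept as plain text for illustration."""
    text: str


@dataclass
class MacroCall:
    """Hypothetical call to a user-defined load/store macro."""
    name: str
    args: List[str]


@dataclass
class Loop:
    """Loop node; `pragma` is the optional tag attached to the stage."""
    var: str
    extent: int
    body: List["Node"]
    pragma: Optional[str] = None


Node = Union[Stmt, MacroCall, Loop]


def replace_tagged_loops(nodes: List[Node], pragma: str, macro: str) -> List[Node]:
    """Return a copy of `nodes` where every loop tagged with `pragma`
    is replaced by a call to `macro`; untagged loops are visited recursively."""
    out: List[Node] = []
    for node in nodes:
        if isinstance(node, Loop):
            if node.pragma == pragma:
                # The whole tagged loop collapses into a single macro call.
                out.append(MacroCall(macro, [node.var, str(node.extent)]))
            else:
                new_body = replace_tagged_loops(node.body, pragma, macro)
                out.append(Loop(node.var, node.extent, new_body, node.pragma))
        else:
            out.append(node)
    return out


# Toy program: a read stage, a compute stage, and a write stage.
program: List[Node] = [
    Loop("i", 16, [Stmt("A_local[i] = A[i]")], pragma="my_load"),
    Loop("i", 16, [Stmt("C_local[i] = A_local[i] + 1")]),
    Loop("i", 16, [Stmt("C[i] = C_local[i]")], pragma="my_store"),
]

# Two separate passes, one per pragma.
lowered = replace_tagged_loops(program, "my_load", "MY_LOAD")
lowered = replace_tagged_loops(lowered, "my_store", "MY_STORE")
```

After both passes, the read and write loops have become macro calls, while the untagged compute loop is left as it was. Keeping the load and store passes separate means each one only has to recognise a single pragma.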