[SOLVED] Help with tensorization with loops surrounding the tensor block

Yes, but it introduces branching inside the loop, which prevents tensorization from working. Maybe the next step should be decompose_padding, but I am still wondering how to generate tensorized instructions for the boundary cases.
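
To make the boundary problem concrete, here is a minimal sketch in plain Python. It is not TVM code and uses no TVM API. `TILE`, `intrin_dot4` and `dot_with_padded_tail` are hypothetical names standing in for a fixed-width tensor instruction and the loop around it. The idea, under those assumptions, is to keep the main loop free of bounds checks and to copy the leftover elements into a zero-padded full tile so the same fixed-size instruction can still be used at the boundary.

```python
TILE = 4  # hypothetical width of the tensor instruction


def intrin_dot4(a_tile, b_tile):
    """Stand-in for a fixed-size tensor instruction: needs exactly TILE elements."""
    assert len(a_tile) == TILE and len(b_tile) == TILE
    return sum(x * y for x, y in zip(a_tile, b_tile))


def dot_with_padded_tail(a, b):
    """Dot product of two equal-length lists using only full-tile calls."""
    n = len(a)
    full = n - n % TILE  # number of elements covered by complete tiles
    acc = 0

    # Main region: every tile is complete, so there is no branch in the loop body.
    for i in range(0, full, TILE):
        acc += intrin_dot4(a[i:i + TILE], b[i:i + TILE])

    # Boundary region: pad the remainder with zeros up to a full tile.
    # Zeros do not change a dot product, so the result stays correct.
    if full < n:
        pad = [0] * (TILE - (n - full))
        acc += intrin_dot4(a[full:] + pad, b[full:] + pad)

    return acc
```

The single remaining branch sits outside the tiled loop, so the loop body itself always sees a full tile. Whether a given schedule can be brought into this shape depends on the workload and the intrinsic.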

Edit: I am marking this one as solved and have opened a new post dedicated to the new question.