Conflict-free shared memory permutation in TensorIR

This seems to be affected by recent commits to TransformLayout. There are some checks at https://github.com/apache/tvm/blob/main/src/tir/schedule/primitive/layout_transformation.cc#L1093-L1096 that rely on NonSurjectiveInverse analysis, which doesn’t cover all cases. The operators used here (xor) are not supported by affine analysis but are allowed in TransformLayout. The previous behavior was to allow the new buffer to be padded, with the padding region never accessed by the transformed program. Maybe we can disable this check if pad_value is not provided. @Lunderberg may have suggestions.
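To illustrate why an xor-based index map needs no padding even though affine analysis can't reason about it, here is a rough plain-Python sketch (no TVM involved). The 32x32 tile, the 32-bank model, and the helper names `swizzle`, `bank`, `is_bijection` and `max_column_conflict` are all assumptions made for this example, not anything taken from the original schedule.

```python
# Hypothetical tile and bank model: 32x32 tile of 4-byte words, 32 banks.
ROWS, COLS, BANKS = 32, 32, 32


def swizzle(i, j):
    """Xor-based permutation of the column index (non-affine index map)."""
    return i, j ^ (i % COLS)


def bank(i, j):
    """Bank of element (i, j) in a row-major tile, one word per element."""
    return (i * COLS + j) % BANKS


def is_bijection():
    """True if swizzle maps the tile onto itself, i.e. no padding is needed."""
    domain = {(i, j) for i in range(ROWS) for j in range(COLS)}
    image = {swizzle(i, j) for (i, j) in domain}
    return image == domain


def max_column_conflict(index_map):
    """Worst-case number of threads hitting one bank when thread t reads (t, c)."""
    worst = 0
    for c in range(COLS):
        banks = [bank(*index_map(t, c)) for t in range(ROWS)]
        worst = max(worst, max(banks.count(b) for b in set(banks)))
    return worst


if __name__ == "__main__":
    print("bijection:", is_bijection())
    print("conflicts without swizzle:", max_column_conflict(lambda i, j: (i, j)))
    print("conflicts with swizzle:", max_column_conflict(swizzle))
```

Under these assumptions the map is a bijection on the tile, so every element of the transformed buffer is a real element and nothing outside the tile is ever touched, while column-wise reads spread across all banks instead of landing on one.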
