Indirect Tensors and Efficient Padding

Dear All,

The pad operation is often implemented by copying elements one by one from the input tensor to the output tensor.

Targets that support indirect tensors on fast memory (e.g. tightly coupled memory) can do better. For instance, on Hexagon, with support for discontiguous tensors in place, we can implement pad operations efficiently by specifying an offset and a range (bound) into an efficiently constructed over-padded tensor.

For example, consider this 2-level indirect allocation, where an element of a logical 2D tensor is specified by a pair: an offset into a pointer table, and an offset into a 16-element (4x4) chunk. In the figure below, the yellow grids represent a 16x8 tensor (logical shape), which spans 4x2 chunks. Assume that the fast tightly coupled memory can hold 36 chunks. The pointer table offsets for the 16x8 tensor are: (5, 6, 9, 10, 13, 14, 17, 18).
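To make the addressing concrete, here is a minimal Python sketch of the 2-level lookup. It assumes the pointer table and the elements inside a chunk are both laid out row-major, which the post does not state; the names `memory`, `locate`, `load` and `store` are made up for illustration and are not Hexagon or TVM APIs.

```python
CHUNK = 4          # chunks are CHUNK x CHUNK = 16 elements
NUM_CHUNKS = 36    # capacity of the fast memory, in chunks
CHUNK_COLS = 2     # a 16x8 tensor spans 4 x 2 chunks

# Fast memory modeled as a pool of flat 16-element chunks.
memory = [[0] * (CHUNK * CHUNK) for _ in range(NUM_CHUNKS)]

# Pointer table of the 16x8 tensor (assumed row-major over its 4 x 2 chunk grid).
pointer_table = [5, 6, 9, 10, 13, 14, 17, 18]


def locate(row, col):
    """Map a logical (row, col) to (pointer-table offset, in-chunk offset)."""
    table_offset = (row // CHUNK) * CHUNK_COLS + (col // CHUNK)
    chunk_offset = (row % CHUNK) * CHUNK + (col % CHUNK)
    return table_offset, chunk_offset


def store(row, col, value):
    """Write one element through the pointer table."""
    table_offset, chunk_offset = locate(row, col)
    memory[pointer_table[table_offset]][chunk_offset] = value


def load(row, col):
    """Read one element through the pointer table."""
    table_offset, chunk_offset = locate(row, col)
    return memory[pointer_table[table_offset]][chunk_offset]


store(5, 6, 42)
table_offset, chunk_offset = locate(5, 6)
print(table_offset, chunk_offset, pointer_table[table_offset])
```

Under these assumptions, logical element (5, 6) resolves to pointer table offset 3, which points at physical chunk 10, and to in-chunk offset 6.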

Let us say we want to pad this tensor by 3 elements on all 4 sides. We can efficiently construct an over-padded tensor whose pointer table offsets are: (0, 0, 0, 0, 0, 5, 6, 0, 0, 9, 10, 0, 0, 13, 14, 0, 0, 17, 18, 0, 0, 0, 0, 0), assuming that all 16 elements of chunk 0 are filled with the pad value. The data chunks are reused as they are; only the pointer table changes. The over-padded tensor's logical dimensions are 24x16, i.e. one whole 4x4 chunk of padding on every side. Since that gives 4 elements of padding where only 3 are wanted, the actual data in the correctly padded tensor starts at offset (1, 1) of the over-padded tensor, and the logical shape of the correctly padded tensor is 22x14.
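The construction can be checked with a small Python sketch. It assumes row-major pointer tables and row-major elements within each chunk, and it uses a hypothetical pad value of -1 so that padding is easy to tell apart from data; the helper names are illustrative only.

```python
CHUNK = 4
PAD_VALUE = -1  # hypothetical pad value chosen for this sketch

# 36 chunks of fast memory; chunk 0 stays filled with the pad value.
memory = [[PAD_VALUE] * (CHUNK * CHUNK) for _ in range(36)]

data_table = [5, 6, 9, 10, 13, 14, 17, 18]   # 16x8 tensor, 4 x 2 chunks
over_padded_table = [0, 0, 0, 0,             # 24x16 tensor, 6 x 4 chunks
                     0, 5, 6, 0,
                     0, 9, 10, 0,
                     0, 13, 14, 0,
                     0, 17, 18, 0,
                     0, 0, 0, 0]


def load(table, chunk_cols, row, col):
    """Two-level lookup: pointer table first, then offset within the chunk."""
    chunk = table[(row // CHUNK) * chunk_cols + (col // CHUNK)]
    return memory[chunk][(row % CHUNK) * CHUNK + (col % CHUNK)]


# Fill the original 16x8 tensor with distinct values r * 8 + c.
for r in range(16):
    for c in range(8):
        chunk = data_table[(r // CHUNK) * 2 + (c // CHUNK)]
        memory[chunk][(r % CHUNK) * CHUNK + (c % CHUNK)] = r * 8 + c

OFFSET = (1, 1)    # where the correctly padded tensor starts
BOUND = (22, 14)   # logical shape of the correctly padded tensor


def load_padded(row, col):
    """Read the 22x14 padded tensor as a window into the over-padded one."""
    if not (0 <= row < BOUND[0] and 0 <= col < BOUND[1]):
        raise IndexError((row, col))
    return load(over_padded_table, 4, row + OFFSET[0], col + OFFSET[1])


padded = [[load_padded(r, c) for c in range(BOUND[1])] for r in range(BOUND[0])]
```

No element of the original tensor is copied here: the padded view reads the same data chunks through a different pointer table, and every border access lands in chunk 0.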

A natural way to model this is to attach this information (offset and bound) to all tensors, and make it visible at the Relax/TIR level. “Correctly padded tensors” are special cases of “over-padded tensors” with offset (0,0) and bound equal to their logical shape.

But attaching this to every tensor means that all operators/primfuncs need to understand offsets and bounds.

We are experimenting with attaching offset/bound attributes only to the over-padded tensor arguments of a primfunc, and then transforming primfuncs which consume over-padded tensors so that each access to an over-padded tensor is displaced by the offset in each dimension.
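As a rough illustration of the idea (not the actual TIR pass), the Python sketch below models the rewrite as wrapping a tensor's load function. `PadAttrs`, `displace` and `column_sums` are hypothetical names invented for this example; the consumer is written against logical indices only and never sees the offset.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class PadAttrs:
    """Hypothetical offset/bound attributes attached to one tensor argument."""
    offset: Tuple[int, ...]
    bound: Tuple[int, ...]


def displace(load: Callable[..., int], attrs: PadAttrs) -> Callable[..., int]:
    """Return a load function whose indices are shifted by attrs.offset."""
    def displaced(*indices: int) -> int:
        for i, b in zip(indices, attrs.bound):
            if not 0 <= i < b:
                raise IndexError(indices)
        return load(*(i + o for i, o in zip(indices, attrs.offset)))
    return displaced


def column_sums(load: Callable[..., int], shape: Tuple[int, int]):
    """A consumer that only knows the logical shape of its input."""
    rows, cols = shape
    return [sum(load(r, c) for r in range(rows)) for c in range(cols)]


# Over-padded 24x16 buffer: ones in the 16x8 data region, zeros as padding.
over_padded = [[1 if 4 <= r < 20 and 4 <= c < 12 else 0 for c in range(16)]
               for r in range(24)]

attrs = PadAttrs(offset=(1, 1), bound=(22, 14))
load = displace(lambda r, c: over_padded[r][c], attrs)
sums = column_sums(load, attrs.bound)
```

A correctly padded tensor fits the same scheme with `offset=(0, 0)` and `bound` equal to its logical shape, so the consumer does not need a separate code path.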

Have other targets encountered a similar situation? Any input on dealing with this would be very helpful.

Thank you!