Supporting in-place operations

For a number of use cases in TVM, it would be valuable to support in-place operators. One such operator would be strided_set, which currently takes two tensors, a primary tensor and a subtensor, and writes the subtensor into the primary tensor at a particular offset. It doesn’t do this in-place though, so in practice a new third tensor is created.
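To make the distinction concrete, here is a minimal sketch in plain Python on 1-D lists. It is not TVM's strided_set API; the helper names `strided_set_copy` and `strided_set_inplace` and their parameters are hypothetical and only illustrate the two behaviours.

```python
def _check(update, begin, end, stride):
    # The selected region must hold exactly len(update) elements.
    if len(range(begin, end, stride)) != len(update):
        raise ValueError("update does not match the selected region")

def strided_set_copy(data, update, begin, end, stride=1):
    """Out-of-place: returns a new list and leaves `data` untouched."""
    _check(update, begin, end, stride)
    out = list(data)                  # the extra allocation and full copy
    out[begin:end:stride] = update
    return out

def strided_set_inplace(data, update, begin, end, stride=1):
    """In-place (hypothetical): overwrites the region of `data` itself."""
    _check(update, begin, end, stride)
    data[begin:end:stride] = update
    return data

sub = [1, 2, 3]

primary_a = [0] * 8
copied = strided_set_copy(primary_a, sub, begin=2, end=5)

primary_b = [0] * 8
mutated = strided_set_inplace(primary_b, sub, begin=2, end=5)
```

In the first call `copied` is a third buffer and `primary_a` is unchanged; in the second, `mutated` is the same object as `primary_b`, so no new buffer is allocated.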

Were an in-place strided_set to exist, it would allow for some useful patterns. One example is a zero-copy concatenate, in which each input tensor is written directly into its part of the larger concatenated tensor.
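The zero-copy concatenate idea can be sketched outside TVM using only the Python standard library. This assumes each producer is able to write into a destination view it is handed; `concat_zero_copy` and `fill_with` are hypothetical names for illustration, not TVM functions.

```python
from array import array

def concat_zero_copy(total_len, producers):
    """Allocate the output once and let each producer fill its own slice.

    `producers` is a list of (length, fn) pairs; fn receives a writable
    view into the shared output buffer, so no per-input tensor is created.
    """
    out = array("d", [0.0] * total_len)
    view = memoryview(out)            # slicing a memoryview does not copy
    offset = 0
    for length, produce in producers:
        produce(view[offset:offset + length])
        offset += length
    return out

def fill_with(value):
    # Stand-in for an operator that computes its result into `dst`.
    def produce(dst):
        for i in range(len(dst)):
            dst[i] = value
    return produce

out = concat_zero_copy(5, [(2, fill_with(1.0)), (3, fill_with(2.0))])
```

Each producer writes straight into the final buffer, which is the effect an in-place strided_set would give for concatenate.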

Has anyone done any work on in-place operators? It’s not obvious to me what would need to be done to support them in TVM, so I’d be interested to find out if anyone has any ideas.

Thanks

I was looking for something like this a couple of months back, but to no avail.

It would be useful to have; I’m just unsure what changes would be needed. In a sense we already have in-place operations when we fuse conv2d+relu layers (afaik), since we apply the ReLU to the accumulated value as soon as it is ready.

Doing this requires a specialised pass (though I haven’t read the code for it). One could in principle do something similar for your use case. But it’s more interesting to consider what a general solution would look like, one that could easily be used at the level of Python te.compute expressions.

I was wondering if there is a strong need for in-place mutation. Introducing in-place operators is a bit troublesome and the gain is usually small, because most of those operations can be inlined. For example, in the conv2d+relu use case, we can fuse relu into conv2d.
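As a rough illustration of why fusion covers this case, here is a plain-Python 1-D sketch (not TVM code, and not how TVM's fusion pass is implemented; the function names are hypothetical). The fused version applies the ReLU to the accumulator before storing it, so the intermediate convolution result is never materialised.

```python
def conv1d_relu_unfused(x, w):
    """Two steps: a full intermediate `conv` list, then a second pass."""
    n = len(x) - len(w) + 1
    conv = [sum(x[i + k] * w[k] for k in range(len(w))) for i in range(n)]
    return [max(v, 0.0) for v in conv]

def conv1d_relu_fused(x, w):
    """One step: ReLU is applied to the accumulator before the store."""
    n = len(x) - len(w) + 1
    out = [0.0] * n
    for i in range(n):
        acc = 0.0
        for k in range(len(w)):
            acc += x[i + k] * w[k]
        out[i] = max(acc, 0.0)        # no intermediate tensor needed
    return out

x = [1.0, -2.0, 3.0, -4.0]
w = [1.0, 1.0]
unfused = conv1d_relu_unfused(x, w)
fused = conv1d_relu_fused(x, w)
```

Both versions give the same result; the difference is only in whether the pre-activation values ever occupy their own buffer.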