Supporting in-place operations

For a number of use cases in TVM, it would be valuable to support in-place operators. One such operator would be strided_set, which currently takes two tensors, a primary tensor and a subtensor, and writes the subtensor into the primary tensor at a particular offset. It doesn't do this in-place, though: in practice a third tensor is allocated to hold the result.
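For concreteness, here's a minimal sketch of the current out-of-place behaviour at the Relay level (the shapes and the offset below are arbitrary, picked just for illustration):

```python
# Sketch of today's out-of-place strided_set at the Relay level.
# Shapes and the (1, 1) offset are arbitrary, for illustration only.
from tvm import relay

data = relay.var("data", shape=(4, 4), dtype="float32")
sub = relay.var("sub", shape=(2, 2), dtype="float32")

# Write the 2x2 subtensor into `data` starting at offset (1, 1).
out = relay.strided_set(data, sub, begin=[1, 1], end=[3, 3], strides=[1, 1])

# `out` is a brand-new 4x4 tensor: `data` is never mutated, so the
# write costs a full extra allocation plus a copy of `data`.
func = relay.Function([data, sub], out)
```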

Were an in-place strided_set to exist, it would enable some useful patterns. For example, a zero-copy concatenate, where each input tensor is written directly into the correct region of the larger concatenated tensor.
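To sketch what I mean (strided_set_inplace below is not a real TVM op; the name and signature are made up purely for illustration):

```python
# Hypothetical sketch of a zero-copy concatenate along axis 0.
# `strided_set_inplace` does NOT exist in TVM today; its name and
# signature are assumptions, shown only to illustrate the pattern.
def concat_zero_copy(tensors, out):
    offset = 0
    for t in tensors:
        rows = t.shape[0]
        # Each input is written straight into its slice of the
        # pre-allocated output, mutating `out` rather than copying.
        strided_set_inplace(out, t,
                            begin=[offset, 0],
                            end=[offset + rows, out.shape[1]])
        offset += rows
    return out  # no intermediate tensors were allocated
```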

Has anyone done any work on in-place operators? It’s not obvious to me what would need to be done to support them in TVM, so I’d be interested to find out if anyone has any ideas.


I was looking for something like this a couple of months back, but to no avail.

It would be useful to have; I'm just unsure what changes would be needed. In a sense we already have in-place operations when we fuse conv2d+relu layers (afaik), since we apply the ReLU to the accumulated value as soon as it is ready.

Doing this requires a specialised pass (though I haven't read the code for it). One could in principle do something similar for your use case. But it's more interesting to consider what a general solution would look like, one that could easily be used at the Python te.compute expression level.

I was wondering if there is a strong need for in-place mutation. Introducing in-place operators is a bit troublesome and the gain is usually small, because most of those operations can be inlined. For example, in the conv2d+relu use case, we can fuse the relu into conv2d.
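As a simplified illustration of the inlining point (the elementwise stage B below stands in for conv2d, since a stage with a reduction would be fused with compute_at rather than compute_inline, but the effect is the same):

```python
# Simplified te sketch of inlining one stage into its consumer.
# `B` stands in for the conv2d output; after compute_inline, the
# ReLU is applied as the value is produced, with no extra buffer.
import tvm
from tvm import te

n = te.var("n")
A = te.placeholder((n,), name="A", dtype="float32")
B = te.compute((n,), lambda i: A[i] * 2.0, name="B")  # stand-in for conv2d
C = te.compute((n,), lambda i: te.max(B[i], tvm.tir.const(0.0, "float32")),
               name="relu")

s = te.create_schedule(C.op)
s[B].compute_inline()  # fold B's computation into the relu loop

# The lowered IR has a single loop computing max(A[i]*2, 0) directly.
print(tvm.lower(s, [A, C], simple_mode=True))
```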