Hello everyone, I have a question about compiling the PyTorch 1.9 retinanet_resnet50_fpn model, more specifically while trying to compile this line (github.com/pytorch/vision/blob/v0.10.0/torchvision/models/detection/_utils.py#L205).
Traced jit graph:
aten::slice: Tensor slice(const Tensor& self, int64_t dim, int64_t start, int64_t end, int64_t step)
%6064 : int = prim::Constant[value=0](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6065 : int = prim::Constant[value=0](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6066 : int = prim::Constant[value=9223372036854775807](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6067 : int = prim::Constant[value=1](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6068 : Float(0, 4, strides=[4, 1], requires_grad=0, device=cpu) = aten::slice(%rel_codes.1, %6064, %6065, %6066, %6067), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6069 : int = prim::Constant[value=1](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6070 : int = prim::Constant[value=0](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6071 : int = prim::Constant[value=9223372036854775807](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6072 : int = prim::Constant[value=4](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6073 : Float(0, 1, strides=[4, 4], requires_grad=0, device=cpu) = aten::slice(%6068, %6069, %6070, %6071, %6072), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
My understanding is that here we get an N-by-1 dx vector, which is later stacked with the other components in _utils.py#L223.
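For reference, the slicing at that line behaves like the following sketch (using NumPy to mimic the tensor indexing; the shapes are assumptions for illustration only):

```python
import numpy as np

# Stand-in for rel_codes: N anchors with 4 regression deltas each
# (N=8 chosen arbitrarily for illustration).
rel_codes = np.random.randn(8, 4)

# The slice at _utils.py#L205 takes every 4th column starting at 0,
# i.e. the dx components: rel_codes[:, 0::4].
dx = rel_codes[:, 0::4]

# The second dimension stays statically known: (8, 4) -> (8, 1).
print(dx.shape)  # (8, 1)
```

So with a static column count of 4, the result's second dimension is always 1, which is what the traced JIT graph above encodes.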
while the Relay graph generates this:
%1804 = adv_index(%1802) /* ty=Tensor[(?, 4), float32] */;
%1844 = where(%1839, %1835, %1838) /* ty=Tensor[(2), int32] */;
%1845 = cast(%1843, dtype="int64") /* ty=Tensor[(2), int64] */;
%1846 = dyn.strided_slice(%1804, %1844, %1845, meta[relay.Constant][88] /* ty=Tensor[(2), int32] */, begin=None, end=None, strides=None, axes=None) /* ty=Tensor[(?, ?), float32] */;
Later, the dimension that lost its static size causes an error while unbinding along what should be a static dimension at this line (github.com/pytorch/vision/blob/v0.10.0/torchvision/models/detection/transform.py#L287).
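To illustrate why unbind needs a static dimension, here is a rough NumPy sketch of the splitting the frontend does (a simplified analogue, not the actual TVM code):

```python
import numpy as np

def unbind_sketch(data, dim):
    # Rough analogue of the PyTorch-frontend unbind conversion: split the
    # tensor into shape[dim] pieces, then drop the split axis. This only
    # works when shape[dim] is a concrete int; a dynamic '?' (relay.Any)
    # has no length, which is what the traceback below complains about.
    n = data.shape[dim]
    return [np.squeeze(p, axis=dim) for p in np.split(data, n, axis=dim)]

parts = unbind_sketch(np.zeros((2, 3)), dim=0)
print(len(parts), parts[0].shape)  # 2 (3,)
```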
The error message is this:
in unbind, ishapes: (?, ?)
Traceback (most recent call last):
File "retinanet_test.py", line 110, in <module>
retina_net_lab()
File "retinanet_test.py", line 74, in retina_net_lab
mod, params = relay.frontend.from_pytorch(script_module, shape_list)
File "/home/ubuntu/neo-ai/tvm/python/tvm/relay/frontend/pytorch.py", line 3363, in from_pytorch
ret = converter.convert_operators(_get_operator_nodes(graph.nodes()), outputs, ret_name)[0]
File "/home/ubuntu/neo-ai/tvm/python/tvm/relay/frontend/pytorch.py", line 2785, in convert_operators
inputs, _get_input_types(op_node, outputs, default_dtype=self.default_dtype)
File "/home/ubuntu/neo-ai/tvm/python/tvm/relay/frontend/pytorch.py", line 2142, in unbind
res_split = _op.split(data, selections, dim)
File "/home/ubuntu/neo-ai/tvm/python/tvm/relay/op/transform.py", line 908, in split
ret_size = len(indices_or_sections) + 1
TypeError: object of type 'Any' has no len()
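The final TypeError is just Python's len() failing on a type that does not define __len__; a minimal stand-in (not the real relay Any class) reproduces the same message:

```python
class Any:
    """Minimal stand-in for a dynamic dimension type with no __len__."""
    pass

try:
    len(Any())
except TypeError as err:
    print(err)  # object of type 'Any' has no len()
```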
I wonder whether the dyn.strided_slice behavior is expected, and whether there is any workaround to enable this model. @masahi Thanks!