Hello everyone, I have a question about compiling the PyTorch 1.9 retinanet_resnet50_fpn model, more specifically while trying to compile this line (github.com/pytorch/vision/blob/v0.10.0/torchvision/models/detection/_utils.py#L205).
Traced jit graph:
aten::slice: Tensor slice(const Tensor& self, int64_t dim, int64_t start, int64_t end, int64_t step)
%6064 : int = prim::Constant[value=0](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6065 : int = prim::Constant[value=0](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6066 : int = prim::Constant[value=9223372036854775807](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6067 : int = prim::Constant[value=1](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6068 : Float(0, 4, strides=[4, 1], requires_grad=0, device=cpu) = aten::slice(%rel_codes.1, %6064, %6065, %6066, %6067), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6069 : int = prim::Constant[value=1](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6070 : int = prim::Constant[value=0](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6071 : int = prim::Constant[value=9223372036854775807](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6072 : int = prim::Constant[value=4](), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
%6073 : Float(0, 1, strides=[4, 4], requires_grad=0, device=cpu) = aten::slice(%6068, %6069, %6070, %6071, %6072), scope: __module.model # /home/ubuntu/.local/lib/python3.6/site-packages/torchvision/models/detection/_utils.py:205:0
My understanding is that here we get an N-by-1 dx vector, which is later stacked with the other components in _utils.py#L223.
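For reference, the slicing at that line behaves like the following sketch (using NumPy to mimic the tensor indexing; the shapes are assumptions for illustration only):

```python
import numpy as np

# Stand-in for rel_codes: N anchors with 4 regression deltas each
# (N=8 chosen arbitrarily for illustration).
rel_codes = np.random.randn(8, 4)

# The slice at _utils.py#L205 takes every 4th column starting at 0,
# i.e. the dx components: rel_codes[:, 0::4].
dx = rel_codes[:, 0::4]

# The second dimension stays statically known: (8, 4) -> (8, 1).
print(dx.shape)  # (8, 1)
```

So with a static column count of 4, the result's second dimension is always 1, which is what the traced JIT graph above encodes.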
while the Relay graph generates this:
%1804 = adv_index(%1802) /* ty=Tensor[(?, 4), float32] */;
%1844 = where(%1839, %1835, %1838) /* ty=Tensor[(2), int32] */;
%1845 = cast(%1843, dtype="int64") /* ty=Tensor[(2), int64] */;
%1846 = dyn.strided_slice(%1804, %1844, %1845, meta[relay.Constant][88] /* ty=Tensor[(2), int32] */, begin=None, end=None, strides=None, axes=None) /* ty=Tensor[(?, ?), float32] */;
Later, the dimension that lost its static size causes an error while unbinding along what should be a static dimension at this line (github.com/pytorch/vision/blob/v0.10.0/torchvision/models/detection/transform.py#L287).
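To illustrate why unbind needs a static dimension, here is a rough NumPy sketch of the splitting the frontend does (a simplified analogue, not the actual TVM code):

```python
import numpy as np

def unbind_sketch(data, dim):
    # Rough analogue of the PyTorch-frontend unbind conversion: split the
    # tensor into shape[dim] pieces, then drop the split axis. This only
    # works when shape[dim] is a concrete int; a dynamic '?' (relay.Any)
    # has no length, which is what the traceback below complains about.
    n = data.shape[dim]
    return [np.squeeze(p, axis=dim) for p in np.split(data, n, axis=dim)]

parts = unbind_sketch(np.zeros((2, 3)), dim=0)
print(len(parts), parts[0].shape)  # 2 (3,)
```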
The error message is this:
in unbind, ishapes: (?, ?)
Traceback (most recent call last):
File "retinanet_test.py", line 110, in <module>
retina_net_lab()
File "retinanet_test.py", line 74, in retina_net_lab
mod, params = relay.frontend.from_pytorch(script_module, shape_list)
File "/home/ubuntu/neo-ai/tvm/python/tvm/relay/frontend/pytorch.py", line 3363, in from_pytorch
ret = converter.convert_operators(_get_operator_nodes(graph.nodes()), outputs, ret_name)[0]
File "/home/ubuntu/neo-ai/tvm/python/tvm/relay/frontend/pytorch.py", line 2785, in convert_operators
inputs, _get_input_types(op_node, outputs, default_dtype=self.default_dtype)
File "/home/ubuntu/neo-ai/tvm/python/tvm/relay/frontend/pytorch.py", line 2142, in unbind
res_split = _op.split(data, selections, dim)
File "/home/ubuntu/neo-ai/tvm/python/tvm/relay/op/transform.py", line 908, in split
ret_size = len(indices_or_sections) + 1
TypeError: object of type 'Any' has no len()
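The final TypeError is just Python's len() failing on a type that does not define __len__; a minimal stand-in (not the real relay Any class) reproduces the same message:

```python
class Any:
    """Minimal stand-in for a dynamic dimension type with no __len__."""
    pass

try:
    len(Any())
except TypeError as err:
    print(err)  # object of type 'Any' has no len()
```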
I wonder whether the dyn.strided_slice behavior is expected, and whether there is any workaround to enable this model. @masahi Thanks!