I have a graph which returns the output of ListConstruct. I get the following error when trying to convert it to Relay.
Traceback (most recent call last):
....
File "pytorch_to_relay.py", line 125, in compile_
mod, params = relay.frontend.from_pytorch(trace_model, inp_shape)
File "/tvm/python/tvm/relay/frontend/pytorch.py", line 2190, in from_pytorch
outputs, ret_name, convert_map, prelude)
File "/tvm/python/tvm/relay/frontend/pytorch.py", line 2078, in convert_operators
elif operator == "prim::ListConstruct" and _should_construct_dynamic_list(op_node):
File "/tvm/python/tvm/relay/frontend/pytorch.py", line 112, in _should_construct_dynamic_list
if is_used_by_list_add(filter(lambda use: use.user.kind() != "prim::Loop", uses)):
File "/tvm/python/tvm/relay/frontend/pytorch.py", line 85, in is_used_by_list_add
output_type = _get_node_type(use.user)
File "/tvm/python/tvm/relay/frontend/pytorch.py", line 1652, in _get_node_type
assert node.outputsSize() == 1
AssertionError
The node is a "prim::Return" node, whose outputsSize() is always 0. Is this expected behavior when you return a List[Tensor]?
masahi (July 23, 2020, 10:03pm)
Handling ListConstruct is tricky. In particular, returning ListConstruct as the output is not supported.
Since Relay cannot return a Python list, you shouldn't expect to get a Python list as output. If the output list is truly a variable-length, dynamic list, we can return a Relay List VM object. This requires using the VM runtime instead of the graph runtime. I haven't met this use case, so it is not implemented.
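For reference, here is a rough sketch of what running a model through the VM runtime looks like (hedged: the exact API depends on your TVM version, and mod, params, and inp are placeholders from an earlier compile step). The List-output path itself is the part that is not implemented yet:

import tvm
from tvm import relay

# compile for the VM executor instead of the graph runtime
exe = relay.vm.compile(mod, target="llvm", params=params)
vm = tvm.runtime.vm.VirtualMachine(exe, tvm.cpu(0))
# with a List ADT output (once supported), the result would come back as a
# VM object rather than a plain NDArray
result = vm.invoke("main", tvm.nd.array(inp))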
@masahi: Thanks for the response. So the only supported return type when you have multiple outputs is a Tuple?
masahi (July 23, 2020, 11:21pm)
Actually, our test cases don't cover the multiple-output case at all. For TupleConstruct we always return a Relay Tuple (see below), so I hope it would work.
If you have a good use case for multiple output models, we can add tests for them.
I think huggingface bert-base-uncased returns multiple outputs? Like a Tuple()?
masahi (July 23, 2020, 11:37pm)
ok, if the number of outputs is fixed, a tuple should be used.
The reason handling ListConstruct is tricky is that it is used for creating both a "static" list like padding=[1, 1] and a "dynamic" list to which a variable number of items can be appended. In the former case, we should pass the static list to Relay ops directly, but the latter case requires creating a Relay List ADT. The tricky part is how to distinguish the two cases.
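To illustrate the two cases with a hedged sketch (StaticPad and DynamicList are made-up example modules, not anything from our tests), both of the following produce prim::ListConstruct nodes in the TorchScript graph, but only the second one needs the List ADT:

import torch
import torch.nn.functional as F

class StaticPad(torch.nn.Module):
    def forward(self, x):
        # [1, 1, 1, 1] becomes a ListConstruct, but it is a fixed argument
        # that can be passed to the corresponding Relay op directly
        return F.pad(x, [1, 1, 1, 1])

class DynamicList(torch.nn.Module):
    def forward(self, xs):
        outputs = []
        for i in range(xs.size(0)):
            # the list grows at runtime (a list add), so it needs a Relay List ADT
            outputs += [xs[i] * 2]
        return torch.stack(outputs)

print(torch.jit.script(DynamicList()).graph)  # shows prim::ListConstruct plus the list add in the loop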
Cool, thanks. I will try Tuples since the number of outputs is defined at graph-parsing time. Just out of curiosity, do you have an example of how to use the Relay VM object (for the dynamic-list case in PyTorch)?
masahi (July 24, 2020, 12:21am)
Our LSTM tests cover dynamic models.
(Quoted file preview: "Tests on torch lstm model conversion", originally from https://github.com/pytorch/pytorch/blob/master/benchmarks/fastrnns/custom_lstms.py, described in https://pytorch.org/blog/optimizing-cuda-rnn-with-torchscript/)
I realized that the models there return a tuple (like return torch.stack(outputs), state), so the multiple-output case is actually tested.
Below is an example of how you can retrieve output tensors from a Relay Tuple object.
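Roughly, the idea looks like this (a hedged sketch; TwoOutputs and the input name "input0" are placeholders, and the exact build/runtime API depends on your TVM version). Each element of the Relay Tuple comes out of the graph runtime as a separate output slot:

import torch
import tvm
from tvm import relay
from tvm.contrib import graph_runtime

class TwoOutputs(torch.nn.Module):
    def forward(self, x):
        # traced as prim::TupleConstruct, converted to a Relay Tuple
        return x + 1.0, x * 2.0

inp = torch.rand(1, 3)
traced = torch.jit.trace(TwoOutputs().eval(), inp)
mod, params = relay.frontend.from_pytorch(traced, [("input0", (1, 3))])

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

rt = graph_runtime.GraphModule(lib["default"](tvm.cpu(0)))
rt.set_input("input0", inp.numpy())
rt.run()
out0 = rt.get_output(0).asnumpy()  # first element of the returned tuple
out1 = rt.get_output(1).asnumpy()  # second element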
masahi (July 24, 2020, 12:29am)
Also, if you want to support returning a list, you can modify the _should_construct_dynamic_list function (from which you got the error above), shown below.
def _convert_to_tensor_array(adt_lst, prelude):
    if prelude.length(adt_lst) == 0:
        return prelude.nil()

    checked_type = _infer_type_with_prelude(prelude.hd(adt_lst), prelude)
    shape = checked_type.shape
    tensor_array = _map_tensor_array_constructor(adt_lst, prelude, shape)
    return tensor_array, tuple(shape)


def _should_construct_dynamic_list(list_construct_node):
    # if this list is element-accessed or modified at runtime, generate List ADT
    def is_used_by_list_add(uses):
        for use in uses:
            op_name = use.user.kind()
            output_type = _get_node_type(use.user)
            if op_name in ["aten::add", "aten::add_"] and output_type == "ListType":
                return True
        return False

    def inplace_add_to_add(op_name):
For example, if this ListConstruct node is consumed by prim::Return, we should return True from this function. Then you'd get a List VM object as output.
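For instance, one way to express that check could be the following (a rough sketch of code that does not exist in the frontend; it assumes the standard TorchScript Node/Value Python bindings, and where exactly it hooks into _should_construct_dynamic_list is up to you):

def is_returned_directly(list_construct_node):
    # True if any consumer of the ListConstruct output is the graph's return node
    for use in list_construct_node.output().uses():
        if use.user.kind() == "prim::Return":
            return True
    return False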
This is helpful. In the meantime, I tried a tuple with two inputs. I get the following error. Just curious to know if this is right?
File "tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 216, in __call__
raise get_last_ffi_error()
[bt] (8) 9 libtvm.dylib 0x000000012f92fc44 std::__1::__function::__func<tvm::$_5, std::__1::allocator<tvm::$_5>, void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::operator()(tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&) + 84
[bt] (7) 8 libtvm.dylib 0x000000012f92fda7 void std::__1::__invoke_void_return_wrapper<void>::__call<tvm::$_5&, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*>(tvm::$_5&&&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&) + 167
[bt] (6) 7 libtvm.dylib 0x000000012f92fe62 tvm::$_5::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const + 178
[bt] (5) 6 libtvm.dylib 0x000000012f921c68 tvm::GenericFunc::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const + 1944
[bt] (4) 5 libtvm.dylib 0x000000012f0c0dd5 tvm::runtime::PackedFunc::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const + 85
[bt] (3) 4 libtvm.dylib 0x000000012f0c1cfb std::__1::function<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const + 155
[bt] (2) 3 libtvm.dylib 0x0000000130743149 std::__1::__function::__func<TVMFuncCreateFromCFunc::$_2, std::__1::allocator<TVMFuncCreateFromCFunc::$_2>, void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::operator()(tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&) + 73
[bt] (1) 2 libtvm.dylib 0x00000001307434b7 void std::__1::__invoke_void_return_wrapper<void>::__call<TVMFuncCreateFromCFunc::$_2&, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*>(TVMFuncCreateFromCFunc::$_2&&&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&) + 167
[bt] (0) 1 libtvm.dylib 0x0000000130743588 TVMFuncCreateFromCFunc::$_2::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const + 200
File "tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 78, in cfun
rv = local_pyfunc(*pyargs)
File "tvm/python/tvm/relay/op/strategy/x86.py", line 236, in dense_strategy_cpu
m, _ = inputs[0].shape
ValueError: too many values to unpack (expected 2)
masahi (July 24, 2020, 12:53am)
The error seems to be coming from relay.build, which means the PyTorch -> Relay conversion itself didn't raise any error. So yeah, it seems to be working.