[Bug Report][Unit Test Failed] Failed to run the tensorflow frontend test_forward_ssd unit test

Leslie-Fang · July 27, 2020, 1:58am

TVM 0.7.1[36a0bf94cf93c5d4b067ae4359b8807ae2dde2d2]

I failed to run the TF SSD unit test (test_forward_ssd) in test_forward.py [https://github.com/apache/incubator-tvm/blob/959cff1c786e0eb33b99007be66de61d2275d7a5/tests/python/frontend/tensorflow/test_forward.py#L3943]

python test_forward.py

The test passes if I use the default vm mode [https://github.com/apache/incubator-tvm/blob/959cff1c786e0eb33b99007be66de61d2275d7a5/tests/python/frontend/tensorflow/test_forward.py#L2418]

However, if I change the mode to graph_runtime, relay.build fails with this error message:

tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) /home/tvm/tvm/build/libtvm.so(tvm::relay::StorageAllocaInit::VisitExpr_(tvm::relay::CallNode const*)+0x94) [0x7fd83c6a88a4]
  [bt] (7) /home/tvm/tvm/build/libtvm.so(tvm::relay::ExprVisitor::VisitExpr(tvm::RelayExpr const&)+0x7b) [0x7fd83c72ba3b]
  [bt] (6) /home/tvm/tvm/build/libtvm.so(tvm::relay::ExprFunctor<void (tvm::RelayExpr const&)>::VisitExpr(tvm::RelayExpr const&)+0x5b) [0x7fd83c6ed65b]
  [bt] (5) /home/tvm/tvm/build/libtvm.so(tvm::relay::StorageAllocaInit::VisitExpr_(tvm::relay::CallNode const*)+0x94) [0x7fd83c6a88a4]
  [bt] (4) /home/tvm/tvm/build/libtvm.so(tvm::relay::ExprVisitor::VisitExpr(tvm::RelayExpr const&)+0x7b) [0x7fd83c72ba3b]
  [bt] (3) /home/tvm/tvm/build/libtvm.so(tvm::relay::ExprFunctor<void (tvm::RelayExpr const&)>::VisitExpr(tvm::RelayExpr const&)+0x5b) [0x7fd83c6ed65b]
  [bt] (2) /home/tvm/tvm/build/libtvm.so(tvm::relay::StorageAllocaInit::VisitExpr_(tvm::relay::CallNode const*)+0x1e) [0x7fd83c6a882e]
  [bt] (1) /home/tvm/tvm/build/libtvm.so(tvm::relay::StorageAllocaInit::CreateToken(tvm::RelayExprNode const*, bool)+0x61e) [0x7fd83c6a816e]
  [bt] (0) /home/tvm/tvm/build/libtvm.so(+0x268a567) [0x7fd83c6a6567]
  File "/home/tvm/tvm/src/relay/backend/graph_plan_memory.cc", line 160
TVMError: Check failed: ttype:

Leslie-Fang · July 28, 2020, 9:55am

@kevinthesun Hi Kevin, I saw your comments in this discussion: https://github.com/apache/incubator-tvm/issues/4845

You pointed out the support for TF object detection models there. Could you give me any hints about this issue?

Leslie-Fang · July 28, 2020, 2:07pm

After some initial debugging, it seems that in this case the failing node has type: TypeCallNode(GlobalTypeVar(static_tensor_float32_100_4_t, 5), [])

But the CreateToken function https://github.com/apache/incubator-tvm/blob/4d0fa8b591c448482b94745d1cfe485bb7d54526/src/relay/backend/graph_plan_memory.cc#L144-L167 only tries to cast the type to TupleTypeNode or TensorTypeNode. A TypeCallNode is neither of these, so in this case ttype is null and the check fails.
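To illustrate the failure mode, here is a toy Python model of that dispatch. It is only a sketch: the classes `TensorType`, `TupleType`, and `TypeCall` and the function `create_token` are hypothetical stand-ins written for this post, not TVM APIs, and the real logic lives in the C++ file linked above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TensorType:
    """Hypothetical stand-in for a tensor type (shape + dtype)."""
    shape: Tuple[int, ...]
    dtype: str


@dataclass
class TupleType:
    """Hypothetical stand-in for a tuple of types."""
    fields: List[object]


@dataclass
class TypeCall:
    """Hypothetical stand-in for an ADT type such as a static tensor wrapper."""
    func: str
    args: List[object] = field(default_factory=list)


def create_token(checked_type):
    """Return one (shape, dtype) token per tensor, or fail on any other type."""
    if isinstance(checked_type, TupleType):
        candidates = checked_type.fields
    else:
        candidates = [checked_type]

    tokens = []
    for t in candidates:
        # Plays the role of the "Check failed: ttype" assertion:
        # anything that is not a plain tensor type is rejected.
        if not isinstance(t, TensorType):
            raise TypeError(f"Check failed: ttype (got {type(t).__name__})")
        tokens.append((t.shape, t.dtype))
    return tokens


if __name__ == "__main__":
    print(create_token(TensorType((100, 4), "float32")))
    try:
        create_token(TypeCall("static_tensor_float32_100_4_t"))
    except TypeError as err:
        print(err)
```

A plain tensor or a tuple of tensors gets a token, while the ADT-style type falls through both cases and trips the check, which matches what the traceback shows.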

@tqchen Hi Tianqi, it seems PlanMemory was submitted by you in this PR: https://github.com/apache/incubator-tvm/pull/2120

Could you kindly give some advice?

kevinthesun · July 28, 2020, 5:40pm

Graph runtime doesn’t support TF OD (object detection) models. You need to use the VM.

Ruinhuang · September 8, 2020, 9:47am

Hi,

I hit the “Check failed” error before and fixed it by using the VM. But now I’m facing another error:

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
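For readers unfamiliar with this message: exit code 139 follows the common shell convention of 128 plus the signal number, so 139 - 128 = 11, which is SIGSEGV on Linux. A small sketch of that decoding, where `describe_exit_code` is a hypothetical helper name and only the standard-library `signal` module is used:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode a shell-style exit code, assuming the 128 + signal convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(139))
```

Since a segfault gives no Python traceback by default, calling `faulthandler.enable()` from the standard library at the top of the script may help show which Python line was executing when the crash happened.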

My test script is as follows:

    import tvm
    from tvm import relay
    import numpy as np
    import tensorflow as tf
    from tvm.relay.frontend.tensorflow_parser import TFParser
    import tvm.relay.testing.tf as tf_testing
    from tvm.runtime.vm import VirtualMachine
    
    # download from https://zenodo.org/record/3345892/files/tf_ssd_resnet34_22.1.zip?download=1
    model_path = '/home/ww/models/ssd/tf_ssd_resnet34_22.1/resnet34_tf.22.1.pb'
    target = tvm.target.cuda()
    ctx = tvm.gpu(0)
    
    with tf.gfile.GFile(model_path, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        graph = tf.import_graph_def(graph_def, name='')
        graph_def = tf_testing.ProcessGraphDefParam(graph_def)
    
    mod, params = relay.frontend.from_tensorflow(graph_def, layout='NCHW', shape=(1, 3, 1200, 1200))
    
    print('Build...')
    disabled_pass = ["FoldScaleAxis"]
    
    with tvm.transform.PassContext(opt_level=3, disabled_pass=disabled_pass):
        vm_exec = relay.vm.compile(mod, target="llvm", params=params)
    vm = VirtualMachine(vm_exec)
    
    data = np.random.uniform(0.0, 255.0, size=(1, 3, 1200, 1200))
    result = vm.invoke("main", key="image", value=data, **params)

Could you help me solve this problem?