ONNX frontend giving an error when importing a simple model with TVM

Hi All.

I am having a problem importing a simple LeNet model (MNIST digit classification) with the ONNX front end. I am using the MNIST digit classification model from: https://github.com/onnx/models/blob/master/vision/classification/mnist/model/mnist-1.tar.gz

Here is the snippet of code that I am using:

import onnx
from onnx import numpy_helper
from tvm import relay

def lenet_onnx_model(onnxfile, inputfile, outputfile, dotfile):

    # load ONNX model
    onnxmodel = onnx.load_model(onnxfile)
    input = numpy_helper.to_array(onnx.load_tensor(inputfile))
    output = numpy_helper.to_array(onnx.load_tensor(outputfile))

    input_name = onnxmodel.graph.input[0].name
    shape_dict = {input_name: input.shape}

    # convert ONNX model to IR module. THIS LINE gives an error.
    mod, params = relay.frontend.from_onnx(onnxmodel, shape_dict)

The error occurs when I run `mod, params = relay.frontend.from_onnx(onnxmodel, shape_dict)`. The error is as follows:

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm 2020.1.2\plugins\python\helpers\pydev\_pydevd_bundle\pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "", line 1, in
  File "C:\repos\tvm23\tvm\python\tvm\relay\frontend\onnx.py", line 2748, in from_onnx
    mod, params = g.from_onnx(graph, opset, freeze_params)
  File "C:\repos\tvm23\tvm\python\tvm\relay\frontend\onnx.py", line 2555, in from_onnx
    op = self._convert_operator(op_name, inputs, attr, opset)
  File "C:\repos\tvm23\tvm\python\tvm\relay\frontend\onnx.py", line 2663, in _convert_operator
    sym = convert_map[op_name](inputs, attrs, self._params)
  File "C:\repos\tvm23\tvm\python\tvm\relay\frontend\onnx.py", line 268, in _impl_v1
    input_shape = infer_shape(data)
  File "C:\repos\tvm23\tvm\python\tvm\relay\frontend\common.py", line 501, in infer_shape
    out_type = infer_type(inputs, mod=mod)
  File "C:\repos\tvm23\tvm\python\tvm\relay\frontend\common.py", line 482, in infer_type
    new_mod = _transform.InferType()(new_mod)
  File "C:\repos\tvm23\tvm\python\tvm\ir\transform.py", line 127, in __call__
    return _ffi_transform_api.RunPass(self, mod)
  File "C:\repos\tvm23\tvm\python\tvm\_ffi\ctypes\packed_func.py", line 237, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  File "C:\repos\tvm23\tvm\src\relay\analysis\type_solver.cc", line 622
TVMError:

An internal invariant was violated during the execution of TVM. Please read TVM's error reporting guidelines. More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.

Check failed: false == false: [13:50:39] C:\repos\tvm23\tvm\src\relay\op\type_relations.cc:107:

An internal invariant was violated during the execution of TVM. Please read TVM's error reporting guidelines. More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.

Check failed: t0->dtype == t1->dtype (int64 vs. int32) :

Any suggestions on how to solve this problem?

Looks like you are trying to use arrays of different types. Could you try switching your numpy arrays from int64 to int32?
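Something like this one-liner is what I mean (my own sketch, assuming `input` is the array you load from the .pb file, and only relevant if the mismatch really comes from that data):

# Hypothetical: force an int32 dtype on the loaded test tensor before building shape_dict.
input = input.astype("int32")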


This looks like a possible bug in the importer. During import, it runs type inference to determine the shape of the data coming into a Pool op, but that type inference finds, somewhere in your model, a place where you are attempting to do an elementwise broadcast-style operation on two inputs with different datatypes. I'm wondering if some op higher up in your graph has a hardcoded datatype that doesn't match the ONNX file.
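To make that failure mode concrete, here is a minimal standalone sketch (my own, not derived from your model) that trips the same check in type_relations.cc:

import numpy as np
import tvm
from tvm import relay

# Two constants with different integer dtypes feeding a broadcast op.
a = relay.const(np.array([1, 2], dtype="int64"))
b = relay.const(np.array([1, 2], dtype="int32"))
mod = tvm.IRModule.from_expr(relay.add(a, b))

# Type inference fails here with the same invariant as in the traceback above:
# "Check failed: t0->dtype == t1->dtype (int64 vs. int32)"
mod = relay.transform.InferType()(mod)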


I can't get it to fail without the input file/output file stuff. Can you share the input files you're using?

import onnx
from tvm import relay

onnxfile="mnist/model.onnx"
# load ONNX model
onnxmodel =  onnx.load_model(onnxfile)

mod, params = relay.frontend.from_onnx(onnxmodel)
print("hi!")

Hi @mbrookhart,

The input ONNX file and test data are here: https://github.com/onnx/models/blob/master/vision/classification/mnist/model/mnist-1.tar.gz

Thank you very much.

This also passes:

import onnx
from onnx import numpy_helper
from tvm import relay

onnxfile="mnist/model.onnx"
# load ONNX model
onnxmodel =  onnx.load_model(onnxfile)

input_tensor = numpy_helper.to_array(onnx.load_tensor("mnist/test_data_set_0/input_0.pb"))

input_name = onnxmodel.graph.input[0].name
shape_dict = {input_name: input_tensor.shape}

# convert ONNX model to IR module. THIS LINE gives an error.
mod, params = relay.frontend.from_onnx(onnxmodel, shape_dict)
print(mod)

What version of TVM are you using?

@mbrookhart - I just tried your example, and I got the same error.

My TVM is fairly recent (I updated my TVM on Dec 01, 2020). I believe the specific commit is this one: https://github.com/apache/tvm/commit/fe4c66b2e50035ab2701923d6a2cd0cb82e63780

Should I update my TVM to the latest version?

Can you give it a try? There may have been a bug fix; I was on today's main. If it still fails there, maybe we have a configuration difference.

Hi @mbrookhart,

I finally got a chance to work on this again. I am still having the same problem, even with the latest TVM version, which I just compiled for this test.

Here is my setup.

OS: Windows, 64-bit
TVM: the latest version (as of yesterday). Specific commit: https://github.com/apache/tvm/commit/e3b2984ac2e5274a1d3fe1cdcb86d5dffe04066b

Here is the code: the same code as earlier in this thread. I tried to repost it here, but the formatting did not work.

Any suggestions will be really appreciated.

Hmm. The code you originally posted isn't complete, so I can't run it. I just tested this commit with the script I posted 22 days ago, and it still passes for me.

I’m on Ubuntu 20.04, my only change from the default cmake config is to enable CUDA. Perhaps it’s an issue with Windows? I don’t have a Windows machine handy to test on.

@jmatai1 Any chance you can post a more complete version of the script you’re calling?

@rkimball any interest in trying to test this on Windows?


Here is the complete code. BTW, it seems to be a Windows problem: I just set up a Linux machine and tested the same code there, and it works. Not sure how to fix it.

import onnx
from onnx import numpy_helper
from tvm import relay

onnxfile="mnist/model.onnx"
# load ONNX model
onnxmodel =  onnx.load_model(onnxfile)

input_tensor = numpy_helper.to_array(onnx.load_tensor("mnist/test_data_set_0/input_0.pb"))

input_name = onnxmodel.graph.input[0].name
shape_dict = {input_name: input_tensor.shape}

# convert ONNX model to IR module. THIS LINE gives an error.
#dtype = 'float32'
mod, params = relay.frontend.from_onnx(onnxmodel, shape_dict)
print(mod)

@jmatai1 after a bit of a struggle to get all the dependencies installed on Windows, I am able to reproduce your error. The script works fine on Linux and I get the same error on Windows. I will see if we can figure out what is going on.

@jmatai1 I tracked the issue down to `_op.const` not respecting the requested datatype on Windows. An example is this line in tvm\relay\frontend\onnx.py, in the function autopad:

strides = _op.const(np.array(strides), dtype="int64")

The datatype should be int64 for strides, but it ends up being int32. A quick and dirty workaround is to add .astype("int64") to the 3 instances of np.array in the function autopad. I made a branch on my fork if you want to see the complete workaround: https://github.com/apache/tvm/compare/main...rkimball:bob/op_const_type_hack?expand=1

This is not a fix, just a workaround. A proper fix is to get _op.const to honor the requested dtype passed in. I will debug that now, but I did want to pass on a quick workaround.
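For reference, this is roughly what the workaround looks like inside autopad (a sketch of my branch rather than the exact diff):

# In autopad() in python/tvm/relay/frontend/onnx.py: cast the numpy array
# explicitly so the constant is int64 even if _op.const ignores the requested
# dtype; the same .astype("int64") goes on the other two np.array calls.
strides = _op.const(np.array(strides).astype("int64"), dtype="int64")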

https://github.com/apache/tvm/pull/7285 should address the issue
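Once it is in, a quick sanity check (my own sketch) is to confirm that a constant built from a default numpy integer array reports the dtype you asked for:

import numpy as np
from tvm import relay

# On Windows, np.array([1, 2, 3]) defaults to int32; the constant should still
# honor the requested dtype after the fix.
c = relay.const(np.array([1, 2, 3]), dtype="int64")
print(c.data.dtype)  # expect "int64"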

Hi @rkimball, thanks a lot.

Is this merged into the main branch? I do not see the commit there yet. Once it is in main, I will test it.

It was not in main as of this morning; hopefully it can merge today. If you want, you can try it from my branch, or if there is no hurry, wait for the merge.

@jmatai1 Merged to main, so go ahead and give it a try.

Thanks @rkimball, it works perfectly!