[TVM CustomOP] NaN output when running inference on an ONNX model with TVM

Hello, I am working on a personal project where I have a neural network with a custom softmax layer. The model is written in PyTorch, and I exported it to ONNX so I could compile it with TVM.
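For context, the export step looks roughly like the sketch below. The network here is a minimal placeholder for my real model, and I have omitted the symbolic registration that maps the custom layer to the `MyDomain::CustomSoftmax` node:

```python
import torch
import torch.nn as nn

# Minimal stand-in for the real network (placeholder; the actual model ends
# in the custom softmax layer that exports as MyDomain::CustomSoftmax).
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(3 * 32 * 32, 10)

    def forward(self, x):
        return torch.softmax(self.fc(x.flatten(1)), dim=-1)

model = TinyNet().eval()
dummy = torch.randn(1, 3, 32, 32)
torch.onnx.export(model, dummy, "custom_softmax_model.onnx",
                  input_names=["input.1"], opset_version=7)
```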

Since TVM does not have a native implementation of my custom operator, I registered the operator with Relay by following the TVM documentation, and everything went fine. I ran the unit tests and they all passed.
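Abbreviated, the compute behind the op boils down to a naive softmax, exp(x) / sum(exp(x)) over the last axis. Here is a standalone TE/TOPI sketch of that kernel (illustrative only; the Relay op registration and type-relation boilerplate from the docs are omitted):

```python
import numpy as np
import tvm
from tvm import te, topi

# Standalone sketch of a naive softmax kernel: exp(x) / sum(exp(x)) over
# the last axis, with no max-subtraction.
n, c = 1, 10
x = te.placeholder((n, c), name="x", dtype="float32")
e = topi.exp(x)
s = topi.sum(e, axis=-1, keepdims=True)
y = topi.divide(e, s)

sched = te.create_schedule(y.op)
fn = tvm.build(sched, [x, y], target="llvm")

# Quick numerical check on random data
dev = tvm.cpu()
a = tvm.nd.array(np.random.rand(n, c).astype("float32"), dev)
b = tvm.nd.empty((n, c), "float32", dev)
fn(a, b)
print(b.numpy().sum(axis=-1))  # each row should sum to ~1.0
```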

When I ran an inference script on the ONNX model with TVM, I got the following output:

```
UserWarning: No opset import for domain 'MyDomain'

==> Context: Bad node spec for node. Name: /Softmax OpType: CustomSoftmax
  warnings.warn(str(e))
One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.
Tensor(shape=[1, 10], op.name=p0)
infer_custom_onnx_model.py:49: DeprecationWarning: legacy graph executor behavior of producing json / lib / params will be removed in the next release. Please see documents of tvm.contrib.graph_executor.GraphModule for the new recommended usage.
  graph, lib, tvm_params = relay.build(mod,target, params = params)
[[nan nan nan nan nan nan nan nan nan nan]]
```

I am using ONNX v7 and TVM v0.13.0. The full inference script is attached below.

INFERENCE SCRIPT:

```python
import numpy as np
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Load the ONNX model
onnx_model_path = 'custom_softmax_model.onnx'
onnx_model = onnx.load(onnx_model_path)

# Target and device
target = "llvm"
tvm_ctx = tvm.cpu()

# Compile the Relay module
mod, params = relay.frontend.from_onnx(onnx_model)
with tvm.transform.PassContext(opt_level=3):
    graph, lib, tvm_params = relay.build(mod, target=target, params=params)

# Create the graph executor and bind the compiled parameters
tvm_module = graph_executor.create(graph, lib, tvm_ctx)
tvm_module.set_input(**tvm_params)

# Random input matching the model's expected shape
input_shape = (1, 3, 32, 32)
input_data = np.random.rand(*input_shape).astype(np.float32)
tvm_module.set_input('input.1', tvm.nd.array(input_data))

# Running inference
tvm_module.run()
output = tvm_module.get_output(0).numpy()
print(output)
```
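To sanity-check the numbers, I compare against a plain NumPy softmax. `np_softmax` is just my own reference helper, not a TVM or ONNX API; note it uses the max-subtraction trick to keep `exp()` finite:

```python
import numpy as np

# Reference softmax (my own helper, not a TVM/ONNX API). Subtracting the
# row max before exp() prevents overflow to inf, and thus inf/inf = nan.
def np_softmax(x, axis=-1):
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

logits = np.random.rand(1, 10).astype(np.float32)
ref = np_softmax(logits)
print(ref, ref.sum(axis=-1), np.isnan(ref).any())  # finite, sums to ~1, False
```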

Can someone please help? I don't understand why I am getting NaNs back.

@jwfromm @junrushao @thierry @alopez_13