Error in TVMC run: AttributeError: <class 'tvm.runtime.container.ADT'> has no attribute numpy

Hello,

I have converted an ONNX model into TVM relay format using TVMC:

python -m tvm.driver.tvmc compile --target "llvm" --input-shapes "image:[1, 1, 256, 192] rotation_normalized_to_world:[1, 3, 3] principal_point_normalized:[1, 2] focal_length_normalized:[1, 2]" --output models/gaze_mode_prune_quant.tar models/model_prune_quant.onnx 

WARNING:autotvm:One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.

I had to use “vm” as executor.

But when I try to run inference using the model with some input data in npz format I get this error:

tvmc run --inputs models/gazenet_input_tvm.npz --output models/predictions.npz models/gaze_mode_prune_quant.tar 
2023-07-12 11:11:39.116 INFO load_module /tmp/tmp3j72vong/lib.so
Traceback (most recent call last):
  File "/home/jv1941/workspace/tvm/tvm_python_venv/bin/tvmc", line 33, in <module>
    sys.exit(load_entry_point('tvm==0.13.dev296+g588d1f2e9', 'console_scripts', 'tvmc')())
  File "/home/jv1941/workspace/tvm/python/tvm/driver/tvmc/main.py", line 118, in main
    sys.exit(_main(sys.argv[1:]))
  File "/home/jv1941/workspace/tvm/python/tvm/driver/tvmc/main.py", line 106, in _main
    return args.func(args)
  File "/home/jv1941/workspace/tvm/python/tvm/driver/tvmc/runner.py", line 282, in drive_run
    result = run_module(
  File "/home/jv1941/workspace/tvm/python/tvm/driver/tvmc/runner.py", line 648, in run_module
    outputs[output_name] = val.numpy()
  File "/home/jv1941/workspace/tvm/python/tvm/runtime/object.py", line 75, in __getattr__
    raise AttributeError(f"{type(self)} has no attribute {name}") from None
AttributeError: <class 'tvm.runtime.container.ADT'> has no attribute numpy

It seems to fail while converting the inference output to numpy format. Any idea why that is happening?

/Johan

I have the same problem

I'm not sure why it fails, but to add some context: tvmc converts each output to numpy so that it can serialise the results to a file, using numpy.savez as the exchange format.

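The fundamental reason this happens is that ML models are becoming more diverse in their inputs and outputs. In this case the model has a tuple output, or a nested tuple. As the traceback shows, tvmc's run_module calls .numpy() on each output, and the ADT container that represents the tuple has no such method.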
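To illustrate the tuple case, here is a minimal sketch of flattening a nested tuple output into named arrays before saving. It is not tvmc's code: FakeTensor, flatten_outputs and the "output_i_j" key naming are hypothetical, and it assumes only that tensors expose numpy() while tuple containers support len() and integer indexing.

```python
class FakeTensor:
    """Hypothetical stand-in for a TVM NDArray so the sketch runs without TVM."""

    def __init__(self, data):
        self.data = data

    def numpy(self):
        return self.data


def flatten_outputs(value, prefix="output"):
    """Flatten a tensor or a nested tuple of tensors into {name: array}."""
    # Leaf: anything exposing numpy() is treated as a tensor.
    if hasattr(value, "numpy"):
        return {prefix: value.numpy()}
    # Container (assumed tuple-like): index each field and recurse.
    flat = {}
    for i in range(len(value)):
        flat.update(flatten_outputs(value[i], f"{prefix}_{i}"))
    return flat


# A nested tuple output: ((a, b), c)
nested = ((FakeTensor([1.0]), FakeTensor([2.0])), FakeTensor([3.0]))
flat = flatten_outputs(nested)
print(sorted(flat))  # ['output_0_0', 'output_0_1', 'output_1']
```

With real outputs, the resulting dict could then be written with numpy.savez(path, **flat), since every value is a plain array.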

Putting this into a broader context, the input/output types of ML models are becoming richer. A single flow that assumes inputs and outputs take a more restricted form works for some cases, but not all. For example, LLM inference requires passing in tuples of KV cache objects and running the model in a loop that interacts with other components.

That is why a Universal Deployment Runtime and an open, interoperable and integrated approach become much more important; see the reference below.

For a model that does have richer inputs/outputs, the best course of action is to enable such a universal deployment runtime interface: a Python API to build the model, plus runtime APIs (in the languages of interest) to interact with the result. That allows us to handle a richer set of inputs/outputs at runtime.