Hi,
I have an ONNX model that I'm trying to run in C, following the static option of the bundle deploy example. However, the results differ between Python and C (Python produces the correct results).
Following the example, I generate the graph, lib and params in Python, then load them in C. Extremely simple models that only do slicing work. With a slightly more complex model, however, the results differ greatly between Python and C. In addition, changing opt_level to 0 changes the output in C, but it is still wrong.
Generating graph, lib, params:
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)
with tvm.transform.PassContext(opt_level=3, config={"tir.disable_vectorize": True}):
    target = tvm.target.Target("llvm --runtime=c --system-lib")
    g_json, mmod, params = relay.build(mod, target=target, params=params)

# save artifacts
bin_params = tvm.runtime.save_param_dict(params)
lib_file_name = os.path.join(build_dir, file_format_str.format(name=model_name, ext="tar"))
mmod.export_library(lib_file_name)
with open(
    os.path.join(build_dir, file_format_str.format(name=model_name + "_graph", ext="json")), "w"
) as f_graph_json:
    f_graph_json.write(g_json)
with open(
    os.path.join(build_dir, file_format_str.format(name=model_name + "_params", ext="bin")), "wb"
) as f_params:
    f_params.write(bin_params)
Loading in C:
char* json_data = (char*)(build_dvt_graph_c_json);
char* params_data = (char*)(build_dvt_params_c_bin);
uint64_t params_size = build_dvt_params_c_bin_len;
// more input and output config here, basically the same as in the example except for sizes
void* handle = tvm_runtime_create(json_data, params_data, params_size, argv[0]);
tvm_runtime_set_input(handle, "0", &input);
tvm_runtime_run(handle);
tvm_runtime_get_output(handle, 0, &output);
tvm_runtime_destroy(handle);
I used the debugger and this is the log for Python:
[17:10:44] /home/sapire/git/tvm_r/src/runtime/graph_executor/debug/graph_executor_debug.cc:103: Iteration: 0
[17:10:44] /home/sapire/git/tvm_r/src/runtime/graph_executor/debug/graph_executor_debug.cc:108: Op #0 fused_strided_slice: 0.198826 us/iter
[17:10:44] /home/sapire/git/tvm_r/src/runtime/graph_executor/debug/graph_executor_debug.cc:108: Op #1 fused_mean: 0.908446 us/iter
[17:10:44] /home/sapire/git/tvm_r/src/runtime/graph_executor/debug/graph_executor_debug.cc:108: Op #2 fused_strided_slice_1: 0.355717 us/iter
[17:10:44] /home/sapire/git/tvm_r/src/runtime/graph_executor/debug/graph_executor_debug.cc:108: Op #3 fused_mean_1: 1.0394 us/iter
[17:10:44] /home/sapire/git/tvm_r/src/runtime/graph_executor/debug/graph_executor_debug.cc:108: Op #4 fused_strided_slice_2: 0.805902 us/iter
[17:10:44] /home/sapire/git/tvm_r/src/runtime/graph_executor/debug/graph_executor_debug.cc:108: Op #5 fused_mean_2: 7.23777 us/iter
[17:10:44] /home/sapire/git/tvm_r/src/runtime/graph_executor/debug/graph_executor_debug.cc:108: Op #6 fused_take_concatenate_strided_slice_reshape_squeeze_subtract_strided_slice_abs: 0.108174 us/iter
[17:10:44] /home/sapire/git/tvm_r/src/runtime/graph_executor/debug/graph_executor_debug.cc:108: Op #7 fused_sum: 0.0805326 us/iter
[17:10:44] /home/sapire/git/tvm_r/src/runtime/graph_executor/debug/graph_executor_debug.cc:108: Op #8 fused_multiply_divide: 0.0795978 us/iter
When looking at the output of Op #1 fused_mean, there’s already a difference between Python and C. (I also turned on TVM_CRT_DEBUG, the same operators are called in C.)
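To pin down where the two runs first diverge, the per-op outputs can be compared element by element. Below is a minimal sketch of that check, assuming each side dumps the op output as raw float32 values in native byte order. The helper names `load_f32` and `first_mismatch` and the tolerances are my own choices for illustration, not part of TVM.

```python
import array
import math


def load_f32(path):
    """Read a raw float32 dump (native byte order) into a list of floats."""
    values = array.array("f")
    with open(path, "rb") as f:
        # Raises ValueError if the file size is not a multiple of 4 bytes.
        values.frombytes(f.read())
    return values.tolist()


def first_mismatch(ref, got, rtol=1e-5, atol=1e-6):
    """Return (index, ref_value, got_value) for the first differing element, or None."""
    if len(ref) != len(got):
        raise ValueError(f"length mismatch: {len(ref)} vs {len(got)}")
    for i, (r, g) in enumerate(zip(ref, got)):
        # NaN never compares close, so a NaN on either side is reported as a mismatch.
        if not math.isclose(r, g, rel_tol=rtol, abs_tol=atol):
            return i, r, g
    return None
```

With hypothetical dump files, `first_mismatch(load_f32("py_op1.bin"), load_f32("c_op1.bin"))` would report the first element where the C output of fused_mean departs from the Python one. A length mismatch, or a mismatch at index 0, would point toward a shape, dtype, or layout difference in how the tensors are set up rather than a numerical difference inside the operator.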
Any help with this issue would be much appreciated.