Unused input tensors in LSTM C code

I generate C code for a minimal network with a single LSTM layer using the following code:

import tvm
import onnx
from tvm import relay
import json

# Load the ONNX model and convert it to a Relay module.
model = onnx.load("lstm.onnx")
mod, params = relay.frontend.from_onnx(model, {"lstm_input": (1, 2, 3)})
c_target = "c --link-params"

# Build for the C backend; vectorization must be disabled because
# the C codegen cannot emit vectorized loops.
with tvm.transform.PassContext(
        config={"tir.disable_vectorize": True}):
    module = relay.build(mod, target=c_target, params=params, target_host=c_target)

c_source = module.lib.get_source()
graph = module.get_json()

graph_dict = json.loads(graph)
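
The generated C source and graph JSON can then be written to disk for inspection; a small sketch (the file names here are arbitrary):

# Dump the generated artifacts to files for easier inspection.
with open("lstm_lib.c", "w") as f:
    f.write(c_source)
with open("lstm_graph.json", "w") as f:
    f.write(graph)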

The resulting C code contains several functions that do not use all of their input tensors. For example:

TVM_DLL int32_t fused_squeeze_3(void* args, void* arg_type_ids, int32_t num_args, void* out_ret_value, void* out_ret_tcode, void* resource_handle) {
  void* arg0 = (((TVMValue*)args)[0].v_handle);
  int32_t arg0_code = ((int32_t*)arg_type_ids)[(0)];
  void* arg1 = (((TVMValue*)args)[1].v_handle);
  int32_t arg1_code = ((int32_t*)arg_type_ids)[(1)];
  void* arg2 = (((TVMValue*)args)[2].v_handle);
  int32_t arg2_code = ((int32_t*)arg_type_ids)[(2)];
  void* placeholder = (((DLTensor*)arg0)[0].data);
  void* arg0_shape = (((DLTensor*)arg0)[0].shape);
  void* arg0_strides = (((DLTensor*)arg0)[0].strides);
  int32_t dev_id = (((DLTensor*)arg0)[0].device.device_id);
  void* placeholder1 = (((DLTensor*)arg1)[0].data);
  void* arg1_shape = (((DLTensor*)arg1)[0].shape);
  void* arg1_strides = (((DLTensor*)arg1)[0].strides);
  void* T_squeeze = (((DLTensor*)arg2)[0].data);
  void* arg2_shape = (((DLTensor*)arg2)[0].shape);
  void* arg2_strides = (((DLTensor*)arg2)[0].strides);
  if (!(arg0_strides == NULL)) {
  }
  if (!(arg1_strides == NULL)) {
  }
  if (!(arg2_strides == NULL)) {
  }
  for (int32_t ax1_inner = 0; ax1_inner < 3; ++ax1_inner) {
    ((float*)T_squeeze)[(ax1_inner)] = ((float*)placeholder1)[(ax1_inner)];
  }
  return 0;
}

where placeholder is unused. The corresponding entry in graph.json reads:

  {
      "op": "tvm_op", 
      "name": "fused_squeeze_3", 
      "attrs": {
        "num_outputs": "1", 
        "num_inputs": "2", 
        "flatten_data": "0", 
        "func_name": "fused_squeeze_3"
      }, 
      "inputs": [
        [
          1, 
          0, 
          0
        ], 
        [
          1, 
          1, 
          0
        ]
      ]
    }, 

So placeholder and placeholder1 get the two output tensors of the preceding layer, but only one of them is used. What is the reason for this?
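
To see how each fused function is wired, one can walk the graph JSON built above. A small sketch, assuming graph_dict from the script and the [producer_node, output_index, version] input triples shown in the excerpt:

for node_id, node in enumerate(graph_dict["nodes"]):
    if node.get("op") == "tvm_op":
        # Each entry in "inputs" is a [producer_node, output_index, version] triple.
        print(node_id, node["name"], "inputs:", node.get("inputs", []))

For fused_squeeze_3 this prints both [1, 0, 0] and [1, 1, 0], i.e. both outputs of node 1 are passed in, even though the loop body only reads placeholder1.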

Hi @SebastianBoblestETAS,

Sadly I can't help you with your problem, but I notice that the C function you posted implements a squeeze. I don't want to hijack your thread, but it made me wonder:

Why does a squeeze function need to be implemented at all, especially as a series of copy operations?

Squeeze just “simplifies” the dimensionality of a tensor by dropping the dimensions of size 1. This should not require a series of copy operations.

Changing the “shape” attribute of the DLTensor would suffice, wouldn’t it?
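
For comparison, NumPy handles squeeze exactly this way: it returns a metadata-only view that shares the original buffer. A minimal illustration:

import numpy as np

a = np.zeros((1, 1, 3), dtype="float32")
b = np.squeeze(a)                   # drops the size-1 dimensions

print(b.shape)                      # (3,)
print(np.shares_memory(a, b))       # True: no data was copied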

Can anyone also address this question?