Partial application of a runtime.Module

Hi,

I have a function that returns an implementation of a composite, written in TE, as a runtime module. Imagine the block looks something like this:

%0 = nn.conv2d(%x, meta[relay.Constant] #weights, ...)
%1 = nn.bias_add(%0, meta[relay.Constant] #biases ...)

and I have generated a custom implementation of this using TE. Now in this implementation, I set up placeholders for all the variables. For example, I have

weights = te.placeholder(weights_shape, name="weights")

which I then use in a TE expression implementing that composite block.

I end up with a list like this

te_list = [input, weights, bias, result]

that I can lower to a runtime module by doing

schedule = te.create_schedule(result.op)
runtime_mod = tvm.build(schedule, te_list, target="llvm")

Now my problem is the following. If I want to use the runtime_mod, I need to pass all the inputs including the weights and biases.

output = runtime_mod(data_arr, weights_arr, bias_arr)

However, I already know the values of the weights and bias at compile time, as shown in the pseudo-Relay above. What is the best way to bind these constant weights and bias to the runtime module? Is there a concept of partial application of a runtime module, where I could do something like the following?

def wrapped_runtime_mod(runtime_mod):
  weights_vals = const_weights_arr
  bias_vals = const_bias_arr
  # return a callable that only needs the data input
  return lambda data_arr: runtime_mod(data_arr, weights_vals, bias_vals)

such that I could then do

output = wrapped_runtime_mod(runtime_mod)(data_arr)
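
For illustration, the wrapping idea above can be sketched purely on the host side with a Python closure. This is a minimal sketch, not a TVM feature: `fake_runtime_mod` and `bind_constants` are hypothetical names, and the stand-in kernel works on plain Python lists so the example runs without TVM. It assumes the destination-passing convention, where the output buffer is passed as the last argument, matching the order of `te_list`.

```python
from typing import Callable, List

Vector = List[float]


def fake_runtime_mod(data: Vector, weights: Vector, bias: Vector, out: Vector) -> None:
    """Hypothetical stand-in for the built module.

    Computes out[i] = data[i] * weights[i] + bias[i] and writes into `out`,
    mimicking a kernel whose arguments follow [input, weights, bias, result].
    """
    for i in range(len(out)):
        out[i] = data[i] * weights[i] + bias[i]


def bind_constants(
    func: Callable[[Vector, Vector, Vector, Vector], None],
    weights: Vector,
    bias: Vector,
) -> Callable[[Vector, Vector], None]:
    """Return a callable that only needs the data input and the output buffer."""

    def bound(data: Vector, out: Vector) -> None:
        # The constants are captured by the closure and passed on every call.
        func(data, weights, bias, out)

    return bound


# Constants known ahead of time (made-up values for illustration).
const_weights = [2.0, 3.0]
const_bias = [0.5, -1.0]

run = bind_constants(fake_runtime_mod, const_weights, const_bias)

out = [0.0, 0.0]
run([1.0, 4.0], out)
print(out)
```

Note that this only hides the constants from the caller. The compiled function still receives them as ordinary arguments on every call, so nothing is folded into the generated code.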

TE doesn’t have the concept of a “constant tensor”, so we cannot bind a constant to a placeholder.

There is a pass in TIR that does this kind of binding: https://github.com/apache/tvm/blob/8146a9bf2c9288103c4af834a5f14f13b50aa3c8/src/tir/transforms/bind_params.cc#L87. In principle you can apply it to a prim func that’s lowered from your TE schedule. But this pass is not meant to be used this way, so you would probably need some hacks to get it to work.


Thanks for your reply. What I am trying now is to modify the Relay expression, replacing the constant function parameters with Relay variables, so that the number of arguments of the Relay function matches the number of te.Tensor variables in the TE compute function.

I think that the FuseOps pass does this normally, and I also noticed that the ARM ETHOS-U project does something similar here.
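
To make the idea of lifting constants into variables concrete, here is a toy sketch in plain Python. It does not use the Relay API: `Var`, `Const`, `Call`, and `lift_constants` are hypothetical stand-ins for a tiny expression tree, and the constant values are made up. The point is only to show the shape of the rewrite, where each constant becomes a fresh parameter and its value is recorded separately.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: tuple  # stand-in for constant tensor data


@dataclass(frozen=True)
class Call:
    op: str
    args: tuple


Expr = Union[Var, Const, Call]


def lift_constants(expr: Expr, bindings: List[Tuple[Var, tuple]]) -> Expr:
    """Replace every Const with a fresh Var, recording (var, value) in bindings."""
    if isinstance(expr, Const):
        var = Var(f"p{len(bindings)}")
        bindings.append((var, expr.value))
        return var
    if isinstance(expr, Call):
        # Rebuild the call with its arguments rewritten, left to right.
        return Call(expr.op, tuple(lift_constants(a, bindings) for a in expr.args))
    return expr  # plain variables are left untouched


# Toy version of the conv2d + bias_add block with embedded constants.
x = Var("x")
body = Call("nn.bias_add", (Call("nn.conv2d", (x, Const((1.0, 2.0)))), Const((0.5,))))

bindings: List[Tuple[Var, tuple]] = []
lifted = lift_constants(body, bindings)

# The function now takes one parameter per tensor: input, weights, bias.
params = [x] + [var for var, _ in bindings]
print([p.name for p in params])
```

After the rewrite, the parameter list has three entries, which lines up with the three input placeholders in the TE function, while `bindings` keeps the constant values available for whoever calls the compiled module.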