How params are converted and stored in lib.get_params()

Assume we have an expression constructed as follows

import numpy as np
import tvm
from tvm import relay

w_bit = 32
x = relay.var("input", shape=[7], dtype=f"int{w_bit}")
w = relay.var("x2", shape=[2], dtype=f"int{w_bit}")
zx = relay.var("x3", shape=[3], dtype=f"int{w_bit}")
zy = relay.var("x4", shape=[4], dtype=f"int{w_bit}")
e_scale = relay.var("x5", shape=[5], dtype=f"int{w_bit}")
th = relay.var("x6", shape=[6], dtype=f"int{w_bit}")

yy = relay.concatenate([zx, zy], axis=-1)
out = yy + x

fn = relay.Function([x, w, zx, zy, e_scale, th], out)
mod = tvm.IRModule.from_expr(fn)
mod = relay.transform.InferType()(mod)
mod['main']
'''
fn (%input: Tensor[(7), int32], %x2: Tensor[(2), int32], %x3: Tensor[(3), int32], %x4: Tensor[(4), int32], %x5: Tensor[(5), int32], %x6: Tensor[(6), int32]) -> Tensor[(7), int32] {
  %0 = (%x3, %x4);
  %1 = concatenate(%0, axis=-1) /* ty=Tensor[(7), int32] */;
  add(%1, %input) /* ty=Tensor[(7), int32] */
}
'''

When loaded into the GraphExecutor, the model is expected to be fed 6 tensors (one input and five weights). However, after compiling, lib.get_params() returns only one parameter. Is there any implicit folding happening during the build?

vs = relay.analysis.all_vars(mod["main"])
# [Var(input, ty=TensorType([7], int32)), Var(x2, ty=TensorType([2], int32)), Var(x3, ty=TensorType([3], int32)), Var(x4, ty=TensorType([4], int32)), Var(x5, ty=TensorType([5], int32)), Var(x6, ty=TensorType([6], int32))]
tp = dict()
for idx, v in enumerate(vs):
    if "input" not in str(v.name_hint):
        shape = [int(_) for _ in v.type_annotation.shape]
        p = np.ones(shape).astype(np.int32) * idx
        tp[str(v.name_hint)] = tvm.nd.array(p)
lib = relay.build(mod['main'], target="llvm", params=tp)
[(k, lib.get_params()[k].shape) for k in sorted(lib.get_params())]
# [('p0', (7,))]
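Only %x3 and %x4 actually reach the output, so the single surviving parameter should be their concatenation. A quick NumPy sketch of what `p0` ought to contain, assuming `all_vars` returns the parameters in declaration order (so x3 gets fill value 2 and x4 gets fill value 3 from the loop above):

```python
import numpy as np

# Mimic the param-construction loop: each weight tensor is filled
# with its enumeration index (x3 -> 2, x4 -> 3).
x3 = np.full(3, 2, dtype=np.int32)
x4 = np.full(4, 3, dtype=np.int32)

# concatenate((x3, x4)) is the only weight-dependent subexpression,
# so it collapses into the single (7,) parameter p0.
p0 = np.concatenate([x3, x4])
print(p0)  # [2 2 2 3 3 3 3]
```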

This is because relay.build optimizes the expression using the params you pass in: the constants you provided have been bound into the graph and simplified away.

Actually, it is simplified in the bind_params_by_name function called inside relay.build. If you print the module it outputs, you see:

def @main(%input: Tensor[(7), int32]) -> Tensor[(7), int32] {
  %0 = (meta[relay.Constant][0], meta[relay.Constant][1]);
  %1 = concatenate(%0, axis=-1) /* ty=Tensor[(7), int32] */;
  add(%1, %input) /* ty=Tensor[(7), int32] */
}

You may also want to read the comments:

@Ganler Thanks for the pointer. I see. I actually noticed this function but thought bind_params_by_name only did the binding, without constant folding.


@Ganler Just re-ran the above code example and noticed that when the argument is a tvm.IRModule rather than a Function, bind_params_by_name will not be called when building the model.

It seems the folding is performed somewhere else? How did you obtain the folded IR?

@Lyken17 I think your code is passing the function (mod['main']) instead of the module (mod) to relay.optimize, though the arguments will still be bound if you use the module (https://github.com/apache/tvm/blob/main/python/tvm/relay/build_module.py#L188).

Using:

import tvm
import tvm.relay as relay
import numpy as np

w_bit = 32
x = relay.var("input", shape=[7], dtype=f"int{w_bit}")
w = relay.var("x2", shape=[2], dtype=f"int{w_bit}")
zx = relay.var("x3", shape=[3], dtype=f"int{w_bit}")
zy = relay.var("x4", shape=[4], dtype=f"int{w_bit}")
e_scale = relay.var("x5", shape=[5], dtype=f"int{w_bit}")
th = relay.var("x6", shape=[6], dtype=f"int{w_bit}")

yy = relay.concatenate([zx, zy], axis=-1)
out = yy + x

fn = relay.Function([x, w, zx, zy, e_scale, th], out)
mod = tvm.IRModule.from_expr(fn)
mod = relay.transform.InferType()(mod)

vs = relay.analysis.all_vars(mod["main"])
# [Var(input, ty=TensorType([7], int32)), Var(x2, ty=TensorType([2], int32)), Var(x3, ty=TensorType([3], int32)), Var(x4, ty=TensorType([4], int32)), Var(x5, ty=TensorType([5], int32)), Var(x6, ty=TensorType([6], int32))]
tp = dict()
for idx, v in enumerate(vs):
    if "input" not in str(v.name_hint):
        shape = [int(_) for _ in v.type_annotation.shape]
        p = np.ones(shape).astype(np.int32) * idx
        tp[str(v.name_hint)] = tvm.nd.array(p)
lib = relay.build(mod['main'], target="llvm", params=tp)

with print(ir_mod) statements before and after https://github.com/apache/tvm/blob/main/python/tvm/relay/build_module.py#L434

I got (at commit 678e76b3efd57b171940f0017bee89451e381785):

fn (%input: Tensor[(7), int32], %x2: Tensor[(2), int32], %x3: Tensor[(3), int32], %x4: Tensor[(4), int32], %x5: Tensor[(5), int32], %x6: Tensor[(6), int32]) -> Tensor[(7), int32] {
  %0 = (%x3, %x4);
  %1 = concatenate(%0, axis=-1) /* ty=Tensor[(7), int32] */;
  add(%1, %input) /* ty=Tensor[(7), int32] */
}
fn (%input: Tensor[(7), int32]) -> Tensor[(7), int32] {
  %0 = (meta[relay.Constant][0], meta[relay.Constant][1]);
  %1 = concatenate(%0, axis=-1) /* ty=Tensor[(7), int32] */;
  add(%1, %input) /* ty=Tensor[(7), int32] */
}

Also, technically I should not call it "constant folding" but "constant binding", since the constants are only moved from the function arguments into the meta field rather than actually being "folded".

@Ganler

But if we change the mod['main'] to mod, then the if-statement is not entered and the binding is not performed

lib = relay.build(mod, target="llvm", params=tp)
print(len(lib.get_params().keys()))
# 1

And I added print(mod) before and after https://github.com/apache/tvm/blob/main/python/tvm/relay/build_module.py#L188. I am curious whether the constant binding is performed here. Is there any approach to disable it?

# Before 
def @main(%input: Tensor[(7), int32], %x2: Tensor[(2), int32], %x3: Tensor[(3), int32], %x4: Tensor[(4), int32], %x5: Tensor[(5), int32], %x6: Tensor[(6), int32]) -> Tensor[(7), int32] {
  %0 = (%x3, %x4);
  %1 = concatenate(%0, axis=-1) /* ty=Tensor[(7), int32] */;
  add(%1, %input) /* ty=Tensor[(7), int32] */
}
# After
def @main(%input: Tensor[(7), int32], %x2: Tensor[(2), int32], %x3: Tensor[(3), int32], %x4: Tensor[(4), int32], %x5: Tensor[(5), int32], %x6: Tensor[(6), int32]) -> Tensor[(7), int32] {
  %0 = (%x3, %x4);
  %1 = concatenate(%0, axis=-1) /* ty=Tensor[(7), int32] */;
  add(%1, %input) /* ty=Tensor[(7), int32] */
}